From German Commons to Greek Commons
A policy case for Greek as a national and European language data infrastructure

Large language models depend on vast amounts of text, but scale without legal clarity produces fragile systems. Datasets built on opaque web crawling cannot guarantee lawful reuse, redistribution, or long-term sustainability. The German Commons provides a clear alternative: 154.56 billion tokens of …








