Open Source AI Beyond Scale

European low-cost, open-source local LLMs as a strategic alternative. The global AI narrative remains focused on ever larger language models, demanding massive computational resources and reinforcing dependence on a handful of providers. As critics such as Gary Marcus have argued, this path leads to diminishing returns without resolving fundamental issues of reasoning and … Read more

From German Commons to Greek Commons

A policy case for Greek as a national and European language data infrastructure. Large language models depend on vast amounts of text, but scale without legal clarity produces fragile systems. Datasets built on opaque web crawling cannot guarantee lawful reuse, redistribution, or long-term sustainability. The German Commons provides a clear alternative: 154.56 billion tokens of … Read more

Epistemological fault lines between human and artificial intelligence

When linguistic plausibility replaces judgment and why this is a governance issue. Large language models are widely described as artificial intelligence because their outputs resemble human reasoning. This resemblance, however, is largely superficial. As argued by Quattrociocchi, Capraro, and Perc, LLMs do not form beliefs about the world. They are stochastic pattern completion systems that … Read more

Language corpora and the Text Encoding Initiative (TEI)

Open standards for documented linguistic knowledge. Language corpora have become a foundational infrastructure for linguistics, natural language processing, and contemporary artificial intelligence. The term corpus does not merely denote a collection of texts but implies deliberate selection, structuring, and documentation according to explicit design criteria. Within this context, the Text Encoding Initiative Guidelines provide a … Read more
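
As a rough illustration of what TEI-conformant documentation can look like in practice, the sketch below uses Python's lxml library to assemble a minimal TEI skeleton: a teiHeader recording title, licence and source, wrapped around a short text body. The element names follow the TEI Guidelines, but every metadata value here is a placeholder rather than material from any real corpus.

    # Minimal sketch: building a TEI-style document skeleton with lxml.
    # Element names follow the TEI Guidelines; all metadata values below
    # are placeholders for illustration, not drawn from a real corpus.
    from lxml import etree

    TEI_NS = "http://www.tei-c.org/ns/1.0"

    def minimal_tei_document(title, licence_url, text_body):
        tei = etree.Element(f"{{{TEI_NS}}}TEI", nsmap={None: TEI_NS})

        # teiHeader documents what the text is, where it comes from,
        # and under which terms it may be reused.
        header = etree.SubElement(tei, f"{{{TEI_NS}}}teiHeader")
        file_desc = etree.SubElement(header, f"{{{TEI_NS}}}fileDesc")
        title_stmt = etree.SubElement(file_desc, f"{{{TEI_NS}}}titleStmt")
        etree.SubElement(title_stmt, f"{{{TEI_NS}}}title").text = title
        pub_stmt = etree.SubElement(file_desc, f"{{{TEI_NS}}}publicationStmt")
        availability = etree.SubElement(pub_stmt, f"{{{TEI_NS}}}availability")
        etree.SubElement(availability, f"{{{TEI_NS}}}licence", target=licence_url)
        source_desc = etree.SubElement(file_desc, f"{{{TEI_NS}}}sourceDesc")
        etree.SubElement(source_desc, f"{{{TEI_NS}}}p").text = "Born-digital sample text."

        # The encoded text itself.
        text = etree.SubElement(tei, f"{{{TEI_NS}}}text")
        body = etree.SubElement(text, f"{{{TEI_NS}}}body")
        etree.SubElement(body, f"{{{TEI_NS}}}p").text = text_body

        return etree.tostring(tei, pretty_print=True, encoding="unicode")

    print(minimal_tei_document(
        "Sample corpus text",
        "https://creativecommons.org/licenses/by/4.0/",
        "A short paragraph standing in for corpus content."))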

Synthetic Data, Real Risks: Why AI Must Be Trained on High-Quality Open Data

A seductive solution with hidden dangers. Synthetic data is often presented as a clever fix for three persistent challenges in machine learning: data scarcity, unfair training distributions and privacy restrictions. At the same time, some argue it could democratise AI development by reducing dependence on large proprietary datasets held by a few dominant companies. But … Read more

Apertus AI: A Fully Open Multilingual LLM for Local and Customized Deployment

Apertus AI is one of the most transparent and technically mature efforts to build a fully open-source large language model. Developed in Switzerland and released together with its source code, training documentation and model weights, it offers an unprecedented level of reproducibility and independence from closed ecosystems. This makes it ideal for researchers, public-sector institutions … Read more
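
As a minimal sketch of what local deployment can look like, assuming a standard Hugging Face setup, the snippet below loads an openly released causal language model with the transformers library and runs a single prompt. The MODEL_ID value is a placeholder to be replaced with the repository name of the checkpoint you actually want; the call pattern is the generic transformers API rather than anything specific to Apertus.

    # Minimal local-inference sketch with the Hugging Face transformers library.
    # MODEL_ID is a placeholder: substitute the repository name of the openly
    # released checkpoint you want to run (for example an Apertus one).
    from transformers import AutoModelForCausalLM, AutoTokenizer

    MODEL_ID = "organization/open-model-name"  # placeholder, not a real repo

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    prompt = "Why do fully open model weights matter for public institutions?"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

    # Greedy decoding keeps the example short and deterministic.
    outputs = model.generate(**inputs, max_new_tokens=100, do_sample=False)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Running weights this way keeps prompts and outputs on the operator's own hardware, which is the kind of independence from closed ecosystems the post describes.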

Building a Fully Open Greek LLM: A Three-Millennia Language Model Powered by Open Data Infrastructure

The rapid evolution of fully open large language models represents a transformative moment for countries that possess rich linguistic and cultural heritage. Over the past two years, the global AI community has shown that high-performance LLMs can be built openly, with transparent pipelines, published datasets and weights, and licenses that support both research and commercial … Read more

New GÉANT Community Committee announced during the GCP workshop

CESNET very kindly hosted the GÉANT Community Programme (GCP) workshop on 14-15 May at their office in Prague. This is the second time members of the GCP, including the GÉANT Community Committee (GCC), representatives of GCP initiatives, coordinators of the Special Interest Groups (SIGs) and Task Forces (TFs), and SIG Steering Committee members came together … Read more

We are opening the WeatherXM cell modelling to the world

Our mission at WeatherXM is to create the largest decentralised weather station network. A significant challenge in this direction is how our weather stations should be distributed around the world. The difficulty stems from Earth's surface, which is not uniform (a good thing in so many other aspects of our lives!). Each … Read more
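
The excerpt is truncated, but one generic building block of the problem, placing candidate cell centres roughly evenly over a sphere, can be sketched with a Fibonacci lattice, as below. This is an illustration of that generic technique only, not WeatherXM's actual cell model.

    # Illustrative sketch only: a Fibonacci lattice places N points almost
    # uniformly over a sphere, a common starting point for equal-area cell
    # schemes. This is a generic technique, not WeatherXM's actual cell model.
    import math

    def fibonacci_lattice(n_points):
        """Return (latitude, longitude) pairs in degrees, spread ~evenly on a sphere."""
        golden_angle = math.pi * (3.0 - math.sqrt(5.0))  # ~2.39996 radians
        points = []
        for i in range(n_points):
            # z runs from ~+1 to ~-1, giving bands of equal surface area.
            z = 1.0 - (2.0 * i + 1.0) / n_points
            lat = math.degrees(math.asin(z))
            lon = math.degrees((i * golden_angle) % (2.0 * math.pi)) - 180.0
            points.append((lat, lon))
        return points

    # Example: 1000 candidate cell centres; density per unit area stays roughly
    # constant from equator to poles, unlike a naive latitude/longitude grid.
    cells = fibonacci_lattice(1000)
    print(cells[:3])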