
This Is Not the AI We Were Promised

Scientific reasons why uncritical LLM adoption in government is unsafe

Michael Wooldridge’s Royal Society lecture makes a crucial point for public policy: today’s large language models are not “reasoning minds” but probabilistic next-token predictors. They generate fluent text without an internal notion of truth, accountability, or epistemic humility. This design reality matters most in the … Read more

Artificial Intelligence and the Public Interest

Scientific Arguments Against Uncritical Deployment in the Public Sector

Artificial Intelligence is frequently presented as a neutral instrument of modernization within public administration. Claims of efficiency and cost reduction dominate policy discourse. Yet a growing body of scientific research demonstrates that uncritical deployment of AI systems in public institutions poses structural risks to democratic governance, … Read more

Open Source AI Beyond Scale

European low-cost, open-source local LLMs as a strategic alternative

The global AI narrative remains focused on ever-larger language models, demanding massive computational resources and reinforcing dependence on a handful of providers. As critics such as Gary Marcus have argued, this path leads to diminishing returns without resolving fundamental issues of reasoning and … Read more

From German Commons to Greek Commons

A policy case for Greek as a national and European language data infrastructure

Large language models depend on vast amounts of text, but scale without legal clarity produces fragile systems. Datasets built on opaque web crawling cannot guarantee lawful reuse, redistribution, or long-term sustainability. The German Commons provides a clear alternative: 154.56 billion tokens of … Read more

Epistemological fault lines between human and artificial intelligence

When linguistic plausibility replaces judgment, and why this is a governance issue

Large language models are widely described as artificial intelligence because their outputs resemble human reasoning. This resemblance, however, is largely superficial. As argued by Quattrociocchi, Capraro, and Perc, LLMs do not form beliefs about the world. They are stochastic pattern-completion systems that … Read more

Language corpora and the Text Encoding Initiative (TEI)

Open standards for documented linguistic knowledge

Language corpora have become a foundational infrastructure for linguistics, natural language processing, and contemporary artificial intelligence. The term corpus does not merely denote a collection of texts but implies deliberate selection, structuring, and documentation according to explicit design criteria. Within this context, the Text Encoding Initiative Guidelines provide a … Read more

Synthetic Data, Real Risks: Why AI Must Be Trained on High-Quality Open Data

A seductive solution with hidden dangers

Synthetic data is often presented as a clever fix for three persistent challenges in machine learning: data scarcity, unfair training distributions, and privacy restrictions. At the same time, some argue it could democratise AI development by reducing dependence on large proprietary datasets held by a few dominant companies. But … Read more

Apertus AI: A Fully Open Multilingual LLM for Local and Customized Deployment

Apertus AI is one of the most transparent and technically mature efforts to build a fully open-source large language model. Developed in Switzerland and released together with its source code, training documentation and model weights, it offers an unprecedented level of reproducibility and independence from closed ecosystems. This makes it ideal for researchers, public-sector institutions … Read more

Building a Fully Open Greek LLM: A Three-Millennia Language Model Powered by Open Data Infrastructure

The rapid evolution of fully open large language models represents a transformative moment for countries that possess rich linguistic and cultural heritage. Over the past two years, the global AI community has shown that high-performance LLMs can be built openly, with transparent pipelines, published datasets and weights, and licenses that support both research and commercial … Read more

New GÉANT Community Committee announced during the GCP workshop

CESNET kindly hosted the GÉANT Community Programme (GCP) workshop on 14–15 May at their office in Prague. This is the second time that members of the GCP, including the GÉANT Community Committee (GCC), representatives of GCP initiatives, coordinators of the Special Interest Groups (SIGs) and Task Forces (TFs), and SIG Steering Committee members came together … Read more