
Scientific reasons why uncritical LLM adoption in government is unsafe
Michael Wooldridge’s Royal Society lecture makes a crucial point for public policy: today’s large language models are not “reasoning minds” but probabilistic next-token predictors. They generate fluent text without an internal notion of truth, accountability, or epistemic humility. This design reality matters most in the public sector, where decisions must be reasoned, contestable, and attributable to responsible officials and institutions. (Royal Society)
Reliability is the bottleneck, not speed
LLMs can produce confident, well-formed statements that are false, incomplete, or fabricated, including invented citations. This is not a minor bug but a predictable outcome of statistical pattern completion under uncertainty and distribution shift. Risk management literature emphasizes that system “performance” cannot be reduced to benchmark scores; real-world deployments require context-sensitive evaluation, monitoring, and clear failure-mode handling. The NIST AI RMF frames these issues as governance problems: transparency, traceability, and measurable trustworthiness are prerequisites for high-stakes use. (NIST Publications)
Institutional harm through accountability drift
When AI moves from a helper to an infrastructural component in administrative workflows, it changes institutional behavior. Hartzog and Silbey argue that AI systems can erode institutions by diffusing responsibility across vendors, model builders, and operators, weakening the link between authority and answerability. In government, “black box” outputs collide with the rule-of-law requirement that administrative acts be justified and reviewable. This is precisely why the EU AI Act treats many public-sector uses as high-risk and imposes obligations around risk management, transparency, and human oversight. (Royal Society)
Human oversight is not a magic fix
A common policy response is “keep a human in the loop.” Empirical work on human–AI interaction in public sector decision-making shows that people often over-rely on algorithmic advice, even when warning signals exist. Automation bias can turn nominal oversight into rubber-stamping, especially under workload pressure, institutional incentives, or perceived machine authority. This undermines the assumption that a human reviewer reliably corrects model errors. (OUP Academic)
Security risks grow with agents and RAG
As governments connect LLMs to internal knowledge bases and tool-using agents, they expand the attack surface. Prompt injection and retrieval poisoning can manipulate outputs or actions by embedding malicious instructions in retrieved documents or web content. Research on RAG-enabled systems documents systematic vulnerabilities and the need for layered defenses, but also shows that residual risk can remain significant in practice. For public services handling sensitive registries and legal documents, these risks are not abstract; they are operational security concerns. (arXiv)
What a realistic AI policy looks like
A science-aligned approach is to treat LLMs as drafting and summarization aids, not decision-makers. Require documented use-cases, mandatory verification, and explicit human sign-off for any official output. Restrict autonomy as institutional risk increases. Align procurement and deployment with risk frameworks and binding rules for high-risk systems.
Finally, digital sovereignty matters: whenever feasible, prefer auditable European stacks and on-prem or sovereign-cloud deployments, leveraging open-weight models such as Mistral Large 3 (Mistral AI) and transparency-oriented European initiatives like OpenEuroLLM (OpenEuroLLM), complemented by European models designed for multilingual coverage and local deployment (Apertus, EuroLLM, Velvet). (Hugging Face)
Key sources for this article:
1. Royal Society, “This is not the AI we were promised” (Faraday Prize Lecture, Feb 2026): framing LLMs as statistical predictors (Royal Society)
2. EU AI Act, Regulation (EU) 2024/1689 (EUR-Lex): obligations for high-risk AI systems (EUR-Lex)
3. NIST AI RMF 1.0 (NIST AI 100-1): governance and trustworthiness risk framework (NIST Publications)
4. Hartzog & Silbey, “How AI Destroys Institutions” (SSRN): accountability drift and institutional erosion (Royal Society)
5. Alon-Barkat et al., “Human–AI Interactions in Public Sector Decision Making” (JPART, OUP): automation bias and overreliance (OUP Academic)
6. Ramakrishnan et al., “Securing AI Agents Against Prompt Injection Attacks” (arXiv): prompt injection risks in RAG agents (arXiv)
7. Clop et al., “Backdoored Retrievers for Prompt Injection Attacks on RAG” (arXiv): retriever backdoors and poisoning (arXiv).
—
