Released September 2, 2025 by the Swiss AI Initiative — EPFL, ETH Zürich, and CSCS. Two models at 8B and 70B parameters. 15 trillion training tokens. 1,811 natively supported languages. 40% non-English training data. Apache 2.0 license. Trained on up to 4,096 GPUs on the Alps supercomputer. Retroactive robots.txt opt-out compliance — January 2025 opt-out preferences applied to web scrapes from prior crawls. Goldfish loss to prevent verbatim memorization. xIELU activation function, AdEMAMix optimizer, QRPO alignment. Independent benchmarks place Apertus-8B at MMLU-Pro 31.14%. Federal-research-institution model — Switzerland is outside the EU but inside the European regulatory sphere. The architectural template the European sovereign-AI movement has been waiting for.

By Thorsten Meyer — May 2026

This is the sixth standalone essay in the European sovereign-LLM track. Prior essays documented five distinct institutional answers: AMÁLIA (Portuguese national continuation), Minerva (Italian national from-scratch), OpenEuroLLM (pan-European consortium), Mistral (French commercial-frontier), and Aleph Alpha (German enterprise-sovereignty pivot · the retrospective case). This piece documents a sixth structurally distinct answer: the federal-research-institution model anchored in Switzerland — outside the EU geographically but inside the European regulatory sphere through EU AI Act alignment and Swiss data protection law compliance.

Apertus is structurally distinct from the prior five in five material ways. It is the only project of the six that: (1) commits to true open data rather than just open weights — the entire training corpus is publicly documented and reproducible, (2) implements retroactive opt-out compliance — applying January 2025 robots.txt opt-out preferences to web scrapes from prior crawls, (3) supports 1,811 natively trained languages — substantially more than any other project of the six, (4) operates as a federal-research-institution model rather than national, commercial, consortium, or pivot, and (5) is anchored in Switzerland — geographically outside the EU but operationally aligned with the European regulatory framework. The combination is novel.

The structural argument I want to make in this piece: Apertus is the architectural template the European sovereign-AI movement has been waiting for. It demonstrates operationally that the strategic-positioning recommendation from Essays 04-05 (Position 2 + Position 4 — sovereignty/openness/compliance + vertical specialization) is buildable from first principles when designed correctly from inception. The retroactive opt-out compliance is the single most important technical-policy innovation in any of the six projects examined. The 1,811-language coverage operationalizes “inclusive AI” at a scale no commercial model attempts. The federal-research-institution model demonstrates that an institutional structure outside the venture-capital, consortium, and commercial frameworks is viable for European sovereign-AI infrastructure.

The structural complication that this piece must surface honestly: Apertus is still operating at the same capability ceiling as the prior five projects. Apertus-8B-Instruct scored 31.14% on MMLU-Pro per the DS-NLP Lab independent February 2026 evaluation — strong performance for a fully-open compliance-first 8B model but well below frontier-class commercial models. The architectural rigor, multilingual coverage, and compliance framework do not eliminate the structural capability gap with US frontier developers that the four-way comparison from Essay 04 documented. The Apertus case demonstrates that the structural ceiling is real even when the project is designed from first principles for European sovereign-AI requirements.

This piece walks the Apertus project forensically, surfaces what the six-way comparison now contains, situates Apertus as the architectural reference template, and closes with what the European sovereign-AI movement should integrate from the six accumulated cases. The standard caveat applies: Apertus is recent (September 2, 2025 launch · DS-NLP independent benchmarks February 2026 · Canton of Ticino deployment March 2026), and the project is committed to regular updates. The strategic assessment may shift as domain-specific versions for law, climate, health, and education ship.

Apertus · The Architectural Template.
Standalone Essay 06 · European Sovereign AI · The Federal-Research-Institution Case Study

Apertus. The architectural template.

EPFL, ETH Zürich, and CSCS. 1,811 languages. 15 trillion training tokens. 4,096 GPUs on the Alps supercomputer. Retroactive robots.txt opt-out compliance. Goldfish loss to prevent verbatim memorization. The blueprint the European sovereign-AI movement has been waiting for.

Apertus is structurally distinct from the prior five projects in this track in five material ways. It is the only project of the six that commits to true open data rather than just open weights, implements retroactive opt-out compliance (applying January 2025 robots.txt opt-out preferences to web scrapes from prior crawls), supports 1,811 natively trained languages, operates as a federal-research-institution model rather than national, commercial, consortium, or pivot, and is anchored in Switzerland — outside the EU but inside the European regulatory sphere. The Canton of Ticino migration from Mixtral to Apertus in March 2026 is the operational validation. The work is real. The architectural template is real. The structural ceiling is real. All of these can be true at once.

▲ The structural editorial finding · the architectural template
Apertus is the architectural reference template the European sovereign-AI movement has been waiting for. The retroactive opt-out compliance is the single most important technical-policy innovation in any of the six projects examined. Compliance can be architectural, not policy-layer. The federal-research-institution model produces structurally distinct outputs: true open data, public-good infrastructure, regular updates, long-term commitment to open, trustworthy, and sovereign AI foundations.
— standalone essay 06 · the Apertus case · may 2026 · the architectural template
1,811
Languages natively supported · 40% non-English training data · Swiss German + Romansh included
Multilingual-first by design · serves underrepresented languages no commercial frontier developer attempts
4,096
GPUs (peak) on the Alps supercomputer at CSCS Lugano · 10M+ GPU hours invested
Apertus-70B is the first fully open model trained at this scale · 15T tokens · order-of-magnitude comparable to Mistral Large 3
Sep 2025
Released September 2, 2025 · EPFL + ETH Zürich + CSCS · Apache 2.0 · both 8B and 70B
Public AI international deployment with 115,000+ GPU-hours across 20 clusters in 5+ countries (Sep alone)
31.1%
Apertus-8B MMLU-Pro · DS-NLP Lab independent Feb 2026 evaluation · the structural complication
Below frontier-class · the structural ceiling is real even when architecture is designed from first principles
APERTUS · released Sep 2, 2025 · EPFL + ETH Zürich + CSCS · Swiss AI Initiative · Apache 2.0 · 8B and 70B sizes
ARCHITECTURE · 15T tokens · xIELU activation · AdEMAMix optimizer · QRPO alignment · Goldfish loss · QK-Norm · up to 4,096 GPUs
MULTILINGUAL · 1,811 languages natively supported · 40% non-English · Swiss German + Romansh · 65K context
RETROACTIVE OPT-OUT · January 2025 robots.txt opt-out preferences applied to prior web crawls · no commercial model does this
DEPLOYMENT · Swisscom sovereign platform · Hugging Face · Public AI 115,000 GPU-hrs / 20 clusters / 5+ countries
TICINO MIGRATION · canton deliberately migrated from Mixtral to Apertus in March 2026 · sovereignty + ethical training data
FUTURE · domain-specific versions planned · law · climate · health · education · regular updates from CSCS + ETH + EPFL
The founding-principle statements · architectural reference template

Four statements. One blueprint.

The Swiss AI Initiative leadership team articulates the strategic positioning explicitly. “Blueprint” (Jaggi). “Public good” (Schlag). “Not a conventional case of technology transfer” (Schulthess). “Long-term commitment to open, trustworthy, and sovereign AI foundations” (Bosselut). The deliberate language positions Apertus as architectural reference template, not commercial product.

Swiss AI Initiative leadership · September 2, 2025 launch statements
From the ETH Zürich press release. Four statements from the four project leads crystallize the federal-research-institution positioning. The framing positions Apertus as architectural reference template, not commercial product.
Imanol Schlag
Apertus Technical Lead · ETH Zürich
Apertus is built for the public good. It stands among the few fully open LLMs at this scale and is the first of its kind to embody multilingualism, transparency, and compliance as foundational design principles.
Martin Jaggi
Professor of ML · EPFL · Steering Committee
With this release, we aim to provide a blueprint for how a trustworthy, sovereign, and inclusive AI model can be developed.
Thomas Schulthess
Director · CSCS · Professor · ETH Zürich
Apertus is not a conventional case of technology transfer from research to product. Instead, we see it as a driver of innovation and a means of strengthening AI expertise across research, society and industry.
Antoine Bosselut
Professor · EPFL · NLP Laboratory · Co-Lead
The beginning of a journey, a long-term commitment to open, trustworthy, and sovereign AI foundations.
The compliance architecture · the single most important technical-policy contribution

Compliance. Architectural, not policy-layer.

The Apertus retroactive opt-out + Goldfish loss + memorization avoidance framework demonstrates that EU AI Act compliance can be implemented at the training-architecture level rather than as policy-and-content-moderation overlay. No commercial AI lab implements retroactive opt-out compliance at the training-data level. This is anticipatory compliance architecture, not minimum-compliance architecture.

The compliance framework · what the technical card actually claims
From the Apertus Hugging Face technical card and the official technical report (arXiv 2509.14233). The architectural choices are designed from first principles for the project’s compliance + transparency + multilingual objectives.
▲ APERTUS HUGGING FACE TECHNICAL CARD · COMPLIANCE COMMITMENT
Apertus is trained while respecting opt-out consent of data owners (even retrospectively), and avoiding memorization of training data.
— Apertus-70B-2509 · swiss-ai · Hugging Face model card · September 2025
Retroactive robots.txt opt-out compliance
January 2025 robots.txt opt-out preferences applied to web scrapes from prior crawls. A website that adds an LLM opt-out before January 2025 has its prior-scraped content removed from the training corpus. Anticipatory regulatory architecture.
EU AI Act · Art. 53/56
Goldfish Loss objective
Replaces standard cross-entropy. Designed specifically to reduce verbatim memorization of training data. Privacy-preserving and copyright-respecting at the architectural level rather than policy-layer.
Memorization avoidance
xIELU activation function
Huang & Schlag, 2025. Extends Squared ReLU to handle negative inputs · trainable scalars per layer. ~20% kernel execution speedup achieved through CUDA kernel optimization by CSCS engineers.
Novel arch contribution
AdEMAMix optimizer + QRPO alignment + WSD schedule
AdEMAMix replaces AdamW with long-term EMA momentum. QRPO post-training alignment. Warmup-Stable-Decay schedule allows continuous training without specifying full length in advance. 30-40% fewer tokens vs Llama-style baseline in ablations.
Novel training recipe
The structural argument: Compliance can be architectural, not policy-layer. Most commercial AI labs treat compliance as a policy-and-content-moderation overlay on top of an architecture trained without compliance constraints. Apertus inverts this — compliance is the foundational design constraint, and the architecture is built to operationalize it. As EU AI Act enforcement matures, this architectural-compliance model becomes a competitive moat that scales with regulatory enforcement. No commercial model can retrofit retroactive opt-out compliance without retraining from scratch.
The operational validation · Canton of Ticino migration · March 2026

Mixtral → Apertus. The procurement signal.

A Swiss canton with an existing functional Mistral/Mixtral deployment deliberately migrated to Apertus in March 2026. The migration is not driven by capability superiority — Mixtral is operationally a stronger general-capability model. The migration is driven by ethical-training-data, “trained in Switzerland,” and on-premise sovereignty considerations.

Canton of Ticino · in-house AI translation tool · Artificialy fine-tune of Apertus-8B
From EPFL coverage of the Ticino deployment (March 17, 2026). The Cantonal Computer Systems Center (CSI) hosts the tool on-premise. First phase: ~100 cantonal employees. Languages: Swiss official languages + Romanian + Ukrainian.
▲ PREVIOUSLY · COMMERCIAL-FRONTIER
Mixtral
Mistral AI’s open-weight MoE model · Apache 2.0 · stronger general capability · functioning production deployment
▲ MIGRATED TO · ARCHITECTURAL-COMPLIANCE
Apertus-8B fine-tune
Artificialy-built fine-tune for Ticino · on-premise CSI data center · retroactive opt-out compliance · trained in Switzerland
▲ Rudi Belotti · Head of systems · CSI Cantonal Computer Systems Center · Ticino
As a public administration, we feel obligated to use ethical software applications. With Apertus we can be sure the model was trained in Switzerland and in accordance with the highest ethical standards, meaning it uses data that were not proprietary or copyright-protected but released for AI training. In addition, with this solution the canton gains sovereignty over its translation procedures, as both the hardware and the AI solution are located on-site rather than in data centres outside Switzerland.
— Rudi Belotti · CSI Ticino · March 2026 · explaining Mixtral → Apertus migration rationale
The procurement signal: European public-sector institutions prefer ethical-architecture + sovereignty + on-premise deployment over raw capability when the procurement context is regulated. Apertus is operationally winning this comparison in real procurement decisions. This is the migration pattern that European regulated institutions will increasingly follow as EU AI Act enforcement matures.
Six-way comparison · the essay track extends

Six answers. Six structural findings.

Extending the five-way comparison from Essay 05 with the Apertus federal-research-institution case. Apertus is the only project of the six that explicitly does not target Position 1 (frontier-match). Not because it pivoted away or came up short — because the foundational design principles prioritize architectural-compliance + transparency + multilingual coverage over frontier capability.

Six operational answers · six structural findings · the essay track extends
Italian from-scratch. Portuguese continuation. Pan-European consortium. French commercial-frontier. German enterprise-sovereignty pivot. Swiss federal-research-institution architectural template. Each answer surfaces a structural complication the press coverage downplays. Apertus is the architectural reference the other five can build on.
▲ IT · 02 · Minerva
Funding: PNRR · Phase: Ongoing · Finding: 4.9% INVALSI
▲ PT · 01 · AMÁLIA
Funding: €5.5M · Phase: Final Jun ’26 · Finding: 5.5% pt-PT
▲ EU · 03 · OpenEuroLLM
Funding: €37.4M EU · Phase: First Jul ’26 · Finding: “more compute”
▲ FR · 04 · Mistral
Funding: €3B+ VC · Phase: $400M ARR · Finding: ~44% GPQA
▲ DE · 05 · Aleph Alpha
Funding: €110M eq · Phase: Cohere Apr ’26 · Finding: Pivot late
▲ CH · 06 · Apertus
Funding: ETH Board · Phase: Operating · Ticino · Finding: 31% MMLU-Pro

Six projects. Six findings. Each one harder than the framing it’s wrapped in. Apertus is the architectural reference template the other five projects can build on — not as a competitor but as a foundational architecture European sovereign-AI initiatives can adapt, fine-tune, and specialize.

Five strategic lessons · what the Apertus case demonstrates

Five lessons. The architectural template.

Strategic lessons the European sovereign-AI movement should integrate. Apertus contributes the architectural reference template that demonstrates Position 2 + Position 4 is buildable from first principles when designed correctly from inception.

Five strategic lessons · what the Apertus case demonstrates for European AI
Apertus is what European sovereign-AI looks like when the strategic positioning is built into the institutional structure from inception. The strategic-positioning recommendation from Essays 04-05 is now operationally validated by six independent institutional implementations.
01 · Compliance
Compliance can be architectural, not policy-layer
Retroactive opt-out + Goldfish loss + memorization avoidance demonstrates EU AI Act compliance implementable at training-architecture level. As regulatory enforcement matures, architectural-compliance becomes a competitive moat that scales with enforcement. No commercial model can retrofit retroactive opt-out without retraining from scratch.
02 · Institution
The federal-research-institution model is institutionally viable
EPFL + ETH Zürich + CSCS coordinated through the ETH Board with Swisscom partnership demonstrates European AI infrastructure buildable outside venture-capital, consortium-grant, national-government, and commercial-pivot institutional models. A fifth institutional structure to evaluate alongside the four documented in Essays 01-05.
03 · Languages
Multilingual scale is achievable when designed from first principles
1,811 natively supported languages with 40% non-English training data demonstrates genuine multilingual AI buildable when commitment is foundational rather than retrofitted. Aligns naturally with EU linguistic-diversity requirements (24 official + minority) without retrofit. Template for subsequent European multilingual development.
04 · Deployment
Public-good infrastructure deployment is operationally viable
Public AI deployment with 115,000+ GPU-hours across 20 clusters in 5+ countries (AWS, Exoscale, AI Singapore, Cudo Compute, CSCS, NCI Australia) demonstrates public-good AI infrastructure buildable at international scale. Structurally distinct from commercial-API deployment. European sovereign-AI should support public-good deployment alongside commercial options.
05 · Ceiling
The structural ceiling is real even with first-principles architecture
Apertus-8B-Instruct at MMLU-Pro 31.14% is well below frontier-class models. Architectural rigor, retroactive opt-out compliance, 1,811-language coverage, and 4,096-GPU training do not eliminate the structural ceiling that the prior five projects also encounter. Validates the Position 2 + Position 4 recommendation from Essays 04-05.

The work is real across all six projects. The architectural template is real. The structural ceiling is real. All of these can be true at once. Apertus is the architectural reference template the other five projects can build on — not as a competitor but as a foundational architecture European sovereign-AI initiatives can adapt, fine-tune, and specialize. The European AI strategic discourse should integrate all of them simultaneously rather than collapsing the analysis into single-answer triumphalism, single-failure pessimism, or single-architecture exceptionalism.

— Standalone Essay 06 · The Apertus case · the architectural template · May 2026
Source dossier · the receipts
Colophon · Standalone Essay 06

Set in Source Serif 4 (display), EB Garamond (essay body), IBM Plex Sans & IBM Plex Mono. Standalone essay register · not part of the security franchise. The architectural reference template extending the five-way essay track to six-way comparison with the Swiss federal-research-institution case. Free to embed with attribution.

thorstenmeyerai.com

Standalone essay 06 · European sovereign AI · the Apertus case · May 2026

1,811 LANGUAGES · 15T TOKENS · 4,096 GPUs ALPS · RETROACTIVE OPT-OUT · TICINO MIGRATION


I · What Apertus actually is · the institutional and technical foundation

The factual baseline before the structural argument. From the ETH Zürich press release, Swisscom’s coverage, the Apertus technical report, the Hugging Face model card, GGBa’s analysis, BABL AI’s regulatory framing, and the Wikipedia profile.

The institutional architecture

Apertus is developed by the Swiss AI Initiative — a federal-research-institution joint collaboration between three Swiss institutions:

  • EPFL · École Polytechnique Fédérale de Lausanne · French-language Swiss federal institute of technology
  • ETH Zürich · Eidgenössische Technische Hochschule Zürich · German-language Swiss federal institute of technology · one of the world’s leading STEM research universities
  • CSCS · Swiss National Supercomputing Centre · Lugano · operates the Alps supercomputer · part of the ETH Domain

The institutional positioning is structurally distinct. EPFL and ETH Zürich are both part of the ETH Domain — the Swiss federal higher-education system that also includes WSL (forest research), PSI (particle physics), Empa (materials science), and Eawag (water research). Apertus is funded by the ETH Board (strategic management of the ETH Domain) and complemented by contributions from strategic partners, most notably Swisscom — Switzerland’s largest telecommunications provider. This is federal-research-institution funding, not national-government funding, not EU-Commission funding, not venture capital, not consortium-EU-grant. It is structurally distinct from every other project in the six-way comparison.

The leadership team

The project leadership reflects the federal-research-institution model:

  • Martin Jaggi · Professor of Machine Learning at EPFL · member of the Steering Committee of the Swiss AI Initiative
  • Thomas Schulthess · Director of CSCS · Professor at ETH Zürich
  • Imanol Schlag · Technical Lead of the Apertus project · Research Scientist at ETH Zürich
  • Antoine Bosselut · Professor at EPFL · Co-Lead of the Swiss AI Initiative · head of EPFL’s Natural Language Processing Laboratory

The leadership-team statements crystallize the strategic positioning. From the ETH Zürich press release:

Martin Jaggi (EPFL): “With this release, we aim to provide a blueprint for how a trustworthy, sovereign, and inclusive AI model can be developed.”

Thomas Schulthess (CSCS): “Apertus is not a conventional case of technology transfer from research to product. Instead, we see it as a driver of innovation and a means of strengthening AI expertise across research, society and industry.”

Imanol Schlag (ETH Zürich): “Apertus is built for the public good. It stands among the few fully open LLMs at this scale and is the first of its kind to embody multilingualism, transparency, and compliance as foundational design principles.”

Antoine Bosselut (EPFL): “The beginning of a journey, a long-term commitment to open, trustworthy, and sovereign AI foundations.”

The deliberate language is structurally important. “Blueprint” (Jaggi). “Public good” (Schlag). “Not a conventional case of technology transfer” (Schulthess). “Long-term commitment to open, trustworthy, and sovereign AI foundations” (Bosselut). The framing positions Apertus as architectural reference template, not commercial product. This is the federal-research-institution model speaking explicitly about what it is and is not.

The technical scope · two models at 8B and 70B

Released September 2, 2025 in two configurations:

  • Apertus-8B · 8 billion parameters · optimized for individual use, fine-tuning, edge deployment
  • Apertus-70B · 70 billion parameters · cutting-edge performance for deployment at scale
  • Both Apache 2.0 license · unrestricted use in research, education, and commercial projects

The 70B model is the first fully open model trained at this scale, per the technical report — “Our Apertus-70B model is the first fully open model to be trained at this scale – 70B parameters trained on 15T tokens.” This positions Apertus-70B as architecturally landmark within the open-data + open-weights + open-methodology category.

Training compute scale:

  • 15 trillion tokens total pretraining
  • Trained on up to 4,096 GPUs on the Alps supercomputer at CSCS in Lugano
  • 10+ million GPU hours invested by CSCS specifically for the Apertus project
  • Decoder-only transformer with staged curriculum of web, code, and math data
  • 65,536 token context window (initial 4,096 then extended via long-context phase)
  • Grouped-Query Attention (GQA) for efficient inference · sketched below
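
Grouped-Query Attention shares each key/value head across a group of query heads, which shrinks the KV cache at inference time without reducing the number of query heads. A minimal PyTorch sketch of the mechanism; the head counts and dimensions are arbitrary toy values, not Apertus's actual configuration:

    import torch
    import torch.nn.functional as F

    def grouped_query_attention(q, k, v):
        # q: (batch, n_q_heads, seq, head_dim); k, v: (batch, n_kv_heads, seq, head_dim).
        # Each KV head serves n_q_heads // n_kv_heads query heads.
        group = q.shape[1] // k.shape[1]
        k = k.repeat_interleave(group, dim=1)  # share each KV head across its group
        v = v.repeat_interleave(group, dim=1)
        scores = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5
        causal = torch.ones(q.shape[2], q.shape[2], dtype=torch.bool).tril()
        scores = scores.masked_fill(~causal, float("-inf"))
        return F.softmax(scores, dim=-1) @ v

    # Toy shapes: 32 query heads sharing 8 KV heads means a 4x smaller KV cache.
    q = torch.randn(1, 32, 16, 64)
    k = torch.randn(1, 8, 16, 64)
    v = torch.randn(1, 8, 16, 64)
    out = grouped_query_attention(q, k, v)  # (1, 32, 16, 64)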

For scale context: Mistral Large 3 was trained on 3,000 NVIDIA H200 GPUs per Essay 04. Apertus-70B was trained on up to 4,096 GPUs on the Alps supercomputer — comparable order of magnitude. This is not toy-scale research training. The Apertus project operates at training-compute scales that compete with commercial-frontier developers within the same order of magnitude.

Novel architectural innovations

From the Apertus technical report and the Hugging Face technical card, Apertus contributes several architectural and training innovations to the open-research literature. These are not retrofitted to an existing architecture — they are designed from first principles for the project’s compliance + transparency + multilingual objectives.

  • xIELU activation function (Huang & Schlag, 2025) — extends Squared ReLU to handle negative inputs · trainable scalars per layer · approximately 20% kernel execution speedup achieved through CUDA kernel optimization by CSCS engineers
  • AdEMAMix optimizer — replaces conventional AdamW · adds long-term Exponential Moving Average momentum vector · faster convergence for long training runs
  • QRPO alignment — quantile-based reward preference optimization for post-training
  • Goldfish Loss objective — replaces standard cross-entropy · designed specifically to reduce verbatim memorization of training data · prevents regurgitation of training content
  • Warmup-Stable-Decay (WSD) learning rate schedule — allows continuous training without specifying full length in advance · linear warmup for 16.8B tokens followed by 1-sqrt cooldown
  • QK-Norm — normalizes queries and keys in attention layers for training stability
  • Cross-document attention prevention — attention masks prevent tokens from attending across documents within the same context window (see the mask sketch after this list)
  • No bias terms · removed throughout the architecture
  • Untied embeddings — input embedding weights not tied to output embedding weights · improves performance at memory cost
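
The cross-document attention prevention in the list above is easy to picture in code: when several documents are packed into one context window, the attention mask is the intersection of the usual causal mask with a same-document constraint. A minimal sketch, illustrative rather than the Apertus training code:

    import torch

    def packed_causal_mask(doc_ids):
        # doc_ids: (batch, seq) integer document id per token in the packed window.
        # Returns a (batch, seq, seq) boolean mask: True where attention is allowed.
        seq = doc_ids.shape[1]
        same_doc = doc_ids.unsqueeze(2) == doc_ids.unsqueeze(1)
        causal = torch.ones(seq, seq, dtype=torch.bool).tril()
        return same_doc & causal

    # Three short documents packed into one 8-token window:
    doc_ids = torch.tensor([[0, 0, 0, 1, 1, 2, 2, 2]])
    mask = packed_causal_mask(doc_ids)
    # mask[0, 4, 2] is False: token 4 (doc 1) cannot attend back into doc 0.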

The architectural choices are not incidental. Goldfish loss specifically addresses verbatim memorization — which is structurally aligned with the project’s compliance framework (more on this below). The xIELU + AdEMAMix combination is novel and was empirically validated during ablations on 1.5B and 3B models that demonstrated “the same training loss with 30-40% fewer tokens compared to a Llama-style baseline” per the technical report. This is a genuine architectural research contribution to the open AI literature, not just incremental engineering.
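
For readers who want the shape of the xIELU idea, here is a hedged reconstruction: a Squared-ReLU-style quadratic branch for positive inputs and, for negative inputs, the integral of a scaled ELU-like gradient, joined so that value and slope match at zero. The trainable per-layer scalars are the alpha parameters below; the softplus reparameterization and the beta = 0.5 slope are assumptions of this sketch, so the exact parameterization should be checked against Huang & Schlag (2025):

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class XIELUSketch(nn.Module):
        # Hedged reconstruction of an xIELU-style activation.
        # x > 0:  a_p * x^2 + beta * x               (Squared-ReLU-like branch)
        # x <= 0: a_n * (exp(x) - x - 1) + beta * x  (integral of an ELU-like gradient)
        # Value and first derivative match at x = 0.
        def __init__(self, beta=0.5):  # beta = 0.5 is an assumption of this sketch
            super().__init__()
            self.alpha_p_raw = nn.Parameter(torch.zeros(1))  # trainable per-layer scalar
            self.alpha_n_raw = nn.Parameter(torch.zeros(1))  # trainable per-layer scalar
            self.beta = beta

        def forward(self, x):
            a_p = F.softplus(self.alpha_p_raw)  # keep scalars positive (assumption)
            a_n = F.softplus(self.alpha_n_raw)
            pos = a_p * x.pow(2) + self.beta * x
            x_neg = x.clamp(max=0.0)  # avoid exp overflow in the unused branch
            neg = a_n * (torch.exp(x_neg) - x - 1) + self.beta * x
            return torch.where(x > 0, pos, neg)

    act = XIELUSketch()
    y = act(torch.linspace(-3.0, 3.0, 7))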

Multilingual coverage · 1,811 languages

The multilingual scope is quantitatively unprecedented among the six projects examined:

  • 1,811 natively supported languages per the Hugging Face technical card
  • More than 1,000 languages per the press release · 1,811 is the precise figure from the technical card
  • 40% of training data is non-English
  • Explicitly included underrepresented languages: Swiss German, Romansh (Switzerland’s fourth official language with approximately 60,000 speakers), and many others
  • Training data sources: FineWeb variants, StarCoder, FineMath, CommonPile (public portion)

For scale context in the six-way comparison: OpenEuroLLM targets 35 EU languages. Mistral Large 3 supports 40+ languages. Apertus supports 1,811. This is structurally distinct. Apertus’s multilingual commitment is not about EU-language coverage — it is about global linguistic inclusion as a foundational design principle, with deliberate prioritization of underrepresented languages that no commercial model attempts at scale.

This positioning has implications. For European sovereign-AI specifically, the 1,811-language commitment means Apertus naturally serves the EU’s linguistic-diversity requirements (all 24 official EU languages + minority languages) without needing to retrofit multilingual capability. For global non-EU users, the 1,811-language commitment makes Apertus structurally distinct from any frontier-class commercial model. This is a positioning that aligns with Switzerland’s federal multilingualism (German, French, Italian, Romansh as official languages) — institutional values reflected in technical architecture.


II · The compliance framework · the architectural innovation

This is the section that justifies positioning Apertus as the architectural reference template. The compliance framework is the single most important technical-policy contribution Apertus makes to European sovereign-AI development.

What “compliant” means in the Apertus technical card

From the Hugging Face model card:

Compliant: “Apertus is trained while respecting opt-out consent of data owners (even retrospectively), and avoiding memorization of training data.”

This single sentence describes two structural innovations operating together:

Innovation 1 · Retroactive opt-out compliance. Per the technical report:

“The pretraining corpus was compiled solely from web data, respecting robots.txt not only at crawl time (January 2025), but also retroactively applying January 2025 opt-out preferences to web scrapes from previous crawls.”

Parse this carefully. Conventional LLM training treats robots.txt opt-out as a crawl-time check — if a website opts out before the crawl, the crawler skips that content. Apertus does this AND also retroactively applies January 2025 opt-out preferences to data scraped from prior crawls. A website that adds an LLM opt-out to its robots.txt after the original crawl, but before January 2025, has its prior-scraped content removed from the Apertus training corpus. This is operationally significant because it implements a respect-the-current-preferences principle rather than a respect-the-historical-snapshot principle for opt-out compliance.
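
Mechanically, retroactive compliance means re-checking every archived document against the January 2025 robots.txt of its host rather than against the robots.txt that was in force when the document was crawled. A minimal sketch using Python's standard-library robots.txt parser; the user-agent list, snapshot store, and corpus format are illustrative assumptions, not the actual Apertus pipeline:

    import urllib.robotparser
    from urllib.parse import urlparse

    # January 2025 robots.txt snapshots keyed by host (illustrative store).
    ROBOTS_JAN_2025 = {
        "example.com": "User-agent: GPTBot\nDisallow: /\n",
    }

    # AI-crawler user-agents to honor (assumed list; the real set is project-defined).
    AI_USER_AGENTS = ["GPTBot", "CCBot", "Google-Extended"]

    def allowed_in_2025(url):
        # True if the Jan 2025 robots.txt of the URL's host permits every honored crawler.
        robots_txt = ROBOTS_JAN_2025.get(urlparse(url).netloc)
        if robots_txt is None:
            return True  # no snapshot recorded: no opt-out expressed
        parser = urllib.robotparser.RobotFileParser()
        parser.parse(robots_txt.splitlines())
        return all(parser.can_fetch(agent, url) for agent in AI_USER_AGENTS)

    # Documents scraped in earlier crawls are re-filtered against the 2025 preferences:
    corpus = [{"url": "https://example.com/post", "text": "..."},
              {"url": "https://other.org/page", "text": "..."}]
    retained = [doc for doc in corpus if allowed_in_2025(doc["url"])]
    # example.com opted out after the original crawl, so its document is dropped.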

Why this matters for the European regulatory framework: EU AI Act Article 53 and Article 56 require providers of general-purpose AI models to publish a summary of training data and respect opt-out signals. No conventional LLM training implements retroactive opt-out compliance. The Apertus retroactive opt-out commitment goes beyond what the current regulatory framework requires — it implements what the regulatory framework will likely require in subsequent iterations as opt-out enforcement matures. This is anticipatory compliance architecture, not minimum-compliance architecture.

Innovation 2 · Memorization avoidance via Goldfish loss. The Goldfish Loss objective (replacing standard cross-entropy) is designed specifically to reduce verbatim memorization of training data. This addresses a different aspect of compliance from opt-out — even for content that is licensed for training, Goldfish loss reduces the likelihood that the model will regurgitate the content verbatim at inference time. This is privacy-preserving and copyright-respecting at the architectural level.
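
A minimal sketch of the token-drop idea behind Goldfish loss, as described in the open literature: a pseudorandom subset of target positions is excluded from the cross-entropy, so the model is never fully supervised on any verbatim training sequence. The drop stride k and the fixed (rather than hash-based) masking rule here are simplifying assumptions:

    import torch
    import torch.nn.functional as F

    def goldfish_loss(logits, targets, k=4):
        # logits: (batch, seq, vocab); targets: (batch, seq).
        # Drop every k-th target position from supervision. The published
        # formulation hashes the local context to pick dropped positions;
        # a fixed stride is used here for brevity.
        batch, seq, vocab = logits.shape
        keep = torch.ones(seq, dtype=torch.bool)
        keep[k - 1 :: k] = False  # positions k-1, 2k-1, ... carry no loss
        per_token = F.cross_entropy(
            logits.reshape(-1, vocab), targets.reshape(-1), reduction="none"
        ).reshape(batch, seq)
        return per_token[:, keep].mean()

    logits = torch.randn(2, 16, 100)
    targets = torch.randint(0, 100, (2, 16))
    loss = goldfish_loss(logits, targets, k=4)  # 4 of 16 positions excluded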

The combined framework — retroactive opt-out + Goldfish loss + exclusion of personal data + exclusion of non-permissive content — produces a training corpus and resulting model that operates at a substantively higher compliance standard than any commercial model currently produces. The compliance is architectural, not policy-layer. This is structurally important for European regulated industries (finance, healthcare, public sector) where model-level compliance verification is a procurement requirement, not a corporate-policy reassurance.

The Public AI deployment architecture · international compliance infrastructure

The deployment side of the compliance framework is equally architecturally significant. Per the Public AI Inference Utility:

“Public AI is proud to be the official international deployer for Apertus. To support Apertus, we’ve allocated over 115,000 GPU-hours spread across 20 clusters in 5+ countries—just for the month of September.”

Public AI’s deployment partners include AWS, Exoscale, AI Singapore, Cudo Compute, CSCS (Switzerland), and NCI Australia. This is institutional compliance infrastructure at international scale — a deployment consortium that operates Apertus as a public utility rather than a commercial API.

For European sovereign-AI specifically, the Public AI deployment model is structurally distinct from the commercial-API model that Mistral, OpenAI, Anthropic, and Cohere use. Apertus is available as public-good infrastructure alongside commercial deployment via Swisscom. This is a deployment architecture that aligns with the federal-research-institution institutional model — public-infrastructure-with-commercial-option rather than commercial-product-with-academic-license.

What this means for the European sovereign-AI strategic discourse

Three structural implications worth surfacing:

Implication 1 · Compliance can be architectural, not policy-layer. The Apertus case demonstrates that EU AI Act compliance + Swiss data protection law + retroactive opt-out respect + memorization avoidance can all be implemented at the training-data and architectural level. This is not the conventional approach. Most commercial AI labs treat compliance as a policy-and-content-moderation overlay on top of an architecture trained without compliance constraints. Apertus inverts this — compliance is the foundational design constraint, and the architecture is built to operationalize it.

Implication 2 · The European regulatory framework structurally favors architectures designed from first principles for compliance. As EU AI Act enforcement matures through 2026 and 2027, the difference between architectural-compliance and policy-layer-compliance will become operationally meaningful for regulated procurement. Apertus is positioned to win this comparison structurally. No commercial model can retrofit retroactive opt-out compliance without retraining from scratch — which is what Apertus did from the start. This is a competitive moat that scales with regulatory enforcement.

Implication 3 · The federal-research-institution model is the institutional structure that produces architectural-compliance AI. Commercial AI labs structurally cannot afford the time and compute cost of from-scratch compliance-first architecture because their commercial timelines do not permit it. The federal-research-institution model can afford it because its institutional clock is structurally different — public-good infrastructure operates on multi-year compliance timelines rather than quarterly product release cycles. This is the institutional structure that produces architecturally-compliant AI — and it is what the European sovereign-AI movement should explicitly support as a complement to the commercial-frontier path (Mistral) and the consortium path (OpenEuroLLM).


III · The benchmark reality · the structural complication

Now the part of the Apertus story that the architectural-rigor and compliance framing tends to elide. This is not a critique of Apertus. It is the empirical reality the European sovereign-AI strategic discourse should integrate.

The independent benchmark results

From DS-NLP Lab’s independent February 2026 evaluation of Apertus-8B-Instruct:

  • MMLU-Pro (12,032 samples, multi-domain knowledge): 31.14%
  • Math-lvl-5 (advanced mathematical reasoning): 5.29% (beats Mistral 7B at 2.95%)
  • MuSR (multi-step reasoning): 36.00% overall · particularly weak on object placement (24.00%)

Compare to the structural ceiling documented across the six-way comparison:

  • Mistral Large 3 (Essay 04): MMLU-Pro 73.11% (LayerLens/Atlas) · GPQA Diamond ~44% (vs Gemini 3 Pro at 91.9%)
  • Gemini 3 Pro / GPT-5.4 / Claude Opus 4.6: market-leading frontier-class performance

Apertus-8B at 31.14% MMLU-Pro is well below frontier-class models. This is partially attributable to the model size (8B is small relative to frontier-class 70B+ models) and partially attributable to the architectural-compliance trade-off (Goldfish loss + retroactive opt-out + multilingual prioritization shifts capability allocation away from English-language benchmark maximization). The Apertus-70B benchmark numbers are not publicly available at the same independent-evaluation level — the Apertus technical report claims “approaches state-of-the-art results among fully open models on multilingual benchmarks, rivalling or surpassing open-weight counterparts” but does not provide head-to-head Apertus-70B vs Mistral Large 3 comparisons.

What this empirical reality implies

Three implications worth surfacing:

Implication 1 · The structural ceiling documented across the six-way comparison is real for Apertus too. Apertus is the most architecturally rigorous of the six projects examined — and it still operates below frontier-class capability on the hardest reasoning benchmarks. This is the bitter lesson surfacing in federal-research-institution context. The capability gap between European sovereign-AI projects and US frontier developers does not appear to be solvable through architectural rigor alone, just as it did not appear to be solvable through capital scale (Mistral), through institutional structure (OpenEuroLLM consortium), or through enterprise positioning (Aleph Alpha).

Implication 2 · The Apertus positioning works specifically because frontier-class capability is not the metric it optimizes for. Apertus optimizes for: (a) full transparency and reproducibility, (b) retroactive opt-out compliance, (c) memorization avoidance, (d) 1,811-language coverage, (e) public-good infrastructure deployment. For European sovereign-AI applications that prioritize these dimensions, Apertus is operationally superior to commercial alternatives regardless of raw MMLU-Pro ranking. This is the Position 2 + Position 4 strategic recommendation from Essays 04-05 made into a complete architectural template.

Implication 3 · The Apertus reference architecture is the foundation other European AI initiatives can build on. This is the unstated structural implication of the federal-research-institution model. Apertus is not designed to be a competitive end-user product — it is designed to be a reference architecture that other European AI projects can adapt, fine-tune, and specialize. The Canton of Ticino translation deployment (Section IV below) demonstrates this operationally. Apertus’s contribution to European sovereign-AI is the architectural template + the compute infrastructure + the compliance framework + the open methodology, not necessarily the end-user model capability.


IV · The Canton of Ticino case · operational deployment

The most important real-world Apertus deployment to date crystallizes what the federal-research-institution model produces operationally. From EPFL’s coverage of the Ticino deployment, reported March 2026.

What the Canton of Ticino built

Artificialy, a Ticino-based AI company, built a fine-tuned version of Apertus-8B for the Canton of Ticino (the Italian-speaking Swiss canton). The deployment:

  • In-house AI translation tool hosted in CSI’s (Cantonal Computer Systems Center) own data center
  • On-premise GPU infrastructure installed specifically for local AI deployment
  • First phase: approximately 100 cantonal employees who require translations daily are testing the tool
  • Languages: Switzerland’s official languages (German, French, Italian, Romansh) + Romanian + Ukrainian (additional languages reflecting Ticino’s immigration patterns and refugee-support requirements)

The strategic substance of the migration

Per EPFL’s coverage, the Canton of Ticino’s translation tool previously used Mixtral — the open-weight model from Mistral AI (Essay 04). The deliberate migration from Mixtral to Apertus is structurally significant. Rudi Belotti, head of systems and user support at the Cantonal Computer Systems Center (CSI), explained the rationale:

“As a public administration, we feel obligated to use ethical software applications. With Apertus we can be sure the model was trained in Switzerland and in accordance with the highest ethical standards, meaning it uses data that were not proprietary or copyright-protected but released for AI training. In addition, with this solution the canton gains sovereignty over its translation procedures, as both the hardware and the AI solution are located on-site rather than in data centres outside Switzerland.”

Parse this carefully. A Swiss canton with an existing functional Mistral/Mixtral deployment deliberately migrated to Apertus because of: (a) ethical training-data standards (the retroactive opt-out compliance), (b) “trained in Switzerland” (the federal-research-institution model), (c) sovereignty over data location (on-premise hosting in canton’s own data center). The migration is not driven by capability superiority — Mixtral is operationally a stronger general-capability model. The migration is driven by Position 2 + Position 4 + ethical-architecture considerations that the Apertus framework explicitly addresses.

This is the operational demonstration of the strategic-positioning recommendation from Essays 04-05. The Canton of Ticino’s procurement decision validates the structural argument that European public-sector institutions prefer ethical-architecture + sovereignty + on-premise deployment over raw capability when the procurement context is regulated. Apertus is operationally winning this comparison in real procurement decisions.

What this case demonstrates for the broader European AI movement

The Ticino case is small in absolute scale (~100 cantonal employees in first phase) but structurally important for two reasons:

Reason 1 · It demonstrates the migration pattern from commercial-frontier to architectural-compliance. The Canton of Ticino had a working Mistral/Mixtral deployment and deliberately chose to migrate. This is the procurement signal that European regulated institutions will increasingly send as EU AI Act enforcement matures. Apertus is structurally positioned to win these migrations because its architectural-compliance framework is what regulated procurement evaluates against.

Reason 2 · It demonstrates that fine-tuning is the operational deployment pattern, not direct base-model use. Artificialy built a fine-tuned Apertus-8B specifically for the Ticino translation use case. This is what the federal-research-institution model produces operationally — reference architecture + open methodology + permissive licensing that local AI companies can fine-tune for specific applications. The deployment ecosystem around Apertus is the architectural validation, not the base model alone.
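
To make the fine-tuning pattern concrete, here is a hedged sketch of an Apertus adapter fine-tune with the standard Hugging Face stack. The model id follows the swiss-ai naming visible on Hugging Face; the LoRA hyperparameters and target-module names are illustrative assumptions, and Artificialy's actual Ticino configuration is not public:

    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import LoraConfig, get_peft_model

    # Model id assumed from the swiss-ai org on Hugging Face (see the source dossier).
    MODEL_ID = "swiss-ai/Apertus-8B-Instruct-2509"

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    # Illustrative LoRA adapter; rank and target modules are assumptions, not
    # the configuration behind the Ticino translation tool.
    lora = LoraConfig(r=16, lora_alpha=32,
                      target_modules=["q_proj", "v_proj"],
                      task_type="CAUSAL_LM")
    model = get_peft_model(model, lora)
    model.print_trainable_parameters()  # a small fraction of the 8B base weights

    # From here, a standard supervised loop (e.g. transformers.Trainer) over
    # domain-specific pairs (for Ticino: translations across the Swiss official
    # languages plus Romanian and Ukrainian) yields the deployed model.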


V · The six-way comparison · what Apertus adds to the essay track

With Apertus now documented, the five-way structural comparison from Essay 05 extends to a six-way framework. Apertus contributes a structurally distinct sixth answer that the prior five do not represent.

The institutional comparison

Dimension | Minerva | AMÁLIA | OpenEuroLLM | Mistral | Aleph Alpha | Apertus
Strategic answer | National from-scratch | National continuation | Pan-EU consortium | Commercial-frontier | Enterprise-sovereignty pivot | Federal-research-institution
Institutional model | Academic (Sapienza/FAIR) | Academic consortium | Academic-and-state consortium | Venture-funded private | Venture-funded private (pivoted) | Swiss federal research (EPFL + ETH + CSCS)
Country | Italy | Portugal | Pan-EU | France | Germany | Switzerland (outside EU)
Lead | Roberto Navigli | (consortium) | Hajič + Sarlin | Mensch / Lample / Lacroix | Andrulis / Weinbach | Jaggi / Schulthess / Schlag / Bosselut
Funding model | PNRR · large national | €5.5M state | €37.4M EU | €3B+ VC | €110M+ equity → Cohere merger | ETH Board + CSCS + Swisscom
Compute (flagship) | 128 GPUs Leonardo | Not detailed | 4.5M+ GPU hours EuroHPC | 3,000 H200 | alpha ONE 512 A100 | Up to 4,096 GPUs Alps · 10M+ GPU hours
Largest model | Minerva-7B | EuroLLM derivative | TBD · 8B summer 2026 | Mistral Large 3 (41B/675B MoE) | Luminous 70B + Pharia-1 7B | Apertus-70B + Apertus-8B
Training data | 2.5T IT+EN | 107B → 5.8B pt-PT | TBD | Not publicly detailed | Not publicly detailed | 15T tokens · 1,811 languages · 40% non-English
Licensing | Truly-open | Partially open | Truly-open with EU-copyright | Apache 2.0 (methodology proprietary) | Apache 2.0 (Pharia-1) | Apache 2.0 + open data + open methodology
Compliance posture | Academic | Academic | EU AI Act target | Commercial GDPR | Commercial GDPR | Retroactive opt-out + Goldfish loss + memorization avoidance · architectural
Outcome | Operating · ongoing | Operating · 2026 final | Operating · first models 2026 | Operating · €3B+ trajectory | Cohere merger April 2026 | Operating · Canton of Ticino deployment

Apertus is structurally distinct from the prior five in five material ways: institutional model (federal-research-institution vs national/commercial/consortium/pivot), country positioning (Switzerland · outside EU · inside European regulatory sphere), licensing posture (open data + open methodology, not just open weights), compliance architecture (retroactive opt-out + Goldfish loss · architectural rather than policy-layer), and multilingual scale (1,811 languages vs 35-40 for the others).

The strategic-positioning comparison

Project | Position 1 (frontier-match) | Position 2 (sovereignty/openness) | Position 3 (country-knowledge depth) | Position 4 (vertical specialization)
Minerva | Not targeted | ✓ Operational | ✓ Italian-specific | Not primary
AMÁLIA | Not targeted | Partial | Partial | Not primary
OpenEuroLLM | Stated · compute-constrained | ✓ Strong commitment | ✓ 35 languages | Not primary
Mistral | Strongest attempt · still trails | ✓ Apache 2.0 + EU-hosted | Partial (40+ languages) | ✓ Multiple verticals
Aleph Alpha | Attempted · pivoted out | ✓ Strong commitment | ✓ German T-Free | ✓ Public sector + defense + industrial
Apertus | Not targeted (explicitly) | ✓ Architecturally implemented · the reference template | ✓ 1,811 languages | Domain-specific versions planned (law / climate / health / education)

Apertus is the only project of the six that explicitly does not target Position 1 — not because it pivoted away (Aleph Alpha) or because the competitor came up short (Mistral, OpenEuroLLM), but because the project’s foundational design principles prioritize architectural-compliance + transparency + multilingual coverage over frontier capability. This is the strategic-positioning recommendation from Essays 04-05 made into the project’s founding charter. Apertus is what European sovereign-AI looks like when the strategic positioning is built into the institutional structure from inception.

The temporal comparison

Apertus joins the six-way comparison with a specific operational timeline:

  • September 2, 2025 · Apertus 8B + 70B launched
  • September 2025 · Public AI Inference Utility deployment with 115,000+ GPU-hours across 20 clusters in 5+ countries
  • September 17, 2025 · Apertus technical report published (arXiv 2509.14233)
  • October 5, 2025 · Swiss {ai} Weeks hackathons conclude
  • February 2026 · DS-NLP Lab independent benchmark evaluation published
  • March 2026 · Canton of Ticino translation deployment (Artificialy fine-tune)
  • Planned · Domain-specific versions for law, climate, health, education

Apertus is the operational complement to the other forward-looking projects in the six-way comparison. It demonstrates that the architectural reference template is buildable, the federal-research-institution model is institutionally viable, the compliance framework operates at production deployment, and the migration pattern from commercial-frontier (Mixtral) to architectural-compliance (Apertus) is operationally tractable for regulated procurement.


VI · What Apertus demonstrates beyond the project itself

Five structural lessons emerge from the Apertus case that the six-way essay track should integrate.

Lesson 1 · Compliance can be architectural, not policy-layer

The Apertus retroactive opt-out + Goldfish loss + memorization avoidance framework demonstrates that EU AI Act compliance can be implemented at the training-architecture level rather than as policy-and-content-moderation overlay. This is the most important technical-policy contribution of any of the six projects examined. No commercial AI lab implements retroactive opt-out compliance at the training-data level. As EU AI Act enforcement matures, the architectural-compliance model will become a competitive moat that scales with regulatory enforcement. European sovereign-AI initiatives should explicitly evaluate adopting Apertus-style architectural compliance rather than retrofitting policy-layer compliance onto non-compliance-first architectures.

Lesson 2 · The federal-research-institution model is institutionally viable

The Swiss AI Initiative — EPFL + ETH Zürich + CSCS coordinated through the ETH Board with Swisscom strategic partnership — demonstrates that European AI infrastructure can be built outside the venture-capital, consortium-EU-grant, national-government, and commercial-pivot institutional models. The federal-research-institution model is structurally distinct from the others and produces structurally distinct outputs: architectural-compliance, public-good infrastructure, true open data, long-term commitment, regular updates by federal-research engineering teams. The European AI strategic discourse should recognize this as a fifth institutional structure to evaluate alongside the four documented in Essays 01-05.

Lesson 3 · Multilingual scale is achievable when designed from first principles

Apertus’s 1,811-language coverage with 40% non-English training data demonstrates that genuine multilingual AI is buildable when the multilingual commitment is foundational rather than retrofitted. Most LLMs are English-first with multilingual extensions. Apertus is multilingual-first by design — and the resulting model serves underrepresented languages (Swiss German, Romansh) that no commercial frontier developer attempts. For European sovereign-AI specifically, the multilingual scale aligns naturally with EU linguistic-diversity requirements (24 official languages + minority languages) without retrofit. The Apertus approach should be the template for subsequent European multilingual AI development.

Lesson 4 · Public-good infrastructure deployment is operationally viable

The Public AI Inference Utility deployment of Apertus (115,000+ GPU-hours across 20 clusters in 5+ countries for September 2025 alone, with AWS, Exoscale, AI Singapore, Cudo Compute, CSCS, NCI Australia as deployment partners) demonstrates that public-good AI infrastructure is buildable at international scale. This is structurally distinct from the commercial-API deployment model that Mistral, OpenAI, Anthropic, and Cohere use. The European sovereign-AI movement should explicitly support public-good deployment infrastructure alongside commercial deployment options. Public AI’s deployment architecture is the institutional model for European-aligned public AI services that scale beyond what individual nations can support.

Lesson 5 · The structural ceiling is real even when the architecture is designed from first principles

Apertus-8B-Instruct at MMLU-Pro 31.14% is well below frontier-class commercial models. The Apertus case demonstrates that the structural capability gap with US frontier developers is real even when the project is designed from first principles for European sovereign-AI requirements. The architectural rigor, retroactive opt-out compliance, 1,811-language coverage, and 4,096-GPU training run do not eliminate the structural ceiling that the prior five projects also encounter. This validates the strategic-positioning recommendation from Essays 04-05: stop trying to match US frontier developers on raw capability. Focus on the dimensions European regulatory framework and architectural rigor create competitive advantage on. Apertus is what doing that operationally looks like.


VII · The closing argument · the six-way comparison and what comes next

Across six standalone essays, the European sovereign-LLM essay track now documents six distinct institutional answers:

  • Essay 01 · AMÁLIA · national continuation answer · 5.5% pt-PT finding
  • Essay 02 · Minerva · national from-scratch answer · 4.9% INVALSI finding (Minerva-3B)
  • Essay 03 · OpenEuroLLM · pan-European consortium answer · Hajič compute statement
  • Essay 04 · Mistral · commercial-frontier answer · ~44% GPQA Diamond finding (vs 91.9% Gemini 3 Pro)
  • Essay 05 · Aleph Alpha · enterprise-sovereignty pivot answer · Andrulis Handelsblatt retrospective acknowledgment + Cohere merger
  • Essay 06 (this piece) · Apertus · federal-research-institution answer · 31.14% MMLU-Pro finding (Apertus-8B) + retroactive opt-out compliance + 1,811-language coverage + Canton of Ticino deployment

Each answer surfaces a structural complication the press coverage downplays. Each answer demonstrates a different aspect of what European sovereign-AI development actually looks like operationally. The six-way comparison is now the most comprehensive structural framework available for evaluating European sovereign-AI strategic positioning.

The deepest integrative observation the six-way track produces: the structural capability gap with US frontier developers appears to be real across every institutional model documented, but the European competitive advantage is real across every project documented too. The Position 2 + Position 4 strategic-positioning recommendation that Essays 04-05 articulated is now operationally validated by six independent institutional implementations. Apertus is the architectural template that demonstrates how to operationalize Position 2 + Position 4 from first principles. The other five projects demonstrate how to operationalize it through different institutional paths.

For Apertus specifically, the trajectory through 2026 and 2027 will depend on: (a) whether the domain-specific versions for law, climate, health, and education ship and gain adoption, (b) whether subsequent procurement decisions follow the Canton of Ticino pattern (migration from commercial-frontier to architectural-compliance), (c) whether the Public AI deployment infrastructure scales internationally beyond the initial September 2025 commitment, (d) whether the Apertus architectural-compliance framework is adopted by other European AI initiatives as a reference standard. The next data points that matter are the Apertus-70B independent benchmark evaluations, the next major procurement migration following the Ticino pattern, and the domain-specific version launches.

For the European sovereign-LLM movement broadly, the six-way comparison this essay track now contains is what the strategic discourse needs — a structurally honest framework for evaluating European AI development across six institutional models simultaneously, surfacing the empirical complications each project’s marketing materials downplay, integrating both forward-looking cases (AMÁLIA, Minerva, OpenEuroLLM, Mistral, Apertus) and the retrospective case (Aleph Alpha), and producing strategic recommendations grounded in the operational realities of what actually works at current European investment scales.

Apertus’s specific contribution to the six-way framework is the architectural reference template that demonstrates Position 2 + Position 4 is buildable from first principles when designed correctly from inception. Martin Jaggi’s articulation — “a blueprint for how a trustworthy, sovereign, and inclusive AI model can be developed” — is the founding-leadership statement that aligns precisely with what the broader essay track has empirically validated. Apertus is the blueprint operationalized.

The work is real across all six projects. The institutional achievement is substantial across all six. The empirical findings are harder than the press coverage suggests across all six. The Apertus case adds the architectural-compliance template that the other five projects can integrate — not as a competitor but as a reference architecture that European sovereign-AI initiatives can build on. The European AI strategic discourse is at the empirical-data-ground-truth moment. The six-way comparison is what that moment looks like operationally.

That’s the read on Apertus, and that’s the read on the six-way European sovereign-LLM landscape as of mid-May 2026. The work is real. The architectural template is real. The structural ceiling is real. All of these can be true at once. The European sovereign-AI movement should integrate all of them simultaneously rather than collapsing the analysis into single-answer triumphalism, single-failure pessimism, or single-architecture exceptionalism.

The European sovereign-AI agenda is at the empirical-data-ground-truth moment. Summer 2026 is when that moment becomes operational. Apertus is the architectural template the other projects can build on. The discourse should be ready for whatever the data actually shows across all six projects simultaneously, and ready to evaluate the architectural-compliance + multilingual + federal-research-institution model on its own merits rather than against the frontier-capability metrics that no European project of any institutional structure currently meets.


About the Author

Thorsten Meyer is a Munich-based futurist, post-labor economist, and recipient of OpenAI’s 10 Billion Token Award. He spent two decades managing €1B+ portfolios in enterprise ICT before deciding that writing about the transition was more useful than managing quarterly slides through it. More at ThorstenMeyerAI.com.



Sources

  • ETH Zürich · Apertus: a fully open, transparent, multilingual language model · September 2, 2025 launch press release
  • Swisscom · Apertus: Switzerland launches an open-source AI model · September 2, 2025 · strategic-partner perspective
  • Wikipedia · Apertus (LLM) · institutional and release details
  • Public AI · Apertus · international deployment infrastructure (115,000+ GPU-hours, 20 clusters, 5+ countries)
  • Apertus Technical Report · arXiv 2509.14233 · September 17, 2025 · architectural innovations, training methodology, compliance framework
  • Hugging Face · Apertus-70B-2509 model card · technical card with full architectural and compliance specifications
  • GGBa · Switzerland launches Apertus, a fully open multilingual LLM · September 3, 2025
  • Open Data Science · Switzerland Launches Apertus · September 4, 2025 · ODSC coverage
  • Open Source For You · Switzerland Rolls Out Apertus · September 5, 2025
  • BABL AI · Switzerland Unveils Apertus · September 18, 2025 · regulatory compliance framing
  • TechNow · Apertus: Switzerland’s Open, Transparent, and Multilingual LLM · architectural innovations analysis
  • DS-NLP Lab · LLM Benchmark Evaluation – Apertus-8B · February 2026 · independent benchmark evaluation (MMLU-Pro 31.14%, MATH Lvl 5 5.29%, MuSR 36%)
  • EPFL · Apertus powers in-house AI translation for Ticino · March 17, 2026 · Canton of Ticino deployment case
  • The Moonlight · Apertus: Democratizing Open and Compliant LLMs · Literature Review · detailed architectural analysis
  • aimodels.fyi · Apertus-8B-Instruct-2509 · model details summary
  • Kai Waehner · Enterprise Agentic AI Landscape 2026 · April 7, 2026 · “the reference model for what trustworthy, sovereign, open AI infrastructure looks like at scale”
  • Martin Jaggi · Professor of Machine Learning at EPFL · Steering Committee member of Swiss AI Initiative · “blueprint for trustworthy, sovereign, inclusive AI”
  • Thomas Schulthess · Director of CSCS · Professor at ETH Zürich · “not a conventional case of technology transfer”
  • Imanol Schlag · Apertus Technical Lead · Research Scientist at ETH Zürich · “built for the public good · first of its kind to embody multilingualism, transparency, and compliance as foundational design principles”
  • Antoine Bosselut · Professor at EPFL · Co-Lead of Swiss AI Initiative · head of EPFL Natural Language Processing Laboratory · “beginning of a journey, long-term commitment to open, trustworthy, and sovereign AI foundations”
  • Rudi Belotti · Head of systems and user support at Cantonal Computer Systems Center (CSI) Ticino · explained Mixtral → Apertus migration rationale
  • EPFL · École Polytechnique Fédérale de Lausanne · co-developer
  • ETH Zürich · Eidgenössische Technische Hochschule Zürich · co-developer
  • CSCS · Swiss National Supercomputing Centre · Lugano · co-developer + compute infrastructure
  • ETH Board · strategic management of ETH Domain · primary Apertus funder
  • Swisscom · Switzerland’s largest telecom · strategic partner · sovereign AI platform deployment
  • Public AI · official international deployer · 115,000+ GPU-hours across 20 clusters in 5+ countries
  • Public AI deployment partners · AWS, Exoscale, AI Singapore, Cudo Compute, CSCS, NCI Australia
  • Alps supercomputer · CSCS Lugano · trained Apertus on up to 4,096 GPUs · 10M+ GPU hours invested
  • Apertus-70B · 70 billion parameters · first fully open model trained at this scale · Apache 2.0
  • Apertus-8B · 8 billion parameters · individual use / fine-tuning / edge deployment · Apache 2.0
  • 15 trillion tokens · pretraining corpus · web + code + math staged curriculum
  • 1,811 languages · natively supported · 40% non-English training data · Swiss German + Romansh included
  • 65,536 token context window · long-context capable
  • xIELU activation function (Huang & Schlag, 2025) · novel architectural contribution · trainable scalars per layer · sketched in the appendix below
  • AdEMAMix optimizer · replaces AdamW · long-term EMA momentum · sketched in the appendix below
  • QRPO alignment · post-training quantile-based reward preference optimization
  • Goldfish Loss · replaces standard cross-entropy · reduces verbatim memorization of training data · sketched in the appendix below
  • Warmup-Stable-Decay (WSD) · learning rate schedule · continuous training without specifying the full run length in advance · sketched in the appendix below
  • QK-Norm · query/key normalization · training stability · sketched in the appendix below
  • Retroactive robots.txt opt-out compliance · January 2025 opt-out preferences applied to prior crawls · sketched in the appendix below
  • Memorization avoidance · architectural via Goldfish loss
  • Data sources · FineWeb variants, StarCoder, FineMath, CommonPile (public portion)
  • Swiss {ai} Weeks · September 2025 – October 5, 2025 · hackathon and developer community engagement
  • Artificialy · Ticino-based AI company · built fine-tuned Apertus-8B for Canton of Ticino
  • Canton of Ticino · Italian-speaking Swiss canton · ~100 cantonal employees testing Apertus-based translation
  • Canton of Ticino migration from Mixtral to Apertus · March 2026 · sovereignty + ethical training data + on-premise hosting rationale
  • Domain-specific versions planned · law, climate, health, education
  • AMÁLIA · Essay 01 · Portuguese national continuation answer
  • Minerva · Essay 02 · Italian national from-scratch answer
  • OpenEuroLLM · Essay 03 · pan-European consortium answer
  • Mistral · Essay 04 · commercial-frontier answer
  • Aleph Alpha · Essay 05 · German enterprise-sovereignty pivot answer · retrospective case
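

Appendix · Technique Sketches

The Sources list above names the Apertus training-stack components tersely; the sketches below unpack the six that are easiest to state in code. Each is a minimal illustration written from the cited papers and the technical report’s descriptions, under stated assumptions; none is code from the Apertus repository.

xIELU first. Assuming the activation is the integral of a trainable affine transformation of ELU, as Huang & Schlag (2025) describe, it has a quadratic branch for positive inputs and an ELU-like branch for negative inputs, each scaled by a trainable per-layer scalar. The parameterization below is my reading of that construction, not the reference implementation.

```python
import torch
import torch.nn as nn

class XIELU(nn.Module):
    """xIELU-style activation: integral of a trainable affine transform of ELU.

    alpha_p scales the quadratic positive branch, alpha_n the ELU-like
    negative branch; beta is the shared linear term, so the slope at
    x = 0 is beta from both sides and the function is continuous there.
    """
    def __init__(self, alpha_p_init=0.8, alpha_n_init=0.8, beta=0.5):
        super().__init__()
        self.alpha_p = nn.Parameter(torch.tensor(alpha_p_init))  # per-layer trainable scalar
        self.alpha_n = nn.Parameter(torch.tensor(alpha_n_init))  # per-layer trainable scalar
        self.beta = beta

    def forward(self, x):
        pos = self.alpha_p * x * x + self.beta * x   # integral of 2*alpha_p*x + beta
        xn = torch.clamp(x, max=0.0)                 # avoids exp overflow in the unused branch
        neg = self.alpha_n * (torch.exp(xn) - xn - 1.0) + self.beta * x
        return torch.where(x > 0, pos, neg)
```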
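AdEMAMix replaces Adam’s single first moment with a mix of a fast gradient EMA and a much slower one, so old gradients keep contributing to the update direction. A sketch of one step under the update rule as I read Pagliardini et al. (2024); the alpha and beta3 warmup schedulers from the paper are omitted, and the required state initialization is shown in the comment.

```python
import torch

def ademamix_step(param, grad, state, lr=1e-3, betas=(0.9, 0.999, 0.9999),
                  alpha=5.0, eps=1e-8, weight_decay=0.0):
    # state: {'step': 0, 'm1': zeros_like(param), 'm2': zeros_like(param),
    #         'v': zeros_like(param)} before the first call.
    beta1, beta2, beta3 = betas
    state['step'] += 1
    t = state['step']
    state['m1'].mul_(beta1).add_(grad, alpha=1 - beta1)           # fast EMA (Adam-like)
    state['m2'].mul_(beta3).add_(grad, alpha=1 - beta3)           # slow EMA, no bias correction
    state['v'].mul_(beta2).addcmul_(grad, grad, value=1 - beta2)  # second moment
    m1_hat = state['m1'] / (1 - beta1 ** t)
    v_hat = state['v'] / (1 - beta2 ** t)
    update = (m1_hat + alpha * state['m2']) / (v_hat.sqrt() + eps)
    param.add_(update + weight_decay * param, alpha=-lr)          # decoupled weight decay
```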
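Goldfish loss is ordinary next-token cross-entropy with a pseudorandom one-in-k subset of positions excluded from supervision; a model never trained to predict those tokens cannot regenerate long training sequences verbatim, at the price of a small fraction of the training signal. The sketch below uses the static every-k-th-position mask; the hashed variant in Hans et al. (2024) derives the mask from a local context window instead.

```python
import torch
import torch.nn.functional as F

def goldfish_loss(logits, labels, k=4):
    """Causal LM cross-entropy with every k-th position dropped from the loss.

    logits: (batch, seq_len, vocab); labels: (batch, seq_len).
    """
    logits = logits[:, :-1, :]                 # predict token t+1 from the prefix
    labels = labels[:, 1:]
    per_token = F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),
        labels.reshape(-1),
        reduction='none',
    ).view(labels.shape)
    keep = torch.ones_like(labels, dtype=torch.bool)
    keep[:, k - 1::k] = False                  # the "goldfish" drop mask
    return per_token[keep].mean()
```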
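Warmup-Stable-Decay is simple enough to state exactly: linear warmup to the peak rate, a long constant plateau, then a decay phase. The plateau is what allows training to continue without fixing the total run length in advance; the decay can be launched from a plateau checkpoint whenever a release is wanted. Linear decay to zero is assumed below; other shapes such as 1-sqrt are common.

```python
def wsd_lr(step, peak_lr, warmup_steps, decay_start, decay_steps):
    """Warmup-Stable-Decay schedule: warm up, hold at peak_lr, then decay."""
    if step < warmup_steps:
        return peak_lr * step / max(warmup_steps, 1)    # linear warmup
    if step < decay_start:
        return peak_lr                                  # stable plateau
    frac = min((step - decay_start) / decay_steps, 1.0)
    return peak_lr * (1.0 - frac)                       # linear decay to zero
```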
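QK-Norm normalizes queries and keys per head before the attention dot product, which bounds the attention logits that otherwise grow unstably at scale. The sketch uses PyTorch’s built-in nn.RMSNorm (available in recent PyTorch releases); rotary embeddings and Apertus’s exact normalization placement are not shown.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class QKNormAttention(nn.Module):
    """Causal self-attention with RMS-normalized queries and keys."""
    def __init__(self, dim, n_heads):
        super().__init__()
        self.n_heads, self.head_dim = n_heads, dim // n_heads
        self.qkv = nn.Linear(dim, 3 * dim, bias=False)
        self.out = nn.Linear(dim, dim, bias=False)
        self.q_norm = nn.RMSNorm(self.head_dim)   # the QK-Norm parameters
        self.k_norm = nn.RMSNorm(self.head_dim)

    def forward(self, x):
        B, T, C = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        shape = (B, T, self.n_heads, self.head_dim)
        q = self.q_norm(q.view(shape)).transpose(1, 2)  # normalize per head
        k = self.k_norm(k.view(shape)).transpose(1, 2)
        v = v.view(shape).transpose(1, 2)
        y = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        return self.out(y.transpose(1, 2).reshape(B, T, C))
```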
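Finally, the retroactive opt-out, the policy innovation this essay weights most heavily. It is statable in a few lines: re-evaluate every URL from older crawls against a current (January 2025) robots.txt snapshot for its domain, and drop whatever the current rules disallow, so an opt-out expressed after the original scrape is still honored. The sketch uses the standard library’s robotparser; the field names, crawler user-agent string, and keep-on-missing-snapshot behavior are illustrative assumptions, not the Apertus pipeline.

```python
from urllib import robotparser

def retroactive_filter(docs, robots_snapshots, agent="ExampleCrawler"):
    """Yield only documents whose URLs a current robots.txt still allows.

    docs: iterable of objects with .domain and .url (illustrative schema).
    robots_snapshots: dict mapping domain -> current robots.txt text.
    """
    parsers = {}
    for domain, robots_txt in robots_snapshots.items():
        rp = robotparser.RobotFileParser()
        rp.parse(robots_txt.splitlines())
        parsers[domain] = rp
    for doc in docs:
        rp = parsers.get(doc.domain)
        # No snapshot for this domain: this sketch keeps the document;
        # a stricter policy would drop it instead.
        if rp is None or rp.can_fetch(agent, doc.url):
            yield doc
```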