By Thorsten Meyer — May 2026

DeepSeek V4-Pro is the largest open-weight model ever released. One trillion parameters. One million tokens of context. Free to download.

One week earlier, Alibaba shipped Qwen 3.6-35B-A3B. Earlier in April, Meta dropped Llama 4 (Scout + Maverick), Mistral released Small 4, Google released Gemma 4, and Zhipu AI open-sourced GLM-5.1.

Six labs. Five competitive open-weight families. One month.

The benchmark gap between the best open and the best closed model is now in the single digits on every evaluation enterprises actually pay for.

If you were planning a 2026 AI budget around proprietary API pricing, that budget is wrong.


Executive Summary

| Open-Weight Model | Lab | Released | Differentiator |
|---|---|---|---|
| DeepSeek V4-Pro | DeepSeek (CN) | 2026-04-23 | ~1T MoE, 1M context, multimodal |
| DeepSeek V4-Flash | DeepSeek (CN) | 2026-04-23 | Cheaper inference variant |
| Qwen 3.6-35B-A3B | Alibaba | 2026-04-16 | Smaller, fast, MoE |
| Llama 4 Scout | Meta | 2026-04 | 109B total / 17B active, 16 experts |
| Llama 4 Maverick | Meta | 2026-04 | 400B raw capability |
| Gemma 4 | Google | 2026-04 | Open, on-device-friendly |
| Mistral Small 4 | Mistral | 2026-04 | Apache-2.0 license |
| GLM-5.1 | Zhipu AI (CN) | 2026-04 | Open weights |

The list is not exhaustive. It is what shipped in a single month.



1. The Closed-Model Premium Was Just Re-Priced

For three years, “frontier model” meant “API model.” Closed weights, paid per token, accessible only through the lab that built it. Enterprises paid the premium because the alternative — open models — was measurably worse.

The April 2026 benchmark numbers no longer say that.

| Eval Category | Closed Frontier (Mar 2026) | Best Open Weight (Apr 2026) | Gap |
|---|---|---|---|
| Reasoning (MATH, GSM8K) | 95.1 | 92.4 | 2.7 pts |
| Code (HumanEval, MBPP) | 94.8 | 91.2 | 3.6 pts |
| Long-context retrieval (128K+) | 89.3 | 87.8 | 1.5 pts |
| Multimodal (MMMU) | 76.4 | 71.1 | 5.3 pts |
| Tool use / agentic (TAU-bench) | 82.1 | 77.5 | 4.6 pts |

A 3-point gap on a benchmark does not justify a 30× pricing differential at the API.

For a CTO running a customer-support agent at scale, the math now reads: spend €10K to host an open model on your own GPUs and pay €0/token forever, or pay €30K/month to a frontier lab in perpetuity. The crossover used to be three years. It is now three months.
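The crossover claim can be made concrete with a back-of-envelope model. All figures in the sketch below are illustrative assumptions, not vendor quotes: a one-time hardware outlay, a small monthly operating cost for power and maintenance, and the €30K/month API baseline cited above.

```python
def crossover_months(capex_eur: float, opex_eur_month: float, api_eur_month: float) -> int:
    """Months until cumulative self-hosting cost falls below cumulative API spend."""
    month, self_hosted, api_spend = 0, capex_eur, 0.0
    while self_hosted > api_spend:
        month += 1
        self_hosted += opex_eur_month
        api_spend += api_eur_month
    return month

# Illustrative: €75K of GPUs up front, €3K/month power and ops,
# versus €30K/month in perpetual API fees.
print(crossover_months(75_000, 3_000, 30_000))  # → 3
```

Vary the assumptions and the qualitative result holds: at any plausible hardware price, heavy API spend amortizes a self-hosted deployment within a few months, not years.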



2. What This Means for the Frankenstein Thesis

In February 2026, this site published Rent-and-Distill — the playbook by which a Chinese cohort siphoned reasoning traces from closed Western models, ran fine-tuning on rented U.S. compute, and shipped open-weight Frankenstein models at €10–20M per launch.

The April releases are the empirical proof.

DeepSeek V4 was not built by a lab with thousands of PhDs. It was built by a lab with engineering discipline, access to open base weights, and a distillation pipeline. The gap to Anthropic’s frontier model is in the single digits.

The deeper reading: distillation is not just theoretically effective. It is now demonstrably scalable to the frontier.

“The moat is not the weights. The moat is whatever you refuse to show.” That was the closing line of Rent-and-Distill in February. Six weeks later, the open-weight benchmark gap closed by another two points.



3. The Three Strategic Shifts This Forces

Shift 1: Inference economics flip. When a 70B-class model runs on a single H200 node at $4/hour, per-token cost drops below any API. Every token-heavy workflow — call summarization, document extraction, code review, ticket triage — has different unit economics in May than it did in March.
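The per-token arithmetic behind Shift 1 is simple to check. The $4/hour node rate is the one cited above; the throughput figure is an assumption (aggregate batched decoding on a single node) and will vary widely with model size, quantization, and batch depth.

```python
# Illustrative self-hosted inference economics; throughput is an assumption.
GPU_NODE_USD_PER_HOUR = 4.00   # single H200 node, as cited in the text
TOKENS_PER_SECOND = 2_500      # assumed aggregate throughput under batching

tokens_per_hour = TOKENS_PER_SECOND * 3600
cost_per_million_tokens = GPU_NODE_USD_PER_HOUR / tokens_per_hour * 1_000_000
print(f"${cost_per_million_tokens:.3f} per 1M tokens")  # → $0.444 per 1M tokens
```

Even if the assumed throughput is off by a factor of five, the result stays one to two orders of magnitude below typical frontier API list prices per million tokens.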

Shift 2: Model selection becomes a portfolio question. No serious enterprise will run on one model. Closed APIs for the hardest 5% of queries. Open weights for the 95% the open models now handle as well as closed ones did six months ago. Routing logic — not model quality — becomes the new differentiator.
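The routing logic in Shift 2 can be sketched in a few lines. The threshold, the model names, and the idea of a precomputed scalar "hardness" score are placeholders for illustration, not a production design; real routers typically use a classifier or a cheap draft model to estimate difficulty.

```python
# Minimal sketch of a difficulty-based model router; names and the
# 0.95 threshold are hypothetical placeholders.
def route(query: str, hardness: float) -> str:
    """Send the hardest slice of traffic to a closed API, the rest to open weights."""
    if hardness >= 0.95:  # top ~5% of queries by estimated difficulty
        return "closed-frontier-api"
    return "self-hosted-open-weights"

print(route("Summarize this support ticket", hardness=0.20))
# → self-hosted-open-weights
print(route("Prove this invariant holds across the migration", hardness=0.97))
# → closed-frontier-api
```

The interesting engineering is in the hardness estimator and the fallback path, not the branch itself — which is exactly why routing, not raw model quality, becomes the differentiator.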

Shift 3: Sovereignty and licensing matter again. Llama 4’s license still excludes companies above a certain MAU threshold. DeepSeek V4 is unrestricted but Chinese-origin. Mistral Small 4 is Apache-2.0. The license is now a procurement criterion that matters as much as the benchmark.



4. What Closed Frontier Labs Will Do Next

Three predictions for the next two quarters.

Prediction 1: The closed labs raise the bar. Expect GPT-6 / Claude 5 / Gemini 3 in summer 2026, with capability gaps re-opened to double digits — for six months. Then the open weights catch up again. This is now the rhythm.

Prediction 2: The closed labs move up the stack. API models are commoditizing. The defensible product is the agent platform — long memory, tool integration, organizational context. Watch Anthropic, OpenAI, and Google ship platform offerings that make the underlying model less important. Google already did, on 2026-04-22, with the $750M Gemini Enterprise Agent Platform launch.

Prediction 3: The closed labs lobby for compute restrictions on open-weight training. The Remote Access Security Act blocked one cloud loophole. Expect the next regulatory front to be FLOP thresholds for open releases — which only the closed labs would benefit from.


5. The Quiet Winner

While the open-weight race accelerates, one player wins quietly: NVIDIA.

A 1T-parameter open model needs hardware to run. Self-hosted inference at enterprise scale means H200s, B200s, and the entire datacenter retrofit. The same Chinese labs that built Frankenstein models on rented U.S. compute are now selling the inference dependency to every Western enterprise that downloads the weights.

This is the second loop the regulators did not anticipate. RASA (January 2026) closed the training loophole. The April releases just reopened the inference one — except this time, NVIDIA is the beneficiary, not the threat.


What Leaders Should Do This Quarter

1. If you spend more than €1M/yr on closed APIs: run a hosted open-weight pilot on the next refresh. The crossover math is real.

2. If you sell an AI product: assume your moat is not your model. Build the data, the workflow, and the trust layer. The weights underneath you will commoditize.

3. If you set procurement policy: treat license terms (MAU caps, country-of-origin, redistribution rights) as a first-class procurement criterion, not a footnote.

4. If you set national policy: RASA needs a sequel. The next loophole is in the inference layer, not the training one.


The Strategic Read

April 2026 was the month the open-weight curve crossed the closed-weight curve on the metrics that matter to enterprises. It will be remembered as the inflection point.

The closed labs are not finished. They will pull ahead again, briefly, with the next release. But the structural fact is now established: every frontier capability shipped by a closed lab has an 18-month half-life before it is replicated in open weights.

The strategic question for any enterprise is no longer “which closed API do we sign?” It is “what part of our stack would we still pay for if the model underneath was free?”


The benchmark gap is in the single digits. The pricing gap is not. That is the arbitrage.


About the Author

Thorsten Meyer is a Munich-based futurist, post-labor economist, and recipient of OpenAI’s 10 Billion Token Award. He spent two decades managing €1B+ portfolios in enterprise ICT before deciding that writing about the transition was more useful than managing quarterly slides through it. More at ThorstenMeyerAI.com.


  • Your AI Vendor’s AI Vendor — File 0426 — agent supply chain compromise (Vercel × Context AI)
  • This file — File 0427 — the April 2026 open-weight inflection
  • AI-Washed — File 0428 — the 47.9% / 9% layoff narrative gap
  • The 27% Problem — File 0429 — Anthropic’s enterprise lead and Google’s $750M check
  • The Bubble Is Not in Valuations — File 0430 — the productivity gap
  • The Agent Trap — File 0431 — why 90% of AI “launches” are infrastructure liars

Sources

  • llm-stats, AI Updates Today: Latest AI Model Releases (2026-04)
  • DeepSeek, V4 Technical Report (2026-04-23)
  • Alibaba Qwen Team, Qwen 3.6-35B-A3B Release Notes (2026-04-16)
  • Meta, Llama 4 Model Card: Scout & Maverick (2026-04)
  • Sebastian Raschka, A Dream of Spring for Open-Weight LLMs (2026-02)
  • Lushbinary, Best Open-Source LLMs April 2026 (2026-04)
  • BuildFastWithAI, Best AI Models April 2026: Ranked by Benchmarks (2026-04)
  • Renovate, Chinese AI Models in April 2026: DeepSeek V4, Qwen 3.5, Kimi K2.5, GLM-5 (2026-04)