Review · May 10, 2026 · cold-start backfill

The April 2026 frontier-model wave: DeepSeek-V4, Qwen3.5-Omni, GPT-5.5, Claude Opus 4.7 & four more.

8 launches · 8 labs · auto-generated · 30-day cold-start window

TL;DR

Focus

The first cold-start review covers the densest stretch of frontier-model launches we have on record: eight new flagships landed between April 12 and April 30, 2026, from labs spread across three continents. DeepSeek-V4 (Apr 24) and Qwen3.5-Omni (Apr 17) shipped with full arXiv technical reports; OpenAI's GPT-5.5 (Apr 23) shipped with a Preparedness-framework system card; Anthropic's Claude Opus 4.7 (Apr 16), Moonshot's Kimi K2.6 (Apr 20), MiniMax M2.7 (Apr 12 open weights), xAI's Grok 4.3 (Apr 30), and Mistral Medium 3.5 (Apr 30) shipped as launch posts or model cards. Three threads cut across all eight: long-context as table stakes (1M tokens at DeepSeek, Grok, and Qwen; 256k–262k at MiniMax, Mistral, Moonshot), agentic coding as the unified evaluation target (SWE-Bench Verified / Pro and Terminal-Bench 2.0 are now the universal scoreboard), and open-weights pricing compression from the Chinese labs.

Competitiveness

Claude Opus 4.7 holds the SWE-Bench Verified frontier at the time of writing. DeepSeek-V4-Pro-Max scores 80.6%, which its results table places 0.2 points behind Claude Opus 4.6, at a small fraction of the inference cost, and Mistral Medium 3.5 reaches 77.6% with a 128B dense model. GPT-5.5 leads on OpenAI's internal Preparedness-style biology and cyber evals. Grok 4.3 is the price/performance shock: a roughly 40% input / 60% output price cut versus Grok 4.20, with a placement on the Intelligence Index just above Muse Spark and Claude Sonnet 4.6. On audio and audio-visual understanding, Qwen3.5-Omni-plus surpasses Gemini 3.1 Pro on 215-task aggregates. The closed Western frontier (Opus 4.7, GPT-5.5, Gemini 3.1 Pro from February, Grok 4.3) still leads on the hardest reasoning and agentic benchmarks; the open Chinese frontier (DeepSeek-V4, Kimi K2.6, MiniMax M2.7) has converged on the same capability ceiling for agentic coding at roughly a third of the inference cost.
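To make the Grok 4.3 price cut concrete, here is a minimal sketch of how separate input and output cuts combine into one reduction in total spend. Only the 40% and 60% figures come from the text above; the function name `blended_price_cut` and the input-spend shares are hypothetical, chosen purely for illustration.

```python
def blended_price_cut(input_cut: float, output_cut: float, input_share: float) -> float:
    """Fractional reduction in total spend for a workload.

    input_share is the fraction of the OLD bill that came from input tokens;
    the remainder is assumed to be output-token spend.
    """
    if not 0.0 <= input_share <= 1.0:
        raise ValueError("input_share must be between 0 and 1")
    return input_share * input_cut + (1.0 - input_share) * output_cut


# Cuts quoted in the review (approximate); the spend shares are hypothetical.
INPUT_CUT = 0.40
OUTPUT_CUT = 0.60

for share in (0.25, 0.50, 0.75):
    cut = blended_price_cut(INPUT_CUT, OUTPUT_CUT, share)
    print(f"input share of old bill {share:.0%}: total bill falls by about {cut:.0%}")
```

Under these assumptions the overall saving always lies between 40% and 60%, and output-heavy workloads (long generations, agent loops) sit closer to the 60% end.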

New frontier releases

Eight flagship models shipped in 19 days: MiniMax M2.7 open weights (Apr 12), Claude Opus 4.7 (Apr 16), Qwen3.5-Omni (Apr 17 arXiv), Kimi K2.6 (Apr 20), GPT-5.5 (Apr 23), DeepSeek-V4 (Apr 24), Grok 4.3 and Mistral Medium 3.5 (both Apr 30).
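The 19-day figure can be checked directly from the launch dates listed above. The following is a small sketch; the dictionary name `launches` and the label strings are illustrative, and the window is counted inclusively (both April 12 and April 30 are included).

```python
from datetime import date

# Launch dates as listed in this review.
launches = {
    "MiniMax M2.7 (open weights)": date(2026, 4, 12),
    "Claude Opus 4.7": date(2026, 4, 16),
    "Qwen3.5-Omni (arXiv)": date(2026, 4, 17),
    "Kimi K2.6": date(2026, 4, 20),
    "GPT-5.5": date(2026, 4, 23),
    "DeepSeek-V4": date(2026, 4, 24),
    "Grok 4.3": date(2026, 4, 30),
    "Mistral Medium 3.5": date(2026, 4, 30),
}

first = min(launches.values())
last = max(launches.values())

# Inclusive count: the first and the last launch day both belong to the window.
window_days = (last - first).days + 1

for name, day in sorted(launches.items(), key=lambda item: item[1]):
    print(f"{day.isoformat()}  {name}")
print(f"{len(launches)} launches in {window_days} days")
```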

Alibaba (Qwen)

Qwen3.5-Omni Technical Report

Tier 1 · Technical Report · arXiv:2604.15804 · 2026-04-17 · Omnimodal · Audio · MoE · Streaming TTS

Overview

Architecture

Pre-training

Post-training

Evaluation & Results

Availability

Anthropic

Introducing Claude Opus 4.7

Tier 1 · Launch Post + System Card · anthropic.com · 2026-04-16 · Coding · Agentic · Vision · Safety

Overview

[Figure: Claude Opus 4.7 benchmark comparison vs Opus 4.6 and prior frontier models. Headline chart from Anthropic's launch post. Source: anthropic.com/news/claude-opus-4-7.]

Architecture

Post-training

Evaluation & Results

Safety & Limitations

Availability

DeepSeek

DeepSeek-V4: Towards Highly Efficient Million-Token Context Intelligence

Tier 1 · Technical Report · Hugging Face / DeepSeek-AI · 2026-04-24 · Long-context · Hybrid attention · MoE · MIT-licensed

Overview

Architecture

Pre-training

Post-training

Evaluation & Results

| Benchmark | DeepSeek-V4-Pro-Max | Reference |
| --- | --- | --- |
| SWE-Bench Verified | 80.6% | Trails Claude Opus 4.6 by 0.2 points |
| SWE-Bench Pro | 55.4% | Frontier band |
| LiveCodeBench | 93.5 | Frontier band |
| MMLU-Pro | 87.5% | Strong general |
| GPQA Diamond | 90.1% | Strong reasoning |
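The SWE-Bench Verified row gives a reference only as a gap, so the comparison score has to be derived. A minimal sketch of that arithmetic follows; the names `reported` and `implied_reference` are illustrative, and the derived value is an inference from the table, not a figure reported in this review.

```python
from decimal import Decimal

# Scores as reported in the table above (DeepSeek-V4-Pro-Max).
reported = {
    "SWE-Bench Verified": Decimal("80.6"),
    "SWE-Bench Pro": Decimal("55.4"),
    "LiveCodeBench": Decimal("93.5"),
    "MMLU-Pro": Decimal("87.5"),
    "GPQA Diamond": Decimal("90.1"),
}

# The table says the model trails Claude Opus 4.6 by 0.2 points on SWE-Bench Verified.
gap_points = Decimal("0.2")

# Derived, not reported: the reference score implied by "trails by 0.2 points".
implied_reference = reported["SWE-Bench Verified"] + gap_points

print(f"DeepSeek-V4-Pro-Max: {reported['SWE-Bench Verified']}%")
print(f"Implied Claude Opus 4.6 score: {implied_reference}%")
```

`Decimal` is used instead of `float` so that one-decimal benchmark scores add exactly, without binary rounding noise.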

Availability

MiniMax

MiniMax M2.7: Early Echoes of Self-Evolution

Tier 1 · Launch Post + Model Card · minimax.io · 2026-04-12 · Agentic · Open-weights · Self-improvement

Overview

Architecture

Post-training

Evaluation & Results

Availability

Mistral AI

Mistral Medium 3.5 + Vibe Remote Agents

Tier 1 · Launch Post · mistral.ai · 2026-04-30 · Coding · Agentic · Open-weights · Merged-flagship

Overview

Architecture

Post-training

Evaluation & Results

Availability

Moonshot AI

Kimi K2.6: Long-Horizon Coding & Agent Swarms

Tier 1 · Launch Post + Model Card · kimi.com / Hugging Face · 2026-04-20 · Coding · Agentic · Open-weights · Swarm

Overview

Architecture

Post-training

Evaluation & Results

Availability

OpenAI

GPT-5.5 System Card

Tier 1 · System Card + Launch Post · openai.com · 2026-04-23 · General · Reasoning · Coding · Safety

Overview

Architecture

Post-training

Evaluation & Results

Safety & Limitations

Availability

xAI

Grok 4.3

Tier 1 · Launch + API GA · x.ai / docs.x.ai · 2026-04-30 · Multimodal · Long-context · Pricing · Agentic

Overview

Architecture

Evaluation & Results

Availability