Anthropic ships Claude Opus 4 — tops SWE-bench, adds extended thinking
Posted 2026-05-23 · Review by 2026-07-23
Claude Opus 4 takes #1 on SWE-bench Verified. Extended thinking and parallel tool use make it the strongest coding agent model available.
Signal, not noise — the living guide to AI tools, agents and stacks.
Posted 2026-05-23 · Review by 2026-07-23
Claude Opus 4 takes #1 on SWE-bench Verified. Extended thinking and parallel tool use make it the strongest coding agent model available.
Posted 2026-04-26 · Review by 2026-09-26
Sora 2 is officially deprecated. The API shuts down 24 September 2026. OpenAI is refocusing on enterprise video tooling.
The key players — the dominant labs and flagship models.
Anthropic
Terminal-native agent that consistently ranks #1 across independent benchmarks; works in terminal, IDE and browser.
Anthropic
Desktop agent for macOS and Windows that edits files, organises folders, synthesises research and runs scheduled multi-step tasks in a privacy-first sandbox.
Anthropic
The current flagship (with Sonnet 4.6 and Haiku 4.5); at or near the top for complex multi-file coding and long-context technical work, and the most natural for prose.
Leads most published reasoning benchmarks and has the cheapest output among the majors; paired with Gemini Spark for long cloud tasks.
Alibaba
Currently tops the Artificial Analysis leaderboard.
Gemini 3 Pro Image; leads the image-arena leaderboard, with the best multilingual text rendering, free in the Gemini app.
ByteDance
Currently tops the Artificial Analysis leaderboard.
The dead and the zombies — tools that died or quietly stopped mattering.
acquired
Meta acquired Limitless (Dec 2025); Amazon acquired Bee (mid-2025).
OpenAI
OpenAI stepped back from the consumer video race.