LiveBench · 2026-01-08 · 126 models

Capability Rings

A hundred and twenty-six models, fourteen labs, and the leaders separated by barely two points: the frontier has converged into a photo finish. Yet nobody sweeps. OpenAI's GPT-5.5 takes math, data and language; reasoning belongs to Anthropic, instruction following to Google, and (the tell of the chart) the coding crown goes to GLM 5.2, an open-weight model the closed labs still can't beat. The ceiling holds near 81, coding stalls at 76, and every ring below is its own contest with its own winner.