Google Gemini 3.1 Pro: 77.1% on ARC-AGI-2, Topping Claude and GPT-5

Google released Gemini 3.1 Pro, which scores 77.1% on the ARC-AGI-2 reasoning benchmark, more than double its predecessor’s 31.1% and ahead of Claude Opus 4.6 (68.8%) and GPT-5.2 (52.9%). It also tops benchmarks for science, competitive coding, MCP use, and agentic search. Pricing matches Gemini 3 Pro, and the 1M-token context window is unchanged. The model is available now via the Gemini API, Vertex AI, the Gemini app, and NotebookLM. Via The Rundown AI.