AI Benchmarks

Compare Gemini 3’s published results across reasoning, coding, multimodal, retrieval, and long-context suites. All values mirror the official DeepMind scorecards.

8 benchmarks, 8 categories, 8 capabilities

Comparison matrix

Benchmark             Metric    Model                Score  Tools  Source
Humanity's Last Exam  accuracy  Gemini 3 Deep Think  41%    No     Link
Humanity's Last Exam  accuracy  Gemini 3 Pro         37.5%  No     Link
Humanity's Last Exam  accuracy  GPT-5.1              26.5%  No     Link
GPQA Diamond          accuracy  Gemini 3 Deep Think  93.8%  No     Link
GPQA Diamond          accuracy  Gemini 3 Pro         91.9%  No     Link
GPQA Diamond          accuracy  GPT-5.1              88.1%  No     Link
ARC-AGI-2             accuracy  Gemini 3 Deep Think  45.1%  Yes    Link
ARC-AGI-2             accuracy  Gemini 3 Pro         31.1%  No     Link