Compare Gemini 3’s published results across reasoning, coding, multimodal, retrieval, and long-context suites. All values mirror the official DeepMind scorecards.
8
Benchmarks
8
Categories
8
Capabilities
| Benchmark | Model | Score | Tools | Source |
|---|---|---|---|---|
| Humanity's Last Exam accuracy | Gemini 3 Deep Think | 41% | No | Link |
| Humanity's Last Exam accuracy | Gemini 3 Pro | 37.5% | No | Link |
| Humanity's Last Exam accuracy | GPT-5.1 | 26.5% | No | Link |
| GPQA Diamond accuracy | Gemini 3 Deep Think | 93.8% | No | Link |
| GPQA Diamond accuracy | Gemini 3 Pro | 91.9% | No | Link |
| GPQA Diamond accuracy | GPT-5.1 | 88.1% | No | Link |
| ARC-AGI-2 accuracy | Gemini 3 Deep Think | 45.1% | Yes | Link |
| ARC-AGI-2 accuracy | Gemini 3 Pro | 31.1% | No | Link |