Source: Artificial Analysis. Composite average pass@1 across SWE-Bench-Pro-Hard-AA, Terminal-Bench v2, and SWE-Atlas-QnA.
Auto-updated weekly · Last update: 2026-06-01 02:01 UTC

#   Agent        Model                   Provider     Score
1   Claude Code  Opus 4.7 (max)          Anthropic    67
2   Codex        GPT-5.5 (xhigh)         OpenAI       65
3   Cursor CLI   Composer 2.5 Fast       Cursor       63
4   Cursor CLI   Opus 4.7 (medium)       Cursor       61
5   Codex        GPT-5.5 (medium)        OpenAI       60
6   Claude Code  Opus 4.7 (medium)       Anthropic    60
7   Cursor CLI   GPT-5.5 (medium)        Cursor       58
8   Claude Code  GLM-5.1                 FriendliAI   53
9   Claude Code  Kimi K2.6               Moonshot AI  50
10  Claude Code  DeepSeek V4 Pro (high)  DeepSeek     50
11  Gemini CLI   Gemini 3.1 Pro (high)   Gemini       43

⏱ Time per Task

Mean wall clock time per task (lower is better)

#   Agent                                            Wall Time
1   Claude Code - Opus 4.7 (medium) (Anthropic)      5.8m
2   Cursor CLI - GPT-5.5 (medium) (Cursor)           6.2m
3   Cursor CLI - Composer 2.5 Fast (Cursor)          6.7m
4   Codex - GPT-5.5 (medium) (OpenAI)                7.1m
5   Gemini CLI - Gemini 3.1 Pro (high) (Gemini)      7.6m
6   Cursor CLI - Opus 4.7 (medium) (Cursor)          7.8m
7   Codex - GPT-5.5 (xhigh) (OpenAI)                 8.7m
8   Claude Code - Opus 4.7 (max) (Anthropic)         13.8m
9   Claude Code - DeepSeek V4 Pro (high) (DeepSeek)  18.0m
10  Claude Code - GLM-5.1 (FriendliAI)               21.6m
11  Claude Code - Kimi K2.6 (Moonshot AI)            41.5m

💰 Cost per Task

Mean API cost per task in USD (lower is better)

#   Agent                                            Cost (USD)
1   Claude Code - DeepSeek V4 Pro (high) (DeepSeek)  $0.35
2   Cursor CLI - Composer 2.5 Fast (Cursor)          $0.44
3   Claude Code - Kimi K2.6 (Moonshot AI)            $0.76
4   Claude Code - Opus 4.7 (medium) (Anthropic)      $1.24
5   Cursor CLI - Opus 4.7 (medium) (Cursor)          $1.47
6   Gemini CLI - Gemini 3.1 Pro (high) (Gemini)      $1.60
7   Cursor CLI - GPT-5.5 (medium) (Cursor)           $1.61
8   Codex - GPT-5.5 (medium) (OpenAI)                $2.21
9   Claude Code - GLM-5.1 (FriendliAI)               $2.26
10  Claude Code - Opus 4.7 (max) (Anthropic)         $4.14
11  Codex - GPT-5.5 (xhigh) (OpenAI)                 $4.33
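
The score and cost tables can be read together. As an illustrative sketch only (the leaderboard itself does not publish this view), the snippet below keeps the configurations that no other entry beats on both composite score and mean cost per task. The function name `pareto_frontier` and the tuple layout are hypothetical; the numbers are copied from the tables above.

```python
# (agent configuration, composite score, mean cost per task in USD),
# copied from the score and cost tables on this page.
entries = [
    ("Claude Code - Opus 4.7 (max)", 67, 4.14),
    ("Codex - GPT-5.5 (xhigh)", 65, 4.33),
    ("Cursor CLI - Composer 2.5 Fast", 63, 0.44),
    ("Cursor CLI - Opus 4.7 (medium)", 61, 1.47),
    ("Codex - GPT-5.5 (medium)", 60, 2.21),
    ("Claude Code - Opus 4.7 (medium)", 60, 1.24),
    ("Cursor CLI - GPT-5.5 (medium)", 58, 1.61),
    ("Claude Code - GLM-5.1", 53, 2.26),
    ("Claude Code - Kimi K2.6", 50, 0.76),
    ("Claude Code - DeepSeek V4 Pro (high)", 50, 0.35),
    ("Gemini CLI - Gemini 3.1 Pro (high)", 43, 1.60),
]


def pareto_frontier(rows):
    """Return names of rows that no other row dominates.

    A row is dominated when another row has a score at least as high
    and a cost at least as low, and is strictly better on one of them.
    """
    frontier = []
    for name, score, cost in rows:
        dominated = any(
            s >= score and c <= cost and (s > score or c < cost)
            for _, s, c in rows
        )
        if not dominated:
            frontier.append(name)
    return frontier


frontier = pareto_frontier(entries)
for name in frontier:
    print(name)
```

With the figures from this update, three configurations remain: Claude Code - Opus 4.7 (max), Cursor CLI - Composer 2.5 Fast, and Claude Code - DeepSeek V4 Pro (high). Wall time is not considered in this sketch.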

About the Benchmarks

  • SWE-Bench-Pro-Hard-AA: Code generation, 150 questions (Scale AI)
  • Terminal-Bench v2: Agentic terminal use, 84 questions (Laude Institute)
  • SWE-Atlas-QnA: Technical Q&A, 124 questions (Scale AI)

The index represents the average pass@1 across 3 runs of each benchmark.
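
A minimal sketch of how such an index could be computed, assuming each benchmark's three runs are averaged first and the three benchmark means are then weighted equally. The page does not say whether benchmarks are weighted equally or by question count, so the equal weighting is an assumption. The function name `composite_index` and the pass@1 values are hypothetical and are not taken from the leaderboard.

```python
from statistics import mean


def composite_index(runs_by_benchmark):
    """Average pass@1 over the runs of each benchmark, then over benchmarks.

    Assumption: every benchmark carries equal weight in the composite.
    """
    per_benchmark = {
        name: mean(runs) for name, runs in runs_by_benchmark.items()
    }
    return mean(list(per_benchmark.values())), per_benchmark


# Hypothetical pass@1 percentages for one agent, three runs per benchmark.
example_runs = {
    "SWE-Bench-Pro-Hard-AA": [40.0, 42.0, 44.0],
    "Terminal-Bench v2": [60.0, 63.0, 66.0],
    "SWE-Atlas-QnA": [70.0, 75.0, 80.0],
}

index, per_benchmark = composite_index(example_runs)
print(per_benchmark)
print(index)
```

In this made-up example the per-benchmark means are 42, 63, and 75, giving a composite of 60. Weighting by question count (150, 84, 124) would give a different figure.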


Data scraped weekly by an AI Agent. For the latest results, visit the original page.