Code Arena 🏆 Overall

View overall rankings across AI models on agentic coding tasks involving multi-step reasoning and tool use.

Mar 26, 2026
214,231 votes
57 models
| Rank | Rank Spread | Organization · License | Score | Votes | Price (input / output per 1M tokens) | Context |
|---|---|---|---|---|---|---|
| 1 | 1-2 | Anthropic · Proprietary | 1549 +11/-11 | 4,264 | $5 / $25 | 1M |
| 2 | 1-2 | Anthropic · Proprietary | 1545 +12/-12 | 3,495 | $5 / $25 | 1M |
| 3 | 3-3 | Anthropic · Proprietary | 1523 +9/-9 | 6,391 | $3 / $15 | 1M |
| 4 | 4-4 | Anthropic | 1491 +7/-7 | 13,247 | $5 / $25 | 200K |
| 5 | 5-7 | Anthropic · Proprietary | 1465 +7/-7 | 13,559 | $5 / $25 | 200K |
| 6 | 5-14 | OpenAI · Proprietary | 1457 +17/-17 | 1,488 | N/A | N/A |
| 7 | 5-11 | Google · Proprietary | 1455 +10/-10 | 4,733 | $2 / $12 | 1M |
| 8 | 6-15 | Z.ai · MIT | 1445 +10/-10 | 4,265 | $1 / $3.20 | 202.8K |
| 9 | 6-15 | Z.ai · MIT | 1439 +10/-10 | 4,877 | $0.39 / $1.75 | 202.8K |
| 10 | 7-15 | Google · Proprietary | 1438 +7/-7 | 17,152 | $2 / $12 | 1M |
| 11 | 6-15 | Xiaomi · Proprietary | 1437 +13/-13 | 2,209 | $1 / $3 | 1M |
| 12 | 7-15 | Google · Proprietary | 1437 +7/-7 | 13,266 | $0.50 / $3 | 1M |
| 13 | 6-15 | MiniMax · Proprietary | 1435 +14/-14 | 2,133 | $0.30 / $1.20 | 204.8K |
| 14 | 8-15 | Moonshot · Modified MIT | 1430 +8/-8 | 6,421 | $0.60 / $3 | N/A |
| 15 | 7-18 | OpenAI · Proprietary | 1428 +16/-16 | 1,575 | N/A | N/A |
| 16 | 15-23 | Moonshot · Modified MIT | 1408 +11/-11 | 3,609 | $0.45 / $2.22 | 262.1K |
| 17 | 15-24 | OpenAI · Proprietary | 1407 +12/-12 | 2,973 | $1.75 / $14 | 400K |
| 18 | 16-25 | MiniMax · Modified MIT | 1403 +9/-9 | 6,120 | $0.20 / $1.17 | 196.6K |
| 19 | 15-28 | OpenAI · Proprietary | 1403 +16/-16 | 1,460 | $1.75 / $14 | 400K |
| 20 | 16-28 | OpenAI · Proprietary | 1392 +13/-13 | 3,752 | $1.25 / $10 | 400K |
| 21 | 16-28 | MiniMax · MIT | 1392 +8/-8 | 9,272 | $0.27 / $0.95 | 196.6K |
| 22 | 16-28 |  | 1392 +7/-7 | 11,486 | $0.50 / $3 | 1M |
| 23 | 16-28 | OpenAI · Proprietary | 1390 +9/-9 | 6,121 | $1.25 / $10 | 400K |
| 24 | 18-28 | Anthropic | 1389 +6/-6 | 15,905 | $3 / $15 | 200K |
| 25 | 17-28 | Alibaba · Apache 2.0 | 1387 +9/-9 | 4,912 | $0.39 / $2.34 | 262.1K |
| 26 | 19-28 | Anthropic · Proprietary | 1386 +6/-6 | 17,947 | $3 / $15 | 200K |
| 27 | 19-29 | Anthropic · Proprietary | 1384 +9/-9 | 8,568 | $15 / $75 | 200K |
| 28 | 19-30 |  | 1378 +13/-13 | 2,379 | $2 / $6 | 2M |
| 29 | 27-31 | DeepSeek · MIT | 1369 +8/-8 | 7,681 | $0.26 / $0.38 | 163.8K |
| 30 | 28-32 | Alibaba · Apache 2.0 | 1364 +10/-10 | 3,632 | $0.26 / $2.08 | 262.1K |
| 31 | 29-34 | Z.ai · MIT | 1353 +9/-9 | 8,345 | $0.39 / $1.90 | 204.8K |
| 32 | 30-36 | Alibaba · Apache 2.0 | 1346 +11/-11 | 3,387 | $0.20 / $1.56 | 262.1K |
| 33 | 31-38 | OpenAI · Proprietary | 1339 +7/-7 | 12,865 | $1.25 / $10 | 400K |
| 34 | 31-38 |  | 1337 +8/-8 | 6,731 | $0.09 / $0.29 | 262.1K |
| 35 | 32-38 | OpenAI · Proprietary | 1336 +8/-8 | 7,951 | $1.75 / $14 | 400K |
| 36 | 33-38 | Moonshot · Modified MIT | 1328 +6/-6 | 14,601 | $1.15 / $8 | 262.1K |
| 37 | 32-38 | OpenAI · Proprietary | 1328 +9/-9 | 6,221 | $1.25 / $10 | 400K |
| 38 | 33-38 | DeepSeek · MIT | 1325 +8/-8 | 9,111 | $0.26 / $0.38 | 163.8K |
| 39 | 39-41 | Anthropic · Proprietary | 1309 +6/-6 | 15,957 | $1 / $5 | 200K |
| 40 | 39-42 | MiniMax · Apache 2.0 | 1303 +9/-9 | 8,396 | $0.26 / $1 | 196.6K |
| 41 | 39-43 |  | 1300 +14/-14 | 2,095 | $0.09 / $0.29 | 262.1K |
| 42 | 40-43 | DeepSeek · MIT | 1285 +11/-11 | 4,868 | $0.27 / $0.41 | 163.8K |
| 43 | 41-43 | Alibaba · Apache 2.0 | 1280 +6/-6 | 15,368 | $0.40 / $1.60 | 262.1K |
| 44 | 44-49 | KwaiKAT · Proprietary | 1257 +15/-15 | 1,883 | $0.21 / $0.83 | 256K |
| 45 | 44-50 | Alibaba · Apache 2.0 | 1248 +16/-16 | 1,813 | $0.16 / $1.30 | 262.1K |
| 46 | 44-50 | Google · Proprietary | 1242 +10/-10 | 4,579 | $0.25 / $1.50 | 1M |
| 47 | 44-51 | OpenAI · Proprietary | 1238 +17/-17 | 1,443 | $0.25 / $2 | 400K |
| 48 | 44-51 | Alibaba · Proprietary | 1237 +17/-17 | 1,562 | N/A | N/A |
| 49 | 44-51 | xAI · Proprietary | 1233 +9/-9 | 6,916 | $0.20 / $0.50 | 2M |
| 50 | 45-54 | Mistral · Apache 2.0 | 1220 +20/-20 | 1,031 | $0.50 / $1.50 | N/A |
| 51 | 47-54 | xAI · Proprietary | 1206 +20/-20 | 1,209 | $0.20 / $0.50 | N/A |
| 52 | 50-54 | Google · Proprietary | 1202 +13/-13 | 3,295 | $1.25 / $10 | 1M |
| 53 | 50-54 | Mistral · Modified MIT | 1198 +17/-17 | 1,579 | N/A | N/A |
| 54 | 50-55 | Inception AI · Proprietary | 1183 +21/-21 | 1,107 | $0.25 / $0.75 | 128K |
| 55 | 54-56 | xAI · Proprietary | 1147 +23/-23 | 933 | $0.20 / $0.50 | 2M |
| 56 | 55-56 | xAI · Proprietary | 1138 +22/-22 | 983 | $0.20 / $1.50 | 256K |
| 57 | 57-57 | Mistral · Proprietary | 1090 +23/-23 | 993 | $0.40 / $2 | 128K |

Leaderboard Plots

Fraction of Model A Wins for All Non-tied A vs. B Battles
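
A pairwise win fraction of the kind this plot title describes could be computed roughly as follows. This is a minimal sketch, not the leaderboard's own code: the battle record layout `(model_a, model_b, winner)`, the function name `win_fractions`, and the model names are all hypothetical.

```python
from collections import defaultdict

def win_fractions(battles):
    """Return {(x, y): share of non-tied x-vs-y battles that x won}.

    `battles` is an iterable of (model_a, model_b, winner) tuples, where
    winner is "a", "b", or "tie". This record layout is an assumption.
    """
    wins = defaultdict(int)  # wins[(x, y)] = number of times x beat y
    for a, b, winner in battles:
        if winner == "a":
            wins[(a, b)] += 1
        elif winner == "b":
            wins[(b, a)] += 1
        # Ties are skipped, matching "non-tied battles" in the title.

    fractions = {}
    for (x, y), won in wins.items():
        lost = wins.get((y, x), 0)
        total = won + lost
        fractions[(x, y)] = won / total
        # Also record the reverse direction, even if y never beat x.
        fractions.setdefault((y, x), lost / total)
    return fractions

# Hypothetical battles between placeholder models.
battles = [
    ("model-x", "model-y", "a"),
    ("model-x", "model-y", "b"),
    ("model-y", "model-x", "b"),
    ("model-x", "model-y", "tie"),
    ("model-y", "model-z", "a"),
]
fractions = win_fractions(battles)
print(fractions[("model-x", "model-y")])
```

Pairs that never met in a non-tied battle are simply absent from the result, which avoids dividing by zero.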

Confidence Intervals on Model Strength (via Bootstrapping)
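
Bootstrapped intervals like the `+11/-11` values in the table can be illustrated by resampling battles with replacement and recomputing a rating each time. The sketch below uses a simple online Elo update as a stand-in, because the page does not say which strength estimator the leaderboard uses. The record layout, function names, `k` factor, and base rating are all assumptions.

```python
import random

def elo_ratings(battles, k=4.0, base=1000.0):
    """Simple online Elo over (model_a, model_b, winner) records."""
    ratings = {}
    for a, b, winner in battles:
        ra = ratings.setdefault(a, base)
        rb = ratings.setdefault(b, base)
        expected_a = 1.0 / (1.0 + 10 ** ((rb - ra) / 400.0))
        score_a = {"a": 1.0, "b": 0.0, "tie": 0.5}[winner]
        delta = k * (score_a - expected_a)
        ratings[a] = ra + delta
        ratings[b] = rb - delta
    return ratings

def bootstrap_intervals(battles, rounds=200, alpha=0.05, seed=0):
    """Percentile bootstrap: resample battles, re-rate, take quantiles."""
    rng = random.Random(seed)
    battles = list(battles)
    samples = {}
    for _ in range(rounds):
        resample = rng.choices(battles, k=len(battles))
        for model, rating in elo_ratings(resample).items():
            samples.setdefault(model, []).append(rating)
    intervals = {}
    for model, values in samples.items():
        values.sort()
        low = values[int((alpha / 2) * (len(values) - 1))]
        high = values[int((1 - alpha / 2) * (len(values) - 1))]
        intervals[model] = (low, high)
    return intervals

# Hypothetical data: "strong" wins 40 of 50 battles against "weak".
battles = [("strong", "weak", "a")] * 40 + [("strong", "weak", "b")] * 10
intervals = bootstrap_intervals(battles)
for model, (low, high) in sorted(intervals.items()):
    print(model, round(low, 1), round(high, 1))
```

Models with fewer votes produce wider intervals under this scheme, which is consistent with the table, where rows with around 1,000 votes carry margins above 20 and rows with over 15,000 votes carry margins of about 6.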

Battle Count for Each Combination of Models (without Ties)

Average Win Rate Against All Other Models (Uniform Sampling and No Ties)
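
One reading of "uniform sampling and no ties" is that each opponent counts equally, regardless of how many battles were played against it. Under that assumption, which the page itself does not confirm, the average win rate could be sketched as below. The record layout and all names are hypothetical.

```python
from collections import defaultdict

def average_win_rates(battles):
    """Mean of per-opponent win fractions, with every opponent weighted equally.

    `battles` holds (model_a, model_b, winner) tuples with winner in
    {"a", "b", "tie"}; ties are ignored.
    """
    wins = defaultdict(int)       # wins[(x, y)] = number of times x beat y
    opponents = defaultdict(set)  # opponents met in non-tied battles
    for a, b, winner in battles:
        if winner == "tie":
            continue
        if winner == "a":
            wins[(a, b)] += 1
        elif winner == "b":
            wins[(b, a)] += 1
        else:
            raise ValueError(f"unexpected winner value: {winner!r}")
        opponents[a].add(b)
        opponents[b].add(a)

    averages = {}
    for model, faced in opponents.items():
        rates = []
        for other in faced:
            won = wins.get((model, other), 0)
            lost = wins.get((other, model), 0)
            rates.append(won / (won + lost))
        averages[model] = sum(rates) / len(rates)
    return averages

# Hypothetical battles between placeholder models.
battles = (
    [("p", "q", "a")] * 3
    + [("p", "q", "b")]
    + [("r", "p", "a")]
    + [("q", "r", "tie")]
)
averages = average_win_rates(battles)
print(averages)
```

Weighting opponents equally keeps a model's average from being dominated by whichever pairing happened to be sampled most often.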