Comparison of OpenAI o3 and DeepSeek R1 scores on math benchmarks:

Source: posted by 大觀園的賈探春 on 2025-02-02 00:40:34

ChatGPT's answer:

Here's the comparison of math-related benchmark performance between OpenAI o3 and DeepSeek R1, excluding any benchmarks where one of the models’ numbers is unknown:

AIME 2024:
- OpenAI o3: 96.7% accuracy
- DeepSeek R1: 79.8%

Graduate-Level Google-Proof Q&A (GPQA) Benchmark:
- OpenAI o3: 87.7%
- DeepSeek R1: 71.5%

The MATH-500 benchmark data is only available for DeepSeek R1, so no comparison is made for that benchmark.
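
For anyone who wants the gap spelled out, here is a minimal sketch that subtracts the two models' scores on each benchmark listed above. The numbers are copied from the answer as posted; the `scores` dictionary and the `gap_in_points` helper are hypothetical names made up for this illustration, not part of any benchmark tooling.

```python
# Scores copied from the answer above (percent), not independently verified.
scores = {
    "AIME 2024": {"OpenAI o3": 96.7, "DeepSeek R1": 79.8},
    "GPQA": {"OpenAI o3": 87.7, "DeepSeek R1": 71.5},
}


def gap_in_points(benchmark_scores):
    """Return the o3 score minus the R1 score, in percentage points.

    Rounded to one decimal place to hide floating-point noise.
    """
    diff = benchmark_scores["OpenAI o3"] - benchmark_scores["DeepSeek R1"]
    return round(diff, 1)


# Map each benchmark name to the o3-minus-R1 gap.
gaps = {name: gap_in_points(s) for name, s in scores.items()}

for name, gap in gaps.items():
    print(f"{name}: o3 leads by {gap} percentage points")
```

On the figures quoted in this post, the lead works out to 16.9 points on AIME 2024 and 16.2 points on GPQA. These are differences in percentage points, not relative improvements.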

