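R1 is a model that excels at reasoning; it leads O1 but does not crush it. Its most impressive feature is still the use of RL in place of human effort for fine-tuning, proving once again that AI beats human labor.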

Source: uptrend, 2025-01-27 15:03:26
