"I came up with this whole idea while optimizing wllama to run deepseek-r1-distilled-qwen-1.5B faster. So the bigger deepseek helping optimize code to run the smaller deepseek."
He said so himself.
All replies:
•
This has nothing to do with GPU instruction optimization.
-BeyondWind-
01/29/2025
17:27:18
•
the bigger deepseek helping optimize code to run the smaller
-cn_abcd-
01/29/2025
17:34:54