Reportedly (Financial Times, Published Aug 13 2025)

This post was edited by regular user uptrend at 2025-08-30 12:29:35

https://www.ft.com/content/eb984646-6320-4bfe-a78d-a1da2274b092

 Chinese artificial intelligence company DeepSeek delayed the release of its new model after failing to train it using Huawei’s chips, highlighting the limits of Beijing’s push to replace US technology.

DeepSeek was encouraged by authorities to adopt Huawei’s Ascend processor rather than use Nvidia’s systems after releasing its R1 model in January, according to three people familiar with the matter.

But the Chinese start-up encountered persistent technical issues during its R2 training process using Ascend chips, prompting it to use Nvidia chips for training and Huawei’s for inference, said the people.

The issues were the main reason the model’s launch was delayed from May, said a person with knowledge of the situation, causing it to lose ground to rivals.

Training involves the model learning from a large dataset, while inference refers to the step of using a trained model to make predictions or generate a response, such as a chatbot query.

DeepSeek’s difficulties show how Chinese chips still lag behind their US rivals for critical tasks, highlighting the challenges facing China’s drive to be technologically self-sufficient.

The Financial Times this week reported that Beijing has demanded that Chinese tech companies justify their orders of Nvidia’s H20, in a move to encourage them to promote alternatives made by Huawei and Cambricon.

Industry insiders have said the Chinese chips suffer from stability issues, slower inter-chip connectivity and inferior software compared with Nvidia’s products.

Huawei sent a team of engineers to DeepSeek’s office to help the company use its AI chip to develop the R2 model, according to two people. Yet despite having the team on site, DeepSeek could not conduct a successful training run on the Ascend chip, said the people.

DeepSeek was still working with Huawei to make the model compatible with Ascend for inference, the people said.

Founder Liang Wenfeng had said internally he was dissatisfied with R2’s progress and had been pushing to spend more time to build an advanced model that can sustain the company’s lead in the AI field, they said.

The R2 launch was also delayed because of longer-than-expected data labelling for its updated model, another person added. Chinese media reports have suggested that the model may be released as soon as in the coming weeks.

“Models are commodities that can be easily swapped out,” said Ritwik Gupta, an AI researcher at the University of California, Berkeley. “A lot of developers are using Alibaba’s Qwen3, which is powerful and flexible.”

Gupta noted that Qwen3 adopted DeepSeek’s core concepts, such as its training algorithm that makes the model capable of reasoning, but made them more efficient to use.

Gupta, who tracks Huawei’s AI ecosystem, said the company was facing “growing pains” in using Ascend for training, though he expects the Chinese national champion to adapt eventually.

“Just because we’re not seeing leading models trained on Huawei today doesn’t mean it won’t happen in the future. It’s a matter of time,” he said.

Nvidia, a chipmaker at the centre of a geopolitical battle between Beijing and Washington, recently agreed to give the US government a cut of its revenues in China in order to resume sales of its H20 chips to the country.

“Developers will play a crucial role in building the winning AI ecosystem,” said Nvidia about Chinese companies using its chips. “Surrendering entire markets and developers would only hurt American economic and national security.”

DeepSeek and Huawei did not respond to a request for comment.

All replies:

Training on a completely new system is bound to cause delays -wuming2007- (0 bytes) 08/30/2025 12:41:55

But Liang Wenfeng himself said the results exceeded expectations -wuming2007- (0 bytes) 08/30/2025 12:43:20

If Huawei can't catch up, China's AI development will be in trouble -dividend_growth- (36 bytes) 08/30/2025 12:44:18

It is only a matter of time before Nvidia's chips are replaced, though not necessarily this year or even next -大海的聲音- (0 bytes) 08/30/2025 12:46:21

The US restricting exports of high-end AI chips to China is a strategic mistake -大海的聲音- (0 bytes) 08/30/2025 12:48:59

Not restricting them would be even worse, just as with high-end engines -無為其所不為- (0 bytes) 08/30/2025 13:46:01

It's a win either way -Pilot007- (0 bytes) 08/30/2025 13:57:37

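The problem is how the restrictions are applied. They want to restrict and make money at the same time, so each round of restrictions comes out bit by bit, like squeezing toothpaste. That gave China the chance to grow stronger. -破界- (0 bytes) 08/31/2025 02:27:00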

Didn't v3.1 come out on August 21? -zhoufang- (60397 bytes) 08/30/2025 14:03:48

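v3.1 isn't that good. Its context window is too small, so it's not very practical. If you go by benchmarks, even gpt-4 beats gpt-5 in places, not to mention that models can be trained to game the tests. -uptrend- (0 bytes) 08/30/2025 14:47:53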

What’s your point? Talking delay or performance? -orleans- (0 bytes) 08/30/2025 14:59:52
