據說 (金融時報， Published Aug 13 2025）

uptrend · 2025-08-30 12:23:13Z

據說 (金融時報， Published Aug 13 2025）簡介

來源: uptrend 於 2025-08-30 12:23:13 [檔案] [博客] [舊帖] [給我悄悄話] 閱讀數 : (4132 bytes)

本帖於 2025-08-30 12:29:35 時間, 由普通用戶 uptrend 編輯

https://www.ft.com/content/eb984646-6320-4bfe-a78d-a1da2274b092

Chinese artificial intelligence company DeepSeek delayed the release of its new model after failing to train it using Huawei’s chips, highlighting the limits of Beijing’s push to replace US technology.

DeepSeek was encouraged by authorities to adopt Huawei’s Ascend processor rather than use Nvidia’s systems after releasing its R1 model in January, according to three people familiar with the matter.

But the Chinese start-up encountered persistent technical issues during its R2 training process using Ascend chips, prompting it to use Nvidia chips for training and Huawei’s for inference, said the people.

The issues were the main reason the model’s launch was delayed from May, said a person with knowledge of the situation, causing it to lose ground to rivals.

Training involves the model learning from a large dataset, while inference refers to the step of using a trained model to make predictions or generate a response, such as a chatbot query.

DeepSeek’s difficulties show how Chinese chips still lag behind their US rivals for critical tasks, highlighting the challenges facing China’s drive to be technologically self-sufficient.

The Financial Times this week reported that Beijing has demanded that Chinese tech companies justify their orders of Nvidia’s H20, in a move to encourage them to promote alternatives made by Huawei and Cambricon.

Industry insiders have said the Chinese chips suffer from stability issues, slower inter-chip connectivity and inferior software compared with Nvidia’s products.

Huawei sent a team of engineers to DeepSeek’s office to help the company use its AI chip to develop the R2 model, according to two people. Yet despite having the team on site, DeepSeek could not conduct a successful training run on the Ascend chip, said the people.

DeepSeek was still working with Huawei to make the model compatible with Ascend for inference, the people said.

Founder Liang Wenfeng had said internally he was dissatisfied with R2’s progress and had been pushing to spend more time to build an advanced model that can sustain the company’s lead in the AI field, they said.

The R2 launch was also delayed because of longer-than-expected data labelling for its updated model, another person added. Chinese media reports have suggested that the model may be released as soon as in the coming weeks.

“Models are commodities that can be easily swapped out,” said Ritwik Gupta, an AI researcher at the University of California, Berkeley. “A lot of developers are using Alibaba’s Qwen3, which is powerful and flexible.”

Gupta noted that Qwen3 adopted DeepSeek’s core concepts, such as its training algorithm that makes the model capable of reasoning, but made them more efficient to use.

Gupta, who tracks Huawei’s AI ecosystem, said the company was facing “growing pains” in using Ascend for training, though he expects the Chinese national champion to adapt eventually.

“Just because we’re not seeing leading models trained on Huawei today doesn’t mean it won’t happen in the future. It’s a matter of time,” he said.

Nvidia, a chipmaker at the centre of a geopolitical battle between Beijing and Washington, recently agreed to give the US government a cut of its revenues in China in order to resume sales of its H20 chips to the country.

“Developers will play a crucial role in building the winning AI ecosystem,” said Nvidia about Chinese companies using its chips. “Surrendering entire markets and developers would only hurt American economic and national security.”

DeepSeek and Huawei did not respond to a request for comment.