A dialogue with ChatGPT about DeepSeek.

QQ: Who are you?

 

ChatGPT: I'm ChatGPT, an AI designed to assist with answering questions, brainstorming ideas, and having conversations. What's on your mind?

 

QQ: How do you think about deepseek? Does it copy you in any sense?


ChatGPT: DeepSeek is another AI model developed independently, and while it may share similarities with me in terms of being a large language model, it isn't a direct copy. Many modern AI models are built using similar underlying techniques, such as transformer architectures and large-scale training on diverse datasets, but the implementation details, training data, and fine-tuning approaches can vary significantly between models.

 

ChatGPT: Do you have any specific concerns or comparisons in mind?

 

QQ: I'd like to know some technical details comparing DeepSeek and you.

 

ChatGPT: Certainly! Let's delve into a technical comparison between DeepSeek and ChatGPT, focusing on their architectures, training methodologies, and performance characteristics.

 

Architecture and Model Design

  • ChatGPT: Developed by OpenAI, ChatGPT is based on the GPT (Generative Pre-trained Transformer) architecture. It uses a dense transformer model in which all parameters are active during inference, which consumes substantial computational resources.
  • DeepSeek: DeepSeek employs a "Mixture-of-Experts" (MoE) architecture, which activates only a subset of its parameters during inference. This design improves computational efficiency by engaging only the "experts" needed for a given token, reducing resource usage and improving inference speed (a minimal gating sketch follows this list).
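
To make the architectural difference concrete, here is a minimal, illustrative sketch in PyTorch (not DeepSeek's or OpenAI's actual code) of a dense feed-forward block versus a top-k Mixture-of-Experts block: a router keeps only k experts per token, so most expert parameters stay idle on any given forward pass. All class names, sizes, and the routing scheme are made up for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DenseFFN(nn.Module):
    """Standard transformer feed-forward block: every parameter is used for every token."""

    def __init__(self, d_model=512, d_hidden=2048):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(d_model, d_hidden), nn.GELU(), nn.Linear(d_hidden, d_model)
        )

    def forward(self, x):
        return self.net(x)


class TopKMoE(nn.Module):
    """Mixture-of-Experts block: a router picks k experts per token, so only
    k / num_experts of the expert parameters are active for that token."""

    def __init__(self, d_model=512, d_hidden=2048, num_experts=8, k=2):
        super().__init__()
        self.experts = nn.ModuleList([DenseFFN(d_model, d_hidden) for _ in range(num_experts)])
        self.router = nn.Linear(d_model, num_experts)
        self.k = k

    def forward(self, x):                           # x: (tokens, d_model)
        scores = self.router(x)                     # (tokens, num_experts)
        weights, idx = scores.topk(self.k, dim=-1)  # keep the k best experts per token
        weights = F.softmax(weights, dim=-1)        # normalize over the chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.k):                  # plain loops for readability, not speed
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e            # tokens routed to expert e in this slot
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out


tokens = torch.randn(4, 512)
print(TopKMoE()(tokens).shape)  # torch.Size([4, 512]); only 2 of the 8 experts ran per token
```

Production MoE layers add load-balancing losses and batched expert dispatch; the nested loop above is only there to make the routing idea visible.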

 

Training Methodology

  • ChatGPT: Training involves large-scale pretraining on internet text, followed by supervised fine-tuning and reinforcement learning from human feedback (RLHF) to align the model's outputs with human preferences.
  • DeepSeek: DeepSeek's training strategy includes pretraining on a diverse multilingual corpus with a heavier emphasis on mathematical and programming data. The model then undergoes supervised fine-tuning on both human-generated and model-generated data, particularly in reasoning domains, and reinforcement learning is applied to further refine its performance (a simplified two-phase sketch follows this list).
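
As a rough illustration of the two-phase recipe both bullets describe, here is a toy sketch: a supervised fine-tuning step that minimizes next-token cross-entropy on demonstration text, and a REINFORCE-style step that samples a continuation, scores it with a reward function, and reinforces the sampled tokens. The tiny model, the reward function, and all hyperparameters are stand-ins; real RLHF and reasoning-RL pipelines are considerably more elaborate.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

vocab, d_model = 1000, 64
# A deliberately tiny "language model": embedding + linear head, no attention.
model = nn.Sequential(nn.Embedding(vocab, d_model), nn.Linear(d_model, vocab))
opt = torch.optim.AdamW(model.parameters(), lr=1e-4)


def sft_step(tokens):
    """Supervised fine-tuning: next-token cross-entropy on a demonstration sequence."""
    logits = model(tokens[:-1])                # predict token t+1 from token t
    loss = F.cross_entropy(logits, tokens[1:])
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()


def rl_step(prompt, reward_fn, length=8):
    """REINFORCE-style step: sample a continuation, score it with reward_fn, and
    push up the log-probability of the sampled tokens in proportion to the reward."""
    generated, log_probs = [prompt], []
    for _ in range(length):
        logits = model(torch.tensor(generated[-1:]))   # condition on the last token only
        dist = torch.distributions.Categorical(logits=logits[-1])
        tok = dist.sample()
        log_probs.append(dist.log_prob(tok))
        generated.append(tok.item())
    reward = reward_fn(generated)                      # e.g. did the answer pass a check?
    loss = -reward * torch.stack(log_probs).sum()
    opt.zero_grad()
    loss.backward()
    opt.step()
    return reward


demo = torch.randint(0, vocab, (16,))
print("SFT loss:", sft_step(demo))
print("reward  :", rl_step(prompt=1, reward_fn=lambda toks: float(len(set(toks)) > 4)))
```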

 

Performance and Efficiency

  • ChatGPT: While it delivers high-quality responses, ChatGPT's dense architecture incurs higher computational costs at inference time, which affects scalability and energy efficiency.
  • DeepSeek: DeepSeek's MoE architecture yields significant computational savings, since only the relevant experts are activated per token, which lowers energy consumption and operational cost. Notably, DeepSeek has reported performance on par with leading models such as OpenAI's GPT series at a fraction of the training and inference cost (a back-of-the-envelope calculation follows this list).
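
A back-of-the-envelope calculation shows where the claimed savings come from. The numbers below use the publicly reported parameter counts for DeepSeek-V3/R1 (671B total, roughly 37B activated per token) and the standard approximation that a decoder forward pass costs about 2 FLOPs per active parameter per token; the equally sized dense baseline is hypothetical, and memory and attention-cache costs are ignored.

```python
# Publicly reported parameter counts for DeepSeek-V3/R1; the dense baseline is hypothetical.
total_params   = 671e9   # all experts + shared layers
active_params  = 37e9    # parameters actually engaged for a single token
dense_baseline = 671e9   # a dense model of the same total size uses everything

# Rough rule of thumb: a decoder forward pass costs ~2 FLOPs per active parameter per token.
moe_flops_per_token   = 2 * active_params
dense_flops_per_token = 2 * dense_baseline

print(f"active fraction per token: {active_params / total_params:.1%}")                # ~5.5%
print(f"inference FLOPs vs dense : {moe_flops_per_token / dense_flops_per_token:.1%}")
```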

 

Open-Source Availability

  • ChatGPT: OpenAI's models, including ChatGPT, are proprietary, with access provided through APIs and specific platforms.
  • DeepSeek: In contrast, DeepSeek has embraced an open-source approach, releasing model weights such as DeepSeek-R1 to the public. This openness fosters collaborative innovation and allows broader adoption and adaptation within the AI community (a minimal loading sketch follows this list).
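
In practice, the open weights mean anyone can pull a released checkpoint and run it locally. The sketch below uses the Hugging Face transformers library and a small distilled variant (the full R1 checkpoint is far too large for a single GPU); treat the exact hub ID and the generation settings as assumptions to verify against the model card.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed hub ID for one of the openly released distilled checkpoints.
model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")

prompt = "Explain in one sentence what a Mixture-of-Experts model is."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```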

 

Conclusion

While both DeepSeek and ChatGPT are advanced language models excelling in natural language understanding and generation, they differ in architectural design and training methodologies.

DeepSeek's MoE architecture offers a more resource-efficient alternative to the dense transformer model employed by ChatGPT, potentially democratizing access to advanced AI capabilities through its open-source model.

 

DeepSeek's Impact on the AI Industry

wired.com: How Chinese AI Startup DeepSeek Made a Model that Rivals OpenAI (Jan 25, 2025)

theverge.com: Why everyone is freaking out about DeepSeek (Jan 28, 2025)

theguardian.com: Tuesday briefing: How an unknown Chinese startup wiped $593bn from the value of an AI giant (Jan 27, 2025)

All replies:

Good morning, Brother Fang, happy New Year! Haven't used DS yet, but DS news seems to change daily -最西邊的島上- 01/30/2025 10:28:34

You probably have seen the discussions in 投壇 & 幾壇. To me, DS's bias is -最西邊的島上- 01/30/2025 10:30:55

a killer, if what others found is true. Not sure about SW piracy now -最西邊的島上- 01/30/2025 10:35:23

DeepSeek R1 is now available on Azure AI Foundry and GitHub -方外居士- 01/30/2025 10:38:58

Microsoft makes Deepseek's R1 model available on Azure AI a -方外居士- 01/30/2025 10:40:24

Just checked, MSFT stock dropped 6+% today so far. Wow! -最西邊的島上- 01/30/2025 10:47:21

Microsoft and Meta Have a DeepSeek Strategy: Copy and Surpas -方外居士- 01/30/2025 11:01:22

"potentially democratizing access to advanced AI capabilities" -暖冬cool夏- 01/30/2025 12:33:15

Well said! I downloaded DeepSeek last night too! Happy New Year, Brother Fang! -暖冬cool夏- 01/30/2025 12:33:41

I haven't downloaded it yet. When would I use it? -妖妖靈- 01/30/2025 12:54:34

You can use it anytime, whenever you have a question or aren't sure :) Use it the way you'd use ChatGPT, I think. Happy New Year, 妖mm! -暖冬cool夏- 01/30/2025 13:18:15

DS wiped $593bn. As the saying goes, subduing the enemy without fighting. -妖妖靈- 01/30/2025 12:49:28

If that's really the case, the bubble is big indeed -方外居士- 01/30/2025 14:03:54

I downloaded DS, but lately it's been pretty slow. -暮子默- 01/30/2025 16:46:23

Yes, today it has become very slow. Yesterday it was OK. -方外居士- 01/31/2025 07:50:15
