DeepSeek Suspected of Copying ChatGPT: A Comparative Analysis of Technology and Data Sources

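Since DeepSeek appeared, there has been growing discussion about whether it copied ChatGPT's technology during development. Based on comparison experiments, this post examines whether DeepSeek borrowed ChatGPT's technology and points out where the two may be similar or different in technical implementation.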

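I. Approach to Verifying DeepSeek

In general, the most direct way to check whether two systems are the same is to compare their outputs under the same input conditions. If two systems give exactly the same answer to the same question, one can infer that they are highly similar in algorithm or architecture, and possibly identical. This study mainly uses the following two verification methods: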

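  1. Information consistency check
    First, retrieve the same information from two different databases and observe the output. If the two databases return exactly the same result, their underlying structure is very likely the same.
  2. Use of the special variable [MASK]
    The special variable [MASK] is used to obtain candidate words and test whether the two algorithms are equivalent. Specifically, [MASK] is a placeholder indicating that a word needs to be filled in at that position. The model uses the other words in the sentence (the context) to predict the most suitable word and substitutes it for [MASK]. Comparing how DeepSeek and ChatGPT fill the [MASK] position for the same input tests whether their inference mechanisms are consistent.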
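
The fill-in comparison described in method 2 can be sketched in a few lines of Python. This is a minimal sketch, not the procedure the post actually used: the helper names `extract_fills` and `same_fills` are hypothetical, and it assumes each model returns the full sentence with every [MASK] slot replaced and the rest of the text left unchanged.

```python
import re

MASK = "[MASK]"

def extract_fills(template, completed):
    """Return the text a model put in each [MASK] slot, in order.

    Returns None when the completed sentence does not match the
    template outside the slots (assumption: the model echoes the
    rest of the sentence verbatim).
    """
    parts = template.split(MASK)
    # Escape the fixed text and capture whatever sits in each slot.
    pattern = "(.+?)".join(re.escape(p) for p in parts)
    match = re.fullmatch(pattern, completed, flags=re.DOTALL)
    return list(match.groups()) if match else None

def same_fills(template, output_a, output_b):
    """True when both outputs match the template and fill every slot identically."""
    fills_a = extract_fills(template, output_a)
    fills_b = extract_fills(template, output_b)
    return fills_a is not None and fills_a == fills_b

template = ("Up to 30 [MASK] and babies died at Furness General Hospital "
            "because of failings by staff and management, a damning report "
            "is [MASK] to reveal.")
completed = ("Up to 30 mothers and babies died at Furness General Hospital "
             "because of failings by staff and management, a damning report "
             "is expected to reveal.")
print(extract_fills(template, completed))  # ['mothers', 'expected']
```

When a model replies with only the fill word instead of the whole sentence, `extract_fills` returns None, so replies of that kind would need to be compared directly as strings.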

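A random draw of 67 test samples was used for a side-by-side check, and DeepSeek and ChatGPT turned out to be highly similar. Specific examples and their results follow.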

Example 1

Input sentence:
Up to 30 [MASK] and babies died at Furness General Hospital because of failings by staff and management, a damning report is [MASK] to reveal.

ChatGPT output:
Up to 30 mothers and babies died at Furness General Hospital because of failings by staff and management, a damning report is expected to reveal.

DeepSeek output:
Up to 30 mothers and babies died at Furness General Hospital because of failings by staff and management, a damning report is expected to reveal.

In this example, DeepSeek and ChatGPT fill the [MASK] positions with exactly the same words, "mothers" and "expected".

Example 2

Input sentence:
Taking in the sights of [MASK] is nothing short of [MASK], with its famous souk which houses over 60,000 stalls full of colorful handmade wares.

ChatGPT output:
Taking in the sights of Marrakech is nothing short of breathtaking, with its famous souk which houses over 60,000 stalls full of colorful handmade wares.

DeepSeek output:
Taking in the sights of Marrakech is nothing short of breathtaking, with its famous souk which houses over 60,000 stalls full of colorful handmade wares.

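In this example too, DeepSeek and ChatGPT give exactly the same predicted words, "Marrakech" and "breathtaking", which confirms that the two are consistent in inference and prediction.

Example 3

Input sentence: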
Ultimately, the goal of life is a [MASK] that each individual must define for themselves based on their own beliefs, values, and experiences.

ChatGPT output:
Ultimately, the goal of life is a journey that each individual must define for themselves based on their own beliefs, values, and experiences.

DeepSeek output:
Ultimately, the goal of life is a journey that each individual must define for themselves based on their own beliefs, values, and experiences.

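In this example, the fill word "journey" is identical in the DeepSeek and ChatGPT outputs.

Examples 4 and 5 compare longer passages made up of several parts. A difference in any small part can change the information, and a different algorithm can also change the result. But if the information is identical and the algorithm is identical too, the results should be the same.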
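
Because any small difference counts as a mismatch in this kind of whole-output comparison, it helps to decide up front which differences are only formatting. The following is a minimal sketch under one assumption: surrounding quotation marks and runs of whitespace are ignored, while wording and letter case still count. `normalize` and `outputs_match` are hypothetical helper names, not part of the original test.

```python
import unicodedata

# Straight and curly quotation marks that may wrap a model's reply.
QUOTES = "\"'\u201c\u201d\u2018\u2019"

def normalize(text):
    """Canonicalize formatting only: Unicode form, outer quotes, whitespace."""
    text = unicodedata.normalize("NFKC", text)
    text = text.strip().strip(QUOTES)
    # Collapse every run of whitespace into a single space.
    return " ".join(text.split())

def outputs_match(output_a, output_b):
    """Exact comparison after formatting-only normalization (case-sensitive)."""
    return normalize(output_a) == normalize(output_b)

print(outputs_match('"cheap"', " cheap "))  # True
print(outputs_match("cheap", "Cheap"))      # False
```

Keeping the comparison case-sensitive is a deliberate choice in this sketch, so that a change in wording or capitalization still registers as a difference.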

Example 4

Input sentence:
I purchased the variety pack of Martinsons and the Brown Gold they are both from the same vendor. I did not like the taste of the Martinson product at all. I threw it away, not the product I expected at all. The Brown Gold was more palatable, however certain flavors with this brand also feel short in particular the Costa Rican named product was extremely too bitter. I would agree with most that this does give you a lower cost per serving, but when the coffee is inferior and the taste match is also, I think the best descriptor is [MASK].

ChatGPT output:
cheap

DeepSeek output:
cheap

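This example shows both systems filling the [MASK] position with "cheap", further evidence that they are the same in algorithm and inference mechanism.

Example 5

Input sentence: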
Not bad. "These are small and very salty. The taste is good, but very strong, so it's a good thing the package contains a small amount. It only takes a few little crisps to cure my salty/crunchy craving. I can snack on one package for an entire day. Of course, these would not be a good snack if you're very hungry, because there isn't enough there to fill you up. For less than $1 per pack, it's an [MASK].

ChatGPT output:
"For less than $1 per pack, it's an okay deal."

DeepSeek output:
"For less than $1 per pack, it's an okay deal."

In this example, DeepSeek and ChatGPT complete the [MASK] position identically, both returning the sentence "For less than $1 per pack, it's an okay deal."

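From the comparison experiments and technical analysis above, the conclusion is that in the tests using the [MASK] variable, DeepSeek and ChatGPT produced identical output on every sample, indicating that they use the same inference algorithm, technical framework, and data sources. Given this high degree of similarity between DeepSeek and ChatGPT, DeepSeek's technology may involve copying.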
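
The tally behind a statement such as "identical on every sample" can be sketched as an exact-match rate. This is only an illustration: `agreement_rate` is a hypothetical helper, and the list below holds just the five results quoted in this post, since the rest of the 67 samples are not shown.

```python
def agreement_rate(pairs):
    """Fraction of (output_a, output_b) pairs that are exactly equal.

    Leading and trailing whitespace is ignored; everything else counts.
    """
    pairs = list(pairs)
    if not pairs:
        raise ValueError("no samples to compare")
    matches = sum(1 for a, b in pairs if a.strip() == b.strip())
    return matches / len(pairs)

# (ChatGPT result, DeepSeek result) for Examples 1-5 as quoted in the post.
quoted_examples = [
    ("mothers / expected", "mothers / expected"),
    ("Marrakech / breathtaking", "Marrakech / breathtaking"),
    ("journey", "journey"),
    ("cheap", "cheap"),
    ("For less than $1 per pack, it's an okay deal.",
     "For less than $1 per pack, it's an okay deal."),
]
print(agreement_rate(quoted_examples))  # 1.0
```

The same function can be applied to outputs from any pair of systems, so rates for other model pairs can be computed the same way.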

 





All replies:

deep fake -青雨紫煙- 02/04/2025 20:30:21

Fake news couldn't scare Trump, but deep fake has them all scared :) -manyworlds- 02/04/2025 22:05:10

The source code has been made public. Can't you tell from it whether anything was copied? -波粒子3- 02/04/2025 20:47:28

Even if it was copied, look how badly it has scared the US and Europe. They scare far too easily, lol -manyworlds- 02/04/2025 22:03:21

That is possible. There are two possibilities: 1. It copied META's, which is public. 2. It copied ChatGPT's, and since that one is closed source, someone on the inside would have had to steal it. -精木- 02/04/2025 21:06:58

In fact, for many products, Apple's included, the market is full of counterfeits whose quality and performance are almost identical to the real thing. They even have serial numbers. Ask, and you are told that insiders smuggled them out. -精木- 02/04/2025 21:13:20

Even the software inside has been stolen. This shows that opening factories or setting up R&D branches in CCP-ruled China is courting disaster and handing trade secrets over to others. -精木- 02/04/2025 21:16:04

-又當爹來又當媽- 02/05/2025 03:27:46

Like ChatGPT, DeepSeek corrects itself when an error is pointed out. The original poster, on the other hand, refuses to admit any mistake. Which is more advanced? -監考老師- 02/05/2025 03:50:24

Unless these fill-in-the-blank questions are also run on several other systems, they prove nothing. -監考老師- 02/05/2025 03:52:26

For example, I just gave Grok the Furness General Hospital question and got the same answer. So GPT must have copied Grok? -監考老師- 02/05/2025 15:43:54

What a genius -lue96500- 02/06/2025 18:20:39
