How Does Innovation Happen?

Source: 兄貴, 2024-04-04 19:56:18

This post was edited by regular user 兄貴 at 2024-04-04 21:33:38.

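The revolutionary innovation behind ChatGPT comes from Google's Transformer architecture for large language models. What made it revolutionary was a brand-new attention model, and now everyone has moved to the Transformer's attention model. Before that, recurrent neural networks (RNNs) dominated. So who first had the idea for the attention model? This person: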

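Jakob Uszkoreit: graduated from Technische Universität Berlin (TU Berlin). No PhD. He interned at Google and later, while working there, had the intuition that attention would be faster and more effective than RNNs and better suited to parallel computation. The four authors of the first attention model paper (in the paper's author order):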

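Ankur Parikh: Indian; undergraduate: Princeton CS; PhD: CMU Machine Learning
Oscar Täckström: Swedish; undergraduate: Stockholm University, philosophy; PhD: Uppsala University, CS
Dipanjan Das: Indian; undergraduate: CMU CS; PhD: CMU CS, Language Technologies
Jakob Uszkoreit: German; undergraduate: TU Berlin CS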

They applied the attention model to language translation, where it soundly beat RNNs.
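For readers wondering why attention is said to suit parallel computation better than an RNN, here is a minimal sketch in plain Python. It assumes the scaled dot-product form of attention described in "Attention Is All You Need"; the function names `attention` and `rnn`, the toy scalar weights, and the example vectors are made up for illustration and are not taken from any of the authors' code.

```python
import math


def softmax(xs):
    """Turn raw scores into weights that are positive and sum to 1."""
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    d = len(keys[0])
    outputs = []
    # Each query is handled independently of the others, so on real
    # hardware all iterations of this loop can run at the same time.
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # Output is a weighted average of the value vectors.
        out = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(len(values[0]))]
        outputs.append(out)
    return outputs


def rnn(inputs, w_in=0.5, w_rec=0.5):
    """Toy scalar recurrent network (hypothetical weights)."""
    h = 0.0
    states = []
    # Step t needs the hidden state from step t-1, so this loop
    # has to run strictly in order.
    for x in inputs:
        h = math.tanh(w_in * x + w_rec * h)
        states.append(h)
    return states


tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
mixed = attention(tokens, tokens, tokens)  # self-attention: one output per token
```

The contrast is in the two loops: every position in `attention` looks at all positions directly and needs no result from a neighbouring position, while `rnn` must finish one step before it can start the next.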

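Jakob Uszkoreit believed attention could be applied not only to language but also to AGI, so the following eight people jointly published the famous paper "Attention Is All You Need", which has been hailed as the paper that changed the history of AI: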

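Jakob Uszkoreit: German; undergraduate: TU Berlin CS
Noam Shazeer: of German-Jewish ancestry, born in Philadelphia; undergraduate: Duke CS. He rewrote the entire Transformer code.
Lukasz Kaiser: Polish; undergraduate: University of Wroclaw CS; PhD: RWTH Aachen University CS; a core figure behind OpenAI's ChatGPT
Illia Polosukhin: Ukrainian; undergraduate: National Technical University of Ukraine, CS + applied mathematics
Ashish Vaswani: Indian; PhD: USC CS
Llion Jones: Welsh; undergraduate: University of Birmingham CS
Niki Parmar: Indian; Master of Science: USC CS (she and Ashish Vaswani are a couple)
Aidan Gomez: Canadian/British; undergraduate: University of Toronto CS; was Kaiser's intern and later did a PhD in CS at Oxford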

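A few observations of mine:

1) Almost none of them is American, but the work was done in the US.
2) Almost all are CS majors. People doing ML fall into two groups: those who studied CS and those who studied statistics. Everyone who worked on the Transformer studied CS.
3) The key figures Uszkoreit, Shazeer, Polosukhin, and Kaiser are all European.
4) Not many elite universities.
5) Indians are good at getting involved.
6) Half have PhDs and half do not. The key figures Uszkoreit, Polosukhin, and Shazeer hold only bachelor's degrees. Only 2 of the eight have PhDs; Gomez was an intern at the time and later went to Oxford for a PhD.
7) An internship can lead to big things too: Aidan Gomez, as an intern at Google, became world-famous because of this.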

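The greatest innovation of our time was produced by these people. The attention model is more revolutionary than ChatGPT, because ChatGPT is just one example of using the Transformer. The Transformer is also used in Gemini and other large language models, as well as in image and video AGI (which is more complex than text).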


All replies:

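• How are these not elite schools? Don't each country's top schools count... haha -Midwestrural- 04/04/2024 20:07:16

• You didn't say how it happens, lol. It's all still just brute-force deep learning, aka deep memorization, without much real intelligence -成功的三少爺- 04/04/2024 20:07:51

• The biggest contribution of attention is parallel computation, which improved engineering/computational efficiency -成功的三少爺- 04/04/2024 20:09:07

• Doing things fast and how deep the intelligence goes are completely different concepts -成功的三少爺- 04/04/2024 20:10:41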

• Inside story from WIRED -RomaVacation- 04/04/2024 20:15:55

• How come no ethnic Chinese got in on it? LOL -RomaVacation- 04/04/2024 20:18:45

• Ethnic Chinese contributed the most important link: NVIDIA's GPUs -zeno- 04/04/2024 20:27:30

• In other words, the huge increase in computing power. -Pilsung- 04/04/2024 20:33:29

• Right. Early on it was for gaming, later for crypto mining, so really all the things people consider frivolous turn out to have a positive effect. -zeno- 04/04/2024 20:35:59

• That's called hitting the mark by accident. Haha... -Pilsung- 04/04/2024 20:39:41

• Ashish Vaswani's PhD advisors at USC were all ethnic Chinese -gladys- 04/04/2024 20:41:45

• Most of them are from elite universities. Why "not a single American"? What is that supposed to show? I looked it up: Noam Shazeer was born in Philadelphia and went to Duke, so he counts as American, right? -sportfan- 04/04/2024 20:20:16

• How big an impact did Jason Wei's CoT have? -yddad- 04/04/2024 20:28:12

• The founding father of prompt engineering? LOL -zeno- 04/04/2024 20:32:34

• All CS -月色淺淺- 04/04/2024 20:29:42

• The original paper lists each person's contribution -roger_surfer- 04/04/2024 20:32:34

• Jakob had the idea, Noam did the coding, Illia and Ashish were heavily involved over the long term, Llion and Niki did the testing, Lukasz and Aidan did the integration -兄貴- 04/04/2024 21:20:41

• Isn't there one with a Princeton CS/math undergraduate degree, Ankur Parikh? -米湯- 04/04/2024 20:33:21

• Not one of the authors was born in the US, but they all came to the US and got all of this done in the US, right? (a research paper by Google) -gladys- 04/04/2024 20:35:04

• And most of them are in the US now and probably have US citizenship, so they count as Americans. After all, Americans are made up of people from all over the world who wanted to come to America. -gladys- 04/04/2024 20:43:01

• In the end they're all Americans, and their kids live, study, work, and start families in the US too... haha -Midwestrural- 04/04/2024 20:44:52

• Yes -gladys- 04/04/2024 20:46:22

• That's what I meant -兄貴- 04/04/2024 20:42:44

• The US really is impressive at attracting talent -月色淺淺- 04/04/2024 20:52:20

• Noam was born in Philadelphia; undergraduate at Duke -sportfan- 04/04/2024 21:32:48

• They all came to the US to solve problems... haha -Midwestrural- 04/04/2024 20:43:33

• Google is a great company, even if it's not well liked these days. -zeno- 04/04/2024 20:45:53

• Indeed... -Midwestrural- 04/04/2024 20:47:42

• The US keeps producing new great companies, generation after generation. China had promising ones too, but some private companies got weeded out by the Great Leader -gladys- 04/04/2024 20:48:00

• Haha... -Midwestrural- 04/04/2024 20:51:47

• Yeah, China's great companies have all been targeted by the great presidents Biden and Trump lol -manyworlds- 04/04/2024 22:18:56

• Trump tried but didn't pull it off. It was the Great Leader who killed them off, starting with Jack Ma -gladys- 04/04/2024 22:25:25

• Take that Google Brain group: they were the first to develop the technology, yet now Google AI is getting crushed by OpenAI. Isn't that the executive -Pilsung- 04/04/2024 20:53:29

• OpenAI did build ChatGPT, though. If that paper is the internet, ChatGPT is the web -zeno- 04/04/2024 21:00:01

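• People doing ML fall into two groups: those who studied CS and those who studied statistics. My point is that the Transformer people all studied CS -兄貴- 04/04/2024 20:48:43

• When CMU's ML department was first set up, its core was professors from the CS and statistics departments, and its PhD students came from the CS and statistics PhD programs plus its own research master's students -whaled- 04/04/2024 20:55:50

• You mean the AI major, right? There is also an ML concentration under the CS major -兄貴- 04/04/2024 21:03:55

• The ML department's predecessor was the Cent for Adv Lrning & Knowledge Discovery. Its people were CS and statistics professors. -whaled- 04/04/2024 21:09:42

• Also, an internship can lead to big things too: Aidan Gomez, as an intern at Google, became world-famous because of this -兄貴- 04/04/2024 21:34:02

• Great! Thanks for the good post. It feels like you've compiled the statistics but haven't yet dug into what drives these people. :-) -有言- 04/04/2024 21:50:49

• The attention mechanism was first applied to speech translation; that makes so much sense -pct- 04/04/2024 22:36:42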
