I just asked ChatGPT the same question and got the exact opposite answer. Strange!
Q: Is DeepSeek's R1 model distilled from ChatGPT?
A: No, DeepSeek's R1 model is not distilled from ChatGPT. The R1 model, developed by DeepSeek, is built independently with its own architecture and training process. While both DeepSeek's R1 model and ChatGPT belong to the family of large language models (LLMs), they are separate developments created by different organizations, each with unique design choices and training methodologies.
DeepSeek’s R1 is based on a Mixture-of-Experts (MoE) architecture, which activates a subset of the model’s parameters during inference to enhance computational efficiency. This differs from the dense transformer model used by ChatGPT, where all parameters are active during inference.
DeepSeek models, including R1, are generally trained on diverse multilingual data and fine-tuned for various specialized tasks, which may include advanced reasoning and math. Although both models are designed for similar tasks, like text generation, they have distinct technical foundations and are not directly derived from one another.
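For anyone unfamiliar with the MoE vs. dense distinction ChatGPT mentions above, here is a minimal toy sketch in Python. The expert count, layer sizes, and top-k routing scheme are illustrative assumptions of how MoE gating works in general, not DeepSeek's actual implementation:

```python
# Toy comparison: a dense feed-forward layer vs. a top-k MoE layer.
# All shapes and the expert count are made-up illustration values.
import numpy as np

rng = np.random.default_rng(0)

D_MODEL, D_HIDDEN = 16, 32
N_EXPERTS, TOP_K = 4, 2

def ffn(x, w_in, w_out):
    """One feed-forward expert: linear -> ReLU -> linear."""
    return np.maximum(x @ w_in, 0.0) @ w_out

# Dense layer: a single FFN whose parameters are ALL used for every token.
dense_w_in = rng.normal(size=(D_MODEL, D_HIDDEN))
dense_w_out = rng.normal(size=(D_HIDDEN, D_MODEL))

def dense_layer(x):
    return ffn(x, dense_w_in, dense_w_out)

# MoE layer: N_EXPERTS independent FFNs plus a router. Only the TOP_K
# experts the router picks actually run for a given token, so most of
# the layer's parameters stay inactive at inference time.
experts = [
    (rng.normal(size=(D_MODEL, D_HIDDEN)), rng.normal(size=(D_HIDDEN, D_MODEL)))
    for _ in range(N_EXPERTS)
]
router_w = rng.normal(size=(D_MODEL, N_EXPERTS))

def moe_layer(x):
    logits = x @ router_w                    # router score for each expert
    top = np.argsort(logits)[-TOP_K:]        # indices of the TOP_K best experts
    weights = np.exp(logits[top])
    weights /= weights.sum()                 # softmax over the chosen experts only
    # Weighted sum of the selected experts' outputs; the rest never run.
    return sum(w * ffn(x, *experts[i]) for w, i in zip(weights, top))

token = rng.normal(size=D_MODEL)
print("dense out:", dense_layer(token)[:3])
print("moe   out:", moe_layer(token)[:3])
```

The routing step is the whole point: only TOP_K of the N_EXPERTS feed-forward blocks execute per token, which is how an MoE model can carry far more total parameters than it activates on any single inference pass.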