A comparison (single-chip figures)


https://www.tomshardware.com/tech-industry/artificial-intelligence/deepseek-research-suggests-huaweis-ascend-910c-delivers-60-percent-nvidia-h100-inference-performance

Ascend 910C vs NVIDIA H100 vs AMD MI300X

Specification           | Huawei Ascend 910C                   | NVIDIA H100 (SXM5)              | AMD MI300X
FP16 performance        | 800 TFLOPS                           | 989 TFLOPS (1,979 w/ sparsity)  | 1,307.4 TFLOPS (2,614.8 w/ sparsity)
INT8 performance        | ~1,600 TOPS                          | ~1,979 TOPS (3,958 w/ sparsity) | 2,614.9 TOPS (5,229.8 w/ sparsity)
Memory                  | 128 GB HBM3                          | 80 GB HBM3                      | 192 GB HBM3e
Memory bandwidth        | 3.2 TB/s                             | 3.35 TB/s                       | 5.3 TB/s
Power consumption (TDP) | ~310 W (potentially higher)          | up to 700 W                     | 750 W
Software ecosystem      | CANN, MindSpore, PyTorch, TensorFlow | CUDA, cuDNN, TensorRT           | ROCm, HIP

Note: NVIDIA and AMD often quote headline throughput with 2:4 structured sparsity enabled, which doubles the dense figure; dense compute numbers are used above for a more direct comparison where possible.
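
Because batch-1 decode of a large model is usually limited by memory bandwidth rather than raw FLOPS, the spec table above can be sanity-checked with a simple roofline-style estimate. The sketch below is illustrative only: the 70B-parameter FP16 model and the 2-FLOPs-per-parameter rule of thumb are assumptions, and the chip numbers are just the dense figures from the table.

```python
# Roofline-style sanity check: is single-stream decode compute-bound or
# memory-bandwidth-bound on each chip? Dense FP16 TFLOPS and bandwidth
# come from the table above; everything about the workload is an
# illustrative assumption, not a measurement.

CHIPS = {
    "Ascend 910C": (800.0, 3.2),     # (dense FP16 TFLOPS, bandwidth TB/s)
    "H100 SXM5":   (989.0, 3.35),
    "MI300X":      (1307.4, 5.3),
}

# Assumed workload: batch-1 decode of a 70B-parameter FP16 model.
# Each generated token streams all weights once (~140 GB of reads)
# and costs roughly 2 FLOPs per parameter.
BYTES_PER_TOKEN = 70e9 * 2
FLOPS_PER_TOKEN = 2 * 70e9

for name, (tflops, tbps) in CHIPS.items():
    t_compute = FLOPS_PER_TOKEN / (tflops * 1e12)  # s/token if compute-bound
    t_memory = BYTES_PER_TOKEN / (tbps * 1e12)     # s/token if bandwidth-bound
    bound = "memory" if t_memory > t_compute else "compute"
    tok_s = 1.0 / max(t_compute, t_memory)
    print(f"{name:12s} {tok_s:6.1f} tok/s ({bound}-bound)")
```

On these assumptions, single-stream decode is memory-bound on all three chips, and the 910C lands within a few percent of the H100 (3.2 vs 3.35 TB/s). The ~60% figure in the linked article presumably reflects batched workloads and software maturity, where dense compute and the toolchain matter more.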

Ascend 920 (claimed/projected) vs NVIDIA H100

Feature            | Huawei Ascend 920 (claimed/projected)                              | NVIDIA H100 (SXM/PCIe)
Architecture       | Huawei Da Vinci architecture (chiplet-based)                       | NVIDIA Hopper
Process node       | SMIC 6nm (projected)                                               | TSMC 4nm (custom)
FP16/BF16 compute  | 900 TFLOPS (BF16, per card)                                        | 1,513 TFLOPS (BF16, without sparsity)
FP8 compute        | Not widely published for the 920; its predecessor (910C) is lower | 3,026 TFLOPS (FP8, without sparsity)
Memory bandwidth   | 4.0 TB/s (HBM3)                                                    | 3.35 - 3.9 TB/s (HBM3)
GPU memory (VRAM)  | Likely high (the predecessor, 910C, has 128 GB HBM3)              | 80 GB (HBM3)
Software ecosystem | CANN (requires porting, less mature)                               | CUDA (industry standard, highly mature)
Primary market     | China (strong domestic focus)                                      | Global
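
The ecosystem rows are where switching costs actually show up in practice. Assuming Huawei's torch_npu plugin (the Ascend adapter that registers an "npu" device type with PyTorch) is installed, much CUDA-targeted PyTorch code ports by changing only the device selection; the pick_device helper below is a hypothetical sketch of that fallback pattern, not an official API.

```python
import torch

def pick_device() -> torch.device:
    """Prefer an Ascend NPU, then CUDA, then CPU.

    torch_npu is Huawei's PyTorch adapter: importing it registers the
    'npu' device type with torch. The import is wrapped in try/except
    because the plugin exists only on machines with a CANN install.
    """
    try:
        import torch_npu  # noqa: F401  # side effect: registers 'npu' backend
        if torch.npu.is_available():
            return torch.device("npu:0")
    except ImportError:
        pass
    if torch.cuda.is_available():
        return torch.device("cuda:0")
    return torch.device("cpu")

device = pick_device()
x = torch.randn(1024, 1024, device=device)
y = x @ x  # the same model code runs unchanged on NPU, CUDA, or CPU
print(device, y.shape)
```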

All replies:

The main issue is that the ecosystem depends too heavily on CUDA; the switching cost is high -霸天虎- (186 bytes) 11/03/2025 23:18:01

The CCP should offer free domestic versions at universities to get students used to China's AI ecosystem; young people adapt quickly -硬碼工- (0 bytes) 11/04/2025 00:30:34

It's not that easy -霸天虎- (253 bytes) 11/04/2025 01:17:38
