Performance Benchmarks - 深超智算

Performance Benchmarks

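Measured data, no inflated specs. 深超智算 ran rigorous stress tests on mainstream large-model training and inference tasks under a standard AI workload environment, so that you can set realistic expectations for compute performance.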

SCZS Galactic (H100 x16) 98.2 TFLOPS

Industry Standard Node 62.4 TFLOPS

Test workload: Llama-3 70B FP8 Pre-training

SCZS Quantum (vLLM Opt) 12ms

General Cloud Instance 45ms

Test workload: DeepSeek-V2 Inference (Triton)

SCZS Immersion Liquid 1.42x

Standard Air Cooling 0.88x

Test workload: 7x24h Full Load Stability Test
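
As a reading aid, the throughput and latency cards above can be reduced to relative ratios. The sketch below is a minimal illustration, not part of the published benchmark: the function names `ratio_higher_is_better` and `ratio_lower_is_better` are hypothetical, and the inputs are simply the figures quoted on the cards.

```python
# Illustrative only: helper names are hypothetical; inputs are the figures
# quoted on the comparison cards above.

def ratio_higher_is_better(measured: float, baseline: float) -> float:
    """Relative gain for metrics where a larger value is better (e.g. TFLOPS)."""
    return measured / baseline


def ratio_lower_is_better(measured: float, baseline: float) -> float:
    """Relative gain for metrics where a smaller value is better (e.g. latency)."""
    return baseline / measured


# Llama-3 70B FP8 pre-training: 98.2 TFLOPS vs 62.4 TFLOPS
training_gain = ratio_higher_is_better(98.2, 62.4)

# DeepSeek-V2 inference: 12 ms vs 45 ms
latency_gain = ratio_lower_is_better(12.0, 45.0)

print(f"Training throughput: {training_gain:.2f}x the baseline node")  # about 1.57x
print(f"Inference latency:   {latency_gain:.2f}x lower than baseline")  # 3.75x
```

Keeping the two directions in separate helpers avoids the common mistake of dividing a latency pair the wrong way round and reporting a ratio below 1 as an improvement.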

Test dimension	S1 (Quantum)	S2 (Nebula)	S4 (Galactic)
FP8 Tensor peak compute	3.2 PFLOPS	6.4 PFLOPS	12.8 PFLOPS
Per-GPU HBM bandwidth	3.35 TB/s	3.35 TB/s	3.35 TB/s
Cluster communication latency (NVLink)	Low	Ultra-Low	Zero-Loss
Llama-3 70B inference speed	120 tokens/s	280 tokens/s	540 tokens/s
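
One way to read the table is to compare how peak compute and inference speed scale from tier to tier. The sketch below is a rough illustration under stated assumptions: the `tiers` dictionary and the `scaling_vs_base` function are hypothetical names, the values are copied from the table, and peak FP8 compute is a theoretical ceiling rather than a measured rate, so the ratios are only indicative.

```python
# Illustrative only: names are hypothetical; values are copied from the table above.
tiers = {
    "S1 (Quantum)": {"peak_pflops": 3.2, "tokens_per_s": 120},
    "S2 (Nebula)": {"peak_pflops": 6.4, "tokens_per_s": 280},
    "S4 (Galactic)": {"peak_pflops": 12.8, "tokens_per_s": 540},
}


def scaling_vs_base(data: dict, base_name: str) -> dict:
    """Return (compute scale, throughput scale) of each tier relative to the base tier."""
    base = data[base_name]
    return {
        name: (
            row["peak_pflops"] / base["peak_pflops"],
            row["tokens_per_s"] / base["tokens_per_s"],
        )
        for name, row in data.items()
    }


scaling = scaling_vs_base(tiers, "S1 (Quantum)")
for name, (compute_scale, throughput_scale) in scaling.items():
    print(f"{name}: compute {compute_scale:.2f}x, inference {throughput_scale:.2f}x")
```

With the table's figures, S2 comes out at 2x the compute and about 2.33x the inference speed of S1, and S4 at 4x the compute and 4.5x the inference speed.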

Need a performance estimate for your specific model?