Domestic 2nm AI GPU Unveiled, Taking First Step to Break Global Monopoly

Release date: 2026-04-14

On April 13, Shanghai Dishan Technology disclosed its latest progress on a 2nm high-performance AI GPU, whose design the company describes as world-class. The chip has entered the prototype verification phase and is expected to reach tape-out and mass production within 1–2 years.

This 2nm AI GPU adopts a hybrid FinFET/GAA process and Chiplet heterogeneous integration architecture, equipped with the self-developed DS-Core. It integrates 170 billion transistors on an 800mm² die, paired with 2.5D CoWoS-L advanced packaging for high-density interconnection and optimized thermal performance.

In terms of compute, the chip delivers 50 TFLOPS at FP32, 100 TFLOPS at FP16, and 400 TFLOPS at FP4, covering the full range of large-model training and inference workloads. Energy efficiency is 40% better than the previous generation, with typical power consumption held under 350W, or about 142 GFLOPS per watt at FP32.
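The efficiency figure can be checked against the other numbers in the paragraph. The short Python sketch below assumes the 142 GFLOPS per watt value is derived from the FP32 rating at the 350W power ceiling; the article does not say which precision it refers to, and the function and variable names are illustrative only.

```python
# Figures quoted in the article; 350 W is the stated upper bound on typical power.
PEAK_TFLOPS = {"FP32": 50.0, "FP16": 100.0, "FP4": 400.0}
TYPICAL_POWER_W = 350.0


def gflops_per_watt(tflops: float, watts: float) -> float:
    """Convert a TFLOPS rating and a power draw into GFLOPS per watt."""
    return tflops * 1000.0 / watts


# Efficiency at each precision, assuming the same 350 W power draw for all of them.
efficiency = {p: gflops_per_watt(t, TYPICAL_POWER_W) for p, t in PEAK_TFLOPS.items()}

for precision, value in efficiency.items():
    print(f"{precision}: {value:.1f} GFLOPS/W")
```

Under that assumption only the FP32 rating reproduces a value near 142 GFLOPS per watt; the FP16 and FP4 ratings would give proportionally higher figures at the same power.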

The R&D team has overcome three key bottlenecks: HBM4 packaging interconnection, ultra-low-latency inter-chip communication, and microchannel thermal management. The chip supports 48GB of HBM4 memory with pin speeds over 11Gb/s and bandwidth of up to 3.2TB/s, roughly 2.5 times that of HBM3E. Inter-chip communication latency is below 0.25ns/mm, and microchannel cooling reduces thermal runaway risk by 68% while keeping operating temperatures below 85℃.
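The memory figures above can be related through the usual peak-bandwidth formula: pin rate times interface width, divided by 8 to convert bits to bytes. The Python sketch below is a rough consistency check, not the vendor's calculation. It assumes a 2048-bit interface per HBM4 stack, as in the JEDEC HBM4 standard, and treats the quoted 3.2TB/s as a single-stack figure; the article states neither the interface width nor the number of stacks.

```python
def stack_bandwidth_tbps(pin_rate_gbps: float, interface_bits: int) -> float:
    """Peak bandwidth in TB/s: pin rate (Gb/s) * width (bits) / 8 bits per byte / 1000."""
    return pin_rate_gbps * interface_bits / 8 / 1000


HBM4_INTERFACE_BITS = 2048     # assumption: JEDEC HBM4 per-stack width, not from the article
QUOTED_PIN_RATE_GBPS = 11.0    # "over 11Gb/s" in the article, used here as a lower bound
QUOTED_BANDWIDTH_TBPS = 3.2    # "up to 3.2TB/s" in the article
QUOTED_RATIO_VS_HBM3E = 2.5    # "around 2.5 times" in the article

# Bandwidth one assumed 2048-bit stack would reach at exactly 11 Gb/s per pin.
at_quoted_rate = stack_bandwidth_tbps(QUOTED_PIN_RATE_GBPS, HBM4_INTERFACE_BITS)

# Pin rate one assumed 2048-bit stack would need to reach the quoted 3.2 TB/s.
required_rate = QUOTED_BANDWIDTH_TBPS * 1000 * 8 / HBM4_INTERFACE_BITS

# HBM3E baseline implied by the quoted 2.5x ratio.
implied_hbm3e = QUOTED_BANDWIDTH_TBPS / QUOTED_RATIO_VS_HBM3E

print(f"Bandwidth at 11 Gb/s per pin: {at_quoted_rate:.3f} TB/s")
print(f"Pin rate needed for 3.2 TB/s: {required_rate:.1f} Gb/s")
print(f"Implied HBM3E baseline: {implied_hbm3e:.2f} TB/s")
```

Under these assumptions, 11Gb/s per pin gives about 2.8TB/s, so the 3.2TB/s figure would correspond to a pin rate of about 12.5Gb/s, which is consistent with the "over 11Gb/s" wording. The implied HBM3E baseline of 1.28TB/s matches a 1024-bit stack running at 10Gb/s per pin.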

The GPU supports NVLink 6-compatible protocols with 1.6TB/s of single-link bandwidth for multi-chip scaling. It is also compatible with the CUDA ecosystem, which lowers migration costs for customers, and the company has reached preliminary cooperation understandings with leading domestic cloud service providers and autonomous driving companies.

Current development focuses on system-level verification, timing closure optimization, yield simulation, and software ecosystem adaptation. An AI inference acceleration kit prototype is planned for launch by the end of 2026, along with a CUDA-compatible compiler and AI framework adaptation layers.

Compared with global competitors, the chip’s FP32 performance is on par with the NVIDIA H100/H200, a first step toward breaking the hold of international giants on high-end AI chips. However, gaps remain in mass production schedule, ecosystem maturity, and market validation.

ICgoodFind: Dishan’s 2nm AI GPU demonstrates world-leading design capabilities backed by HBM4 and proprietary architecture, representing a major leap for China’s high-end computing chip industry.
