NVIDIA confirms Ada 102/103/104 GPU specs, AD104 has more transistors than GA102

NVIDIA Ada GPUs have a significantly higher ROP count

NVIDIA clarifies the specifications for the RTX 40 series.

The company released full information on the die sizes and transistor counts of the AD102, AD103 and AD104 GPUs. All three are scheduled to launch in the coming weeks. NVIDIA had already provided key numbers for the AD102 GPU, the flagship processor for the RTX 4090 graphics card, but details on AD103 and AD104 were still missing. Ryan Smith of AnandTech reports the exact numbers:

  • AD102: 608 mm² die, 76.3 B transistors
  • AD103: 378.6 mm² die, 45.9 B transistors
  • AD104: 294.5 mm² die, 35.8 B transistors

That puts the transistor density of all three GPUs above 121 million per square millimeter (it is nearly identical for AD103 and AD104). In addition, at 35.8B transistors, AD104 has 7.5B more transistors than GA102, the Ampere flagship GPU (28.3B). For comparison, the GA102 die is more than twice the size of the AD104 die.
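The density and transistor-gap figures above follow from simple arithmetic. Here is a minimal sketch using only the numbers quoted in this article; the names `GPUS` and `density_m_per_mm2` are illustrative. Because the inputs are rounded, the recomputed decimals can differ slightly from the values published in the table.

```python
# Die area (mm^2) and transistor count (billions), as quoted in the article.
GPUS = {
    "AD102": (608.0, 76.3),
    "AD103": (378.6, 45.9),
    "AD104": (294.5, 35.8),
}
GA102_TRANSISTORS_B = 28.3  # Ampere flagship, figure quoted in the article


def density_m_per_mm2(area_mm2: float, transistors_b: float) -> float:
    """Return transistor density in millions of transistors per mm^2."""
    return transistors_b * 1000.0 / area_mm2


# Density for each Ada GPU, keyed by GPU name.
densities = {name: density_m_per_mm2(*spec) for name, spec in GPUS.items()}

# How many more transistors (in billions) AD104 has than GA102.
ad104_surplus_b = GPUS["AD104"][1] - GA102_TRANSISTORS_B

if __name__ == "__main__":
    for name, value in densities.items():
        print(f"{name}: {value:.1f} M/mm^2")
    print(f"AD104 vs GA102: +{ad104_surplus_b:.1f}B transistors")
```

All three densities come out above 121 M/mm², and the AD104 surplus over GA102 is 7.5B transistors, matching the claims in the text.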

VideoCardz.com | AD102 | AD103 | AD104
Architecture | Ada Lovelace | Ada Lovelace | Ada Lovelace
Process node | TSMC 4N (5nm) | TSMC 4N (5nm) | TSMC 4N (5nm)
Transistors | 76.3B | 45.9B | 35.8B
Die size | 608 mm² | 378.6 mm² | 294.5 mm²
Transistor density | 125.5M/mm² | 121.1M/mm² | 121.1M/mm²
Streaming multiprocessors | 144 | 80 | 60
CUDA cores | 18432 | 10240 | 7680
Tensor cores | 576 | 320 | 240
RT cores | 144 | 80 | 60
ROPs | 192 | 112 | 80
L2 cache | 96MB | 64MB | 48MB
Graphics card | RTX 4090 | RTX 4080 16GB | RTX 4080 12GB

NVIDIA Ada GPUs have a much higher number of Render Output Units (ROPs) than their predecessors, going as high as 192 ROPs for AD102. The AD103 GPU has the same number of ROPs as GA102 (112), while AD104 has 80. A higher ROP count should improve rasterization performance.
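As a rough cross-check of the unit counts, the sketch below derives per-SM ratios and the ROP count relative to GA102. It uses only the numbers from the table and the paragraph above; `SPECS`, `per_sm` and `rop_ratio_vs_ga102` are illustrative names, not anything published by NVIDIA.

```python
# Per-GPU unit counts copied from the spec table in the article.
SPECS = {
    "AD102": {"sm": 144, "cuda": 18432, "tensor": 576, "rt": 144, "rops": 192},
    "AD103": {"sm": 80, "cuda": 10240, "tensor": 320, "rt": 80, "rops": 112},
    "AD104": {"sm": 60, "cuda": 7680, "tensor": 240, "rt": 60, "rops": 80},
}
GA102_ROPS = 112  # quoted in the article as equal to AD103's ROP count


def per_sm(spec: dict, unit: str) -> float:
    """Return how many units of the given type exist per streaming multiprocessor."""
    return spec[unit] / spec["sm"]


def rop_ratio_vs_ga102(spec: dict) -> float:
    """Return the GPU's ROP count as a multiple of GA102's ROP count."""
    return spec["rops"] / GA102_ROPS


if __name__ == "__main__":
    for name, spec in SPECS.items():
        print(
            f"{name}: {per_sm(spec, 'cuda'):.0f} CUDA/SM, "
            f"{per_sm(spec, 'tensor'):.0f} tensor/SM, "
            f"{per_sm(spec, 'rt'):.0f} RT/SM, "
            f"{rop_ratio_vs_ga102(spec):.2f}x GA102 ROPs"
        )
```

In the table's figures, the CUDA, tensor and RT core counts scale with the SM count at the same per-SM ratio on all three GPUs, while the ROP count relative to GA102 ranges from below 1x on AD104 to roughly 1.7x on AD102.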

NVIDIA has made some architectural changes, such as removing NVLink, which the company explained was done to make room for other logic blocks. At the same time, the L2 cache has increased significantly. NVIDIA has now confirmed the exact size for each GPU: 96MB for AD102, 64MB for AD103, and 48MB for AD104. Both RTX 4080 models are confirmed to have the fully unlocked L2 cache of their respective GPUs, so the 4080 16GB has 64MB while the 4080 12GB has 48MB.

Aside from that, HKEPC reports that NVIDIA has also clarified what TSMC 4N really means, which is not to be confused with N4. The process is a die shrink of the TSMC N5 process, but it is still a 5nm-class node. The only problem with this “clarification” is that NVIDIA itself provides incorrect information about a 4nm process on a slide from this week’s Editors Day.


Source: Ryan Smith (AnandTech), HKEPC

