Linley Newsletter

ThunderX3 Brings the Boom

August 25, 2020

Author: Aakash Jani

ThunderX3 is Marvell’s newest high-end server CPU, delivering x86-class performance in a custom Arm core. It greatly improves on its predecessor, ThunderX2, offering 30% more single-thread performance at a similar clock speed. The older CPU was based on Broadcom’s Vulcan design; ThunderX3 shows how the microarchitecture has evolved since Cavium acquired it. Marvell later purchased Cavium. The company introduced its custom microarchitecture at the recent Hot Chips conference.

Marvell applied many incremental changes to boost performance. Larger caches and buffers yield greater performance per thread by handling the increased throughput. By widening the back-end execution engine through additional ports, the company reduced congestion for branch execution. The design includes a handful of algorithm changes, but improvements in micro-op expansion and branch prediction are the most important. ThunderX3 reduces the latency of floating-point calculations through further optimizations to its floating-point units.

The new CPU comes in two processor models: a single-die version with 60 cores and a dual-die version with 96 cores. The single-die variant contains 90MB of distributed L3 cache and has a maximum frequency of about 3.0GHz. It features eight-channel DDR4-3200 memory and PCIe Gen4 interconnect technology. The 60 cores connect through a switched ring that provides 1.5MB of L3 cache per core. This version should double the multithread performance of the 32-core ThunderX2. The company is testing engineering samples of the single-die model; we expect production to start in 1Q21, with the dual-die model following by the end of next year.
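
The single-die figures above can be cross-checked with simple arithmetic. The Python sketch below multiplies the per-core L3 slice by the core count and estimates the theoretical peak memory bandwidth of eight DDR4-3200 channels. The 8-byte (64-bit) channel width is the standard DDR4 data-bus width rather than a number from Marvell's disclosure, and the function and constant names are purely illustrative. The bandwidth result is a theoretical ceiling, not a measured or vendor-quoted value.

```python
# Back-of-the-envelope check of the single-die ThunderX3 figures.
# Assumption: each DDR4 channel carries 8 bytes (64 bits) per transfer,
# which is the standard DDR4 data width, not a Marvell-supplied figure.

CORES = 60                 # single-die core count
L3_PER_CORE_MB = 1.5       # L3 slice attached to each core
DDR4_CHANNELS = 8          # eight-channel memory interface
TRANSFER_RATE_MT_S = 3200  # DDR4-3200: 3,200 mega-transfers per second
BYTES_PER_TRANSFER = 8     # assumed 64-bit channel width


def total_l3_mb(cores: int, l3_per_core_mb: float) -> float:
    """Total distributed L3 capacity in megabytes."""
    return cores * l3_per_core_mb


def peak_memory_bandwidth_gb_s(channels: int, mt_s: int, bytes_per_transfer: int) -> float:
    """Theoretical peak DRAM bandwidth in GB/s (1 GB = 1,000 MB)."""
    return channels * mt_s * bytes_per_transfer / 1000


if __name__ == "__main__":
    l3 = total_l3_mb(CORES, L3_PER_CORE_MB)
    bw = peak_memory_bandwidth_gb_s(DDR4_CHANNELS, TRANSFER_RATE_MT_S, BYTES_PER_TRANSFER)
    print(f"Total L3: {l3:.0f} MB")
    print(f"Peak memory bandwidth: {bw:.1f} GB/s")
    print(f"Peak bandwidth per core: {bw / CORES:.2f} GB/s")
```

The L3 total matches the 90MB quoted for the single-die model, and dividing the peak bandwidth by the core count gives a rough sense of how much memory throughput each core can claim when all 60 are busy.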

