
Linley Newsletter

Cerebras Dives Into WSE Architecture

September 20, 2022

Author: Linley Gwennap

Secretive Cerebras spilled some more beans at last month’s Hot Chips conference, disclosing new details about its tiny compute core and how it can process even the biggest AI models. The startup is known for its wafer-scale processor, now in its second generation. The WSE2 packs 850,000 of these cores into a slab of silicon the size of a baking pan. The accelerator contains 40GB of SRAM and can deliver up to 7,500 trillion FP16 operations per second (7,500 Tflop/s). Maximum power for the CS-2 system, which contains the WSE2, is 23,000 watts.

For years, Cerebras withheld the flops rate of its design because, despite its sheer magnitude, the WSE falls short of the leading GPUs when measured in flop/s per watt or per unit of die area. But large neural networks are typically limited by memory size and bandwidth, making the compute rate less relevant. In the WSE2, each FP16 multiply-accumulate (MAC) unit can access 12KB of stored operands at full speed; for Nvidia’s new Hopper GPU, the corresponding figure is only 128 bytes.
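The ratios behind that comparison can be worked out from the figures quoted above. The following is a rough sketch, not a Cerebras disclosure: it assumes decimal units (1GB = 10^9 bytes, 1KB = 10^3 bytes), divides peak throughput by the 23,000W rating of the whole CS-2 system rather than of the chip alone, and infers the number of MAC units per core from the 12KB figure. All names in the code are illustrative.

```python
# Figures quoted in the article for the WSE2 and CS-2.
CORES = 850_000
SRAM_BYTES = 40e9             # 40GB, assuming decimal gigabytes
PEAK_FLOPS = 7_500e12         # 7,500 Tflop/s at FP16
SYSTEM_POWER_W = 23_000       # maximum power of the whole CS-2 system
OPERAND_BYTES_PER_MAC = 12e3  # 12KB per MAC unit, assuming decimal kilobytes


def derived_metrics() -> dict:
    """Return back-of-the-envelope ratios computed from the quoted figures."""
    sram_per_core = SRAM_BYTES / CORES
    return {
        # System-level efficiency; chip-only efficiency would be higher.
        "gflops_per_watt": PEAK_FLOPS / SYSTEM_POWER_W / 1e9,
        "sram_kb_per_core": sram_per_core / 1e3,
        # Inferred, not stated in the article: per-core SRAM divided by
        # the operand storage available to each MAC unit.
        "implied_macs_per_core": sram_per_core / OPERAND_BYTES_PER_MAC,
        "gflops_per_core": PEAK_FLOPS / CORES / 1e9,
    }


if __name__ == "__main__":
    for name, value in derived_metrics().items():
        print(f"{name}: {value:.2f}")
```

Under these assumptions, the system delivers about 326 Gflop/s per watt, and each core holds roughly 47KB of SRAM, which is consistent with about four MAC units per core at 12KB each.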

Nearly all AI models have fewer than 20 billion weights and can fit entirely in the WSE2’s memory. For the biggest ones, Cerebras offers a separate MemoryX box that can store trillions of weights in DRAM and stream them to one or more CS-2 systems. This approach enables the company to take advantage of model sparsity by removing zero weights instead of feeding them to the compute cores. Thus, the cores can focus on useful operations.
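The 20-billion-weight threshold follows from simple arithmetic, sketched below under stated assumptions: each weight is stored as a 2-byte FP16 value, 1GB is 10^9 bytes, and the SRAM holds nothing but weights (in practice, activations and other state also need space). The `stream_nonzero` function is a hypothetical illustration of skipping zero weights; it does not represent the actual MemoryX streaming format, which the article does not describe.

```python
from typing import Iterable, Iterator, Tuple

BYTES_PER_FP16 = 2       # assumption: weights are stored in FP16
WSE2_SRAM_BYTES = 40e9   # 40GB, assuming decimal gigabytes


def weight_footprint_bytes(num_weights: int,
                           bytes_per_weight: int = BYTES_PER_FP16) -> int:
    """Storage needed for the weights alone, ignoring activations."""
    return num_weights * bytes_per_weight


def fits_on_wafer(num_weights: int) -> bool:
    """True if the weights alone fit within the WSE2's 40GB of SRAM."""
    return weight_footprint_bytes(num_weights) <= WSE2_SRAM_BYTES


def stream_nonzero(weights: Iterable[float]) -> Iterator[Tuple[int, float]]:
    """Yield (index, value) pairs for nonzero weights only.

    The index lets the receiver know which weight each value belongs to,
    so zero weights never need to be sent or multiplied.
    """
    for index, value in enumerate(weights):
        if value != 0.0:
            yield index, value


if __name__ == "__main__":
    print(fits_on_wafer(20_000_000_000))     # 20 billion weights
    print(fits_on_wafer(1_000_000_000_000))  # 1 trillion weights
    print(list(stream_nonzero([0.0, 1.5, 0.0, -2.0])))
```

Twenty billion FP16 weights occupy exactly 40GB, which matches the SRAM capacity; a trillion-weight model needs about 2TB and must be streamed from external memory.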

Although Cerebras promotes its systems for training these enormous models, they’ve seen deployment mainly in high-performance computing (HPC) and niche AI applications. Customers include the US national laboratories Argonne and Lawrence Livermore as well as Germany’s Leibniz Supercomputing Centre. Major corporations such as AstraZeneca, Bayer, Genentech, and GlaxoSmithKline (GSK) use the systems for biological research, whereas other customers perform physics simulations and similar tasks.

Subscribers can view the full article in the Microprocessor Report.
