Linley Newsletter

Google TPU Boosts Machine Learning

May 9, 2017

Author: David Kanter

A leader in the deployment of machine learning, Google uses neural networks and other inference techniques that apply weights to input data to classify incoming email as spam, recognize speech, and perform other tasks. The company has tremendous compute resources across its data centers and a penchant for designing its own hardware and software. It is now the first company to have designed and deployed a custom processor that accelerates machine learning. Google describes this processor in a paper that it will present at the International Symposium on Computer Architecture next month.

The “tensor processing unit” (TPU) is a 28nm accelerator that offloads neural-network inferencing based on the open-source TensorFlow library. It offers roughly 10x better performance than comparable 28nm GPUs and 22nm CPUs. The TPU operates at 700MHz; it could have been faster, but Google focused on a short schedule rather than high operating speed.

When users interact with neural-network-based services (such as Google Now), they expect a prompt response. Thus, Google places particular emphasis on latency and quality-of-service guarantees when evaluating neural-network inferencing. Rather than using complex vector units, caches, and DRAM, the TPU reduces latency using a simple design based on a massive 256x256 array of multiply-accumulate (MAC) units as well as explicitly addressed SRAM. It relies on a host processor for higher-level functions.
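
To give a sense of scale for that array, the following back-of-the-envelope sketch multiplies the array dimensions by the 700MHz clock. It assumes that every MAC unit completes one multiply and one add per cycle, which is an idealization not stated in this summary; the function and constant names are illustrative only.

```python
# Rough peak-throughput estimate for a 256x256 MAC array at 700MHz.
ARRAY_ROWS = 256
ARRAY_COLS = 256
CLOCK_HZ = 700_000_000

# Assumption: each MAC unit finishes one multiply and one add every cycle.
OPS_PER_MAC = 2


def peak_ops_per_second(rows, cols, clock_hz, ops_per_mac=OPS_PER_MAC):
    """Return the ideal operations per second for a rows x cols MAC array."""
    return rows * cols * clock_hz * ops_per_mac


mac_units = ARRAY_ROWS * ARRAY_COLS
peak = peak_ops_per_second(ARRAY_ROWS, ARRAY_COLS, CLOCK_HZ)

print(f"MAC units: {mac_units}")
print(f"Ideal peak: {peak / 1e12:.1f} tera-ops/s")
```

Under these assumptions the array holds 65,536 MAC units and peaks at roughly 92 tera-operations per second. Sustained throughput on real workloads would be lower.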

The TPU is mounted on an add-in card that connects to the host processor using a PCIe 3.0 link with 16 lanes. The PCIe connector also provides power, limiting the board to about 75W. The TPU comes with 8GB of ECC-protected DDR3 memory that stores the read-only weights of multiple inferencing models; inferencing can therefore operate primarily on the TPU with minimal CPU overhead.
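
As a rough illustration of why keeping weights in on-card memory matters, the sketch below estimates the theoretical bandwidth of a 16-lane PCIe 3.0 link and the time needed to move 8GB across it. It uses the standard PCIe 3.0 figures (8GT/s per lane with 128b/130b encoding) and ignores packet and protocol overhead. It also assumes that "8GB" means 8 x 2^30 bytes and that weights are loaded from the host over this link; none of these details come from the article itself.

```python
# Theoretical PCIe 3.0 x16 bandwidth, ignoring packet/protocol overhead.
TRANSFERS_PER_SECOND_PER_LANE = 8e9   # PCIe 3.0 signalling rate: 8 GT/s
ENCODING_EFFICIENCY = 128 / 130       # 128b/130b line encoding
LANES = 16

# Assumption: "8GB" is interpreted as 8 * 2**30 bytes.
WEIGHT_MEMORY_BYTES = 8 * 2**30


def pcie3_bytes_per_second(lanes):
    """Return the ideal one-direction payload rate in bytes per second."""
    bits_per_second = TRANSFERS_PER_SECOND_PER_LANE * ENCODING_EFFICIENCY * lanes
    return bits_per_second / 8


link_bw = pcie3_bytes_per_second(LANES)
fill_seconds = WEIGHT_MEMORY_BYTES / link_bw

print(f"Ideal link bandwidth: {link_bw / 1e9:.2f} GB/s per direction")
print(f"Ideal time to transfer 8GB: {fill_seconds:.2f} s")
```

Even at this ideal rate, moving the full memory contents across the link takes on the order of half a second, which is consistent with storing read-only weights on the card rather than streaming them from the host for each inference.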

Subscribers can view the full article in the Microprocessor Report.
