
A Guide to Multicore Processors

Third Edition

Published May 2016

Authors: Jag Bolaria and Tom R. Halfhill

Single License: $4,495 (single copy, one user)
Corporate License: $5,995

Get the Facts Quickly

"A Guide to Multicore Processors" provides an in-depth look at 32- and 64-bit high-speed embedded processors with four or more CPU cores. This completely revised report from The Linley Group contains 190+ pages of information on high-end processors from AMD, AppliedMicro, Baikal Electronics, Broadcom, Cavium, Intel, Mellanox (Tilera/EZchip), and NXP.

The report focuses on general-purpose RISC and x86 processors that have four or more CPU cores running at 1.0GHz or more, excluding specialized architectures (e.g., DSPs and NPUs). It covers processors for embedded applications, focusing on networking, communications, storage, and security; it excludes multicore products designed for servers or for mobile devices. (We cover those products, as well as embedded processors with fewer than four CPU cores, in other reports.)

“A Guide to Multicore Processors” delivers detailed coverage of all applicable products in AMD’s Opteron A1100 family; AppliedMicro’s Helix family; Broadcom’s XLP II family; Cavium’s Octeon III and ThunderX families; Intel’s embedded Xeon and Xeon D lines; Mellanox’s Tile-Gx family; and NXP’s QorIQ T series, LS1 series, and LS2 series.

This handy guide, packed with valuable information, brings you up to date on the newest developments in this important market and gives you the analysis you need to help choose a supplier or partner. The report also provides market-share and market-size data for the embedded and multicore markets.

“A Guide to Multicore Processors” begins with tutorials on the key technologies implemented by these products, background on the embedded market, and a discussion of the newest technology and market trends. Following these introductory chapters, the report delivers thorough coverage of all announced products in this area. For each major vendor, the report examines the performance, features, and architecture of each product, highlighting strengths and weaknesses in a consistent, easy-to-compare fashion. The report concludes with our own comparisons of these products and conclusions about which will fare best.

What's New in This Edition

Since publishing the previous edition of this report in 2014, we have updated the coverage with many new announcements, including:

  • AMD’s new Opteron A1100 family
  • AppliedMicro’s Helix processors
  • More detailed coverage of Cavium’s ARMv8-compatible ThunderX processors
  • NXP’s newest ARM-based LS1- and LS2-series processors
  • Intel’s new Xeon and Xeon D processors
  • Final 2015 market size and vendor share
  • Embedded-processor forecasts to 2020

Multicore processors offer the best performance and flexibility for applications that are divisible into many small tasks, called threads. In embedded systems, the most common application for these products is networking, because each data packet can usually have its own thread. Packet processing is common in a wide range of networking and communications equipment, including routers, security appliances, storage subsystems, broadband infrastructure, and cellular base stations.
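The idea that each packet can be handled as an independent task can be sketched in a few lines of Python. This is only an illustration of the decomposition, not code from the report: `process_packet`, `process_all`, the additive checksum, and the worker count are all hypothetical. Python threads also do not run CPU-bound work in parallel in the standard interpreter, so the sketch shows the structure of the workload rather than the speedup a multicore processor would deliver.

```python
from concurrent.futures import ThreadPoolExecutor


def process_packet(packet: bytes) -> int:
    """Hypothetical per-packet work: a simple 16-bit additive checksum."""
    return sum(packet) & 0xFFFF


def process_all(packets, workers=4):
    """Hand each packet to a pool of workers as an independent task.

    Packets share no state, so they can be processed in any order;
    pool.map still returns the results in input order.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(process_packet, packets))


# Six made-up 8-byte packets standing in for network traffic.
packets = [bytes([i] * 8) for i in range(6)]
checksums = process_all(packets)
```

Because no packet depends on another, the result is the same whether one worker or many are used; only the elapsed time would differ on hardware with truly parallel cores.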

To ease programming, these multicore processors employ general-purpose instruction sets, such as x86, the Power Architecture (PowerPC), MIPS, and ARM. This characteristic distinguishes them from dedicated network processors (NPUs), which use custom instruction sets that are more difficult to program — and from packet-processing ASICs, which are not programmable at all. Most multicore embedded processors also include specialized hardware that accelerates packet-processing tasks. Thus, they are widely favored for complex networking applications that require programmability, customization, and high performance. In addition, these devices are useful for a broad range of embedded systems that require general-purpose programmability.

We estimate the total revenue from general-purpose embedded processors fell 1% in 2015 after two years of growth. This decline was largely due to China’s slowdown in wireless-base-station deployments and a trend toward using more ASICs. Growth took place in other segments, however, such as security, Internet gateways, automotive, industrial, and storage.

Intel still leads the embedded-processor market by revenue. Despite their relatively high power consumption and relatively poor feature integration, Intel’s products offer the industry’s best single-thread performance — a big advantage in control-plane processing. The company’s recent acquisition of Altera, the second-largest FPGA vendor, creates opportunities for future products that integrate embedded processors with programmable logic. In 2015, Intel also became the leading supplier of multicore processors for communications systems — a position held for many years by Freescale, which suffered from the wireless slowdown.

Swept up by a wave of industry consolidation in 2015, Freescale was acquired by NXP. The main motive was to augment NXP’s positions in automotive processors and microcontrollers — markets in which Freescale also excels. But despite the China slowdown, Freescale’s QorIQ embedded processors remain formidable competitors, and the company is rapidly introducing new ARM-based products to supplement the existing Power Architecture chips. Its broad line of high-performance embedded processors addresses many applications.

The third-largest embedded-processor supplier in 2015 was Broadcom, which was acquired by Avago. (The combined company operates as Broadcom Ltd.) This vendor gained share during the year, largely on the success of its ARM-based StrataGX family. The MIPS-compatible XLP family fared less well. Broadcom is pinning future hopes on its Vulcan processors, which will use a new 64-bit ARMv8-compatible CPU that will enable the company to pursue new markets. The first Vulcan chips have yet to appear, however, and the project may be affected by the layoffs, cutbacks, and reorganizations that are following the Avago merger.

Cavium, the fourth-largest embedded-processor supplier, enjoyed another year of healthy growth in 2015. The MIPS-compatible Octeon chips are the cash cow. Although their relatively simple MIPS64-compatible CPUs lag in single-thread performance, their small size enables Cavium to create large multicore designs, with up to 48 CPUs in the largest Octeon III model. Consequently, the company focuses on the data plane, where its many small CPUs and wealth of hardware accelerators are ideal. Cavium also began reaping some revenue in 2015 from its new ThunderX family, which uses custom-designed 64-bit CPUs that are ARMv8 compatible. The largest ThunderX chip also has 48 CPUs.

In addition to the NXP-Freescale and Avago-Broadcom megamergers, Mellanox acquired EZchip, which had recently acquired startup Tilera. These deals could strengthen Tilera’s position and its project to make the world’s largest ARM-based embedded processor, the 100-core Tile-Mx100. In the past, Mellanox has preferred to sell systems and board-level products instead of merchant silicon, so the combined company is mapping a new strategy for 2016 and beyond.

AMD entered the ARM-based embedded-processor market in 2015 with its Opteron A1100 family, and AppliedMicro followed its ARMv8-compatible Helix 1 embedded processors with the second-generation Helix 2. Both product lines have their strong points but face stiff competition from the leading vendors. Nevertheless, they are further signals of the market’s strong shift to ARM by every major vendor except Intel. This transition will continue into the next decade, because many embedded systems have long lifespans, and developers need time to port their software.

List of Figures
List of Tables
About the Authors
About the Publisher
Preface
Executive Summary
1 Processor Technology
Processor Basics
Central Processing Unit (CPU)
Caches
MMUs and TLBs
Bus Bandwidth
CPU Microarchitecture
RISC Versus CISC
Endianness
Scalar and Superscalar
Instruction Reordering
Pipelining and Penalties
Branch Prediction
Multicore Processors
Multithreading
Main Memory
DRAM Basics
DDR Versions
Memory Subsystems
I/O and Network Interfaces
Ethernet Interfaces
PCI and PCI Express
RapidIO
USB
SAS and SATA
2 Multicore Applications
Networking and Communications Equipment
Control Plane vs. Data Plane
Control-Plane Processing
Data-Plane Applications
Services Cards
Networked Storage and RAID Controllers
Security
Broadband Infrastructure
Cellular Base Stations
Common Form Factors
3 Standard Instruction Sets
Architecture Comparison
Technology
Market Positions
x86 Instruction Set
Background
Initial Instruction Set
Modern Extensions
ARM Instruction Set
Background
Initial Instruction Set
Later Extensions
ARMv8 Architecture
ARMv8-M Architecture
MIPS Instruction Set
Background
Initial Instruction Set
Later Extensions
PowerPC Instruction Set
Background
Instruction Set
4 Multicore Processors
What Is an Embedded Multicore Processor?
What Is Not an Embedded Multicore Processor
Common Characteristics
Standalone vs. Integrated Processors
Multicore Processors
Encryption Engines
RAID and Other Storage Engines
Packet-Processing Accelerators
Benchmarks
CPU Benchmarks
Security Performance
5 Technology and Market Trends
Technology Trends
Architecture
Integration Trends
CPU Complexity Tradeoffs
Memory Access
Completeness
Market Overview
Market Size by Vendor
Market Share by Application
Revenue Market Share by Instruction-Set Architecture
Market Forecast
6 AppliedMicro
Company Background
Key Features and Performance
Internal Architecture
Potenza CPU
System Design
Product Roadmap
Conclusions
7 Broadcom
Company Background
Key Features and Performance
XLP II Overview
XLP500 Series
XLP900 and XLP700 Series
Internal Architecture
System Design
Development Tools
Product Roadmap
Conclusions
8 Cavium
Company Background
Key Features and Performance
Octeon III Processors
Octeon III CN78xx and CN77xx Series
Octeon III CN73xx and CN72xx Series
ThunderX CN88xx Series
Internal Architecture
Octeon III CPU
Custom MIPS64 Extensions
Octeon III Caches
Octeon III Accelerators
ThunderX Architecture
System Design
Development Tools
Product Roadmap
Conclusions
9 Intel
Company Background
Product Overview
Key Features and Performance
Xeon E5
Xeon D
Internal Architecture
System Design
Xeon E5v3
Xeon D
Development Tools
Product Roadmap
Conclusions
10 Mellanox (EZchip)
Company Background
Key Features and Performance
Tile-Gx Family
Tile-Mx Family
Internal Architecture
Tile-Gx Family
System Design
Development Tools
Product Roadmap
Conclusions
11 NXP (Freescale)
Company Background
Key Features and Performance
QorIQ T4-Series Processors
QorIQ LS1-Series Processors
QorIQ LS2-Series Processors
Internal Architecture
Power e6500 CPU
ARM Cortex-A57 CPU
ARM Cortex-A53 CPU
ARM Cortex-A72 CPU
Security Engines
QorIQ Packet-Processing Acceleration (DPAA)
DPAA2 Packet Acceleration
System Design
System Interfaces
Application Examples
Development Tools
Product Roadmap
Conclusions
12 Other Vendors
AMD
Company Background
Key Features and Performance
Design Details
Conclusions
Baikal
Company Background
Key Features and Performance
Conclusions
13 Comparisons
Sub-30W Processors
30-50W Processors
50-100W Processors
Processors Consuming More Than 100W
14 Conclusions
Market and Technology Trends
Vendor Outlook
Intel
Cavium
NXP (Freescale)
Broadcom
Other Multicore-Processor Vendors
Closing Thoughts
Appendix: Further Reading
Index
Figure 1‑1. Basic processor design.
Figure 1‑2. Simple superscalar processor design.
Figure 1‑3. CPU pipelining examples.
Figure 1‑4. Generic multicore processor.
Figure 1‑5. Interleaved tasks on a multithreaded CPU.
Figure 1‑6. DRAM evolution.
Figure 2‑1. The control plane and the data plane.
Figure 4‑1. Standalone and integrated general-purpose processors.
Figure 4‑2. Typical curve of IPSec performance versus packet size.
Figure 5‑1. Worldwide revenue market share of embedded microprocessors, 2014 and 2015.
Figure 5‑2. Worldwide revenue market share of embedded processors for communications, 2014 and 2015.
Figure 5‑3. Worldwide revenue market share of embedded processors by instruction set, 2015.
Figure 5‑4. Forecast for embedded-processor revenue by application, 2015–2020.
Figure 5‑5. Forecast for embedded-processor revenue by communications sub-segment, 2015–2020.
Figure 6‑1. Block diagram of AppliedMicro Potenza CPU.
Figure 6‑2. Block diagram of AppliedMicro Helix 1 processor.
Figure 6‑3. Block diagram of a gateway based on AppliedMicro Helix 1.
Figure 7‑1. Broadcom XLP II family.
Figure 7‑2. Broadcom Interchip Coherency Interface (ICI 2.0).
Figure 7‑3. VMM execution mode in MIPS64 Release 5.
Figure 7‑4. Block diagram of Broadcom GC4400 CPU core.
Figure 7‑5. Block diagram of Broadcom XLP500-series processor.
Figure 7‑6. Line card based on Broadcom XLP980.
Figure 8‑1. Cavium Octeon III family.
Figure 8‑2. Block diagram of Cavium Octeon III CN7890.
Figure 8‑3. Block diagram of Cavium ThunderX CN8890.
Figure 8‑4. Block diagram of ParPro O3E-110 card using CN7890.
Figure 9‑1. Positioning for Intel embedded processors.
Figure 9‑2. Block diagram of Intel Haswell microarchitecture.
Figure 9‑3. Block diagram of Intel Haswell embedded Xeon E5-2680v3.
Figure 9‑4. Dual-socket system design based on Intel Xeon E5v3.
Figure 9‑5. Block diagram of Intel Xeon D.
Figure 10‑1. Block diagram of Mellanox Tile-Gx72.
Figure 11‑1. NXP QorIQ T- and LS-series communications processors.
Figure 11‑2. Microarchitecture of NXP Power e6500 CPU.
Figure 11‑3. Block diagram of NXP QorIQ LS1088A.
Figure 11‑4. Second-generation Data Path Acceleration Architecture.
Figure 12‑1. Block diagram of AMD Opteron A1170.
Table 2‑1. Some common single-board-computer standards.
Table 5‑1. Worldwide revenue of the top eight vendors of embedded microprocessors.
Table 5‑2. Worldwide revenue of the top six vendors of embedded processors for communications systems.
Table 5‑3. Forecast for embedded-processor revenue by application, 2015–2020.
Table 5‑4. Forecast for embedded-processor revenue by communications sub-segment, 2015–2020.
Table 6‑1. Key parameters for AppliedMicro Helix 1 processors.
Table 7‑1. Key parameters for Broadcom XLP500 series.
Table 7‑2. Key parameters for Broadcom’s XLP900 series.
Table 8‑1. Key parameters for Cavium Octeon III CN78xx processors.
Table 8‑2. Key parameters for Cavium Octeon III CN77xx processors.
Table 8‑3. Key parameters for Cavium Octeon III CN73xx and CN72xx.
Table 8‑4. Key parameters for Cavium ThunderX processors.
Table 9‑1. Intel code names and product numbers.
Table 9‑2. Intel embedded multicore processors.
Table 9‑3. Key parameters for Intel Xeon E5v3 embedded processors.
Table 9‑4. Key parameters for Intel Xeon D embedded processors.
Table 9‑5. Key parameters for Intel DH89xx Coleto Creek chips.
Table 10‑1. Key parameters for Mellanox Tile-Gx processors.
Table 11‑1. Key parameters for NXP QorIQ T4 processors.
Table 11‑2. Key parameters for QorIQ LS1 quad- and octa-core processors.
Table 11‑3. Key parameters for QorIQ LS2 processors with Cortex-A57.
Table 11‑4. Key parameters for QorIQ LS2 processors with Cortex-A72.
Table 12‑1. Key parameters for AMD Opteron A1100 processors.
Table 13‑1. Comparison of sub-30W multicore processors.
Table 13‑2. Comparison of 30–50W multicore processors.
Table 13‑3. Comparison of 50–100W multicore processors.
Table 13‑4. Comparison of multicore processors consuming more than 100W.
