Octeon Fusion-M Goes Macro
Offering 16 CPUs, Cavium Enables Large Wireless Base Stations
By Jag Bolaria
Cavium’s new Octeon Fusion-M family should accelerate the company’s success in macro base stations. Octeon processors are already shipping into 3G and 4G macro base stations as well as into gateways in the evolved packet core. These processors often perform control and transport processing. Fusion-M integrates CPUs and DSPs to handle all Layer 1–3 functions for more than 3,600 users. It extends Cavium’s reach from NICs (network transport) to full baseband processing in macro base stations. It also triples the number of users compared with Freescale’s QorIQ Qonverge B4860—the first macro base-station processor to integrate DSPs and CPUs.
Fusion-M is not Cavium’s first integrated base-station processor. The Octeon Fusion CNF7130 was among the initial picocell chips supporting LTE. Consequently, the company won business in Korea through KT, LG U+, and SK Telecom as well as several other operators. These deployments helped it become the leading supplier of 4G small-cell base-station processors. Whereas the CNF7130 topped out at 64 users, the new CNF75xx family will extend the Fusion architecture to 3,600 users, enabling customers to port software from picocell to macro base stations.
Future wireless infrastructure will use heterogeneous networks consisting of macros, small cells, relays, DASs (distributed antenna systems), cloud (or centralized) radio access networks (C-RANs), and Wi-Fi. Depending on the environment, operators will use different combinations of these technologies. In turn, OEMs will address the requirements by reusing software across platforms to reduce development cost. Cavium plans to apply a combination of Fusion-M, software stacks, and new ThunderX processors.
Cavium Extends Octeon Fusion Line
At Mobile World Congress, Cavium announced the Fusion-M CNF74xx for intelligent radio heads in C-RANs as well as the CNF75xx for macro base stations. The first-generation CNF7130 SoC (see MPR 10/10/11, “Cavium Adds DSP for Small Cells”) implements LTE Release 9, whereas the two newer product families implement LTE Release 11; the company says these products will be Release 12 ready. The CNF7130 is built on the Octeon II CN61xx and is thus manufactured in 65nm CMOS. In contrast, the CNF75xx and CNF74xx use the same 28nm technology as Octeon III processors. They increase performance by incorporating more CPU and DSP cores as well as by raising the clock rate.
Because the Fusion chips are based on the company’s Octeon family, they share much in common. All use Cavium’s newest MIPS64-compatible CPU cores, which Fusion supplements with DSP cores. The CNF75xx has up to 16 CPUs running at up to 2.0GHz and 18 DSPs at 800MHz. The CNF75xx supports macro base stations with LTE-Advanced capabilities. Offering 12 sectors at 20MHz, the CNF75xx delivers 1,800Mbps of peak aggregate bandwidth on the downlink and 900Mbps on the uplink. For 3G WCDMA, it simultaneously provides a peak of 168Mbps down and 46Mbps up, with 128 users.
The powerful DSP capacity enables up to 8x8 MIMO, a technology that extends the concept of simultaneously transmitting different data streams in the same frequency band. The CNF75xx supports 24 antennas, which can be configured for 3, 6, or 12 sectors with up to 8x8 MIMO. The sectors can combine with the multiple antennas to implement as many as 24 transmitters and 24 receivers. For example, customers can configure 12 sectors for 2x2 MIMO, 3 sectors for 8x8 MIMO, or any combination in between. This flexibility allows operators to offer multiple service levels at different prices.
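The sector and MIMO trade-off above can be sketched as a simple antenna budget. This is a minimal illustration, assuming each sector with NxN MIMO consumes N of the 24 antenna paths; the function names are hypothetical, and the real device may impose constraints that Cavium has not disclosed.

```python
TOTAL_ANTENNAS = 24           # antenna count stated in the article
ALLOWED_SECTORS = (3, 6, 12)  # sector counts stated in the article
MAX_MIMO_ORDER = 8            # the article cites up to 8x8 MIMO


def is_valid_config(sectors: int, mimo_order: int) -> bool:
    """Return True if `sectors` sectors of NxN MIMO fit the antenna budget.

    Assumption: an NxN sector uses N antenna paths (N transmit, N receive).
    """
    if sectors not in ALLOWED_SECTORS:
        return False
    if not 1 <= mimo_order <= MAX_MIMO_ORDER:
        return False
    return sectors * mimo_order <= TOTAL_ANTENNAS


def max_mimo_order(sectors: int) -> int:
    """Largest N such that `sectors` sectors of NxN MIMO fit in 24 antennas."""
    if sectors not in ALLOWED_SECTORS:
        raise ValueError(f"unsupported sector count: {sectors}")
    return min(MAX_MIMO_ORDER, TOTAL_ANTENNAS // sectors)


if __name__ == "__main__":
    for s in ALLOWED_SECTORS:
        print(f"{s} sectors -> up to {max_mimo_order(s)}x{max_mimo_order(s)} MIMO")
```

Under this assumption, the two endpoints named in the text (12 sectors at 2x2 and 3 sectors at 8x8) both use all 24 antennas, and 6 sectors at 4x4 falls in between.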
Fusion-M Accelerators Boost Performance
Fusion-M supplements the CPUs, DSPs, and PHYs with hardware accelerators for common tasks. Figure 1 shows the major blocks of the CNF75xx, including the CPUs, DSPs, authentication unit, power manager, I/O interfaces, and PHY plus MAC, including accelerators. The PHY and MAC blocks accelerate WCDMA (3G) and OFDM (4G) functions such as fast Fourier transforms (FFTs), WCDMA chip-rate processing, symbol encoding and decoding, turbo coding, and Viterbi coding.
Figure 1. Block diagram of Fusion-M CNF75xx base-station SoC. The device integrates 16 MIPS-compatible CPUs and 18 DSPs along with a terabyte fabric for connecting PHY and MAC accelerators.
The devices include two crossbars: The first interconnects the DSPs, shared data memory, I/Os, power management, and accelerators for the PHY and MAC functions. The CNF75xx has 8MB of shared memory. The second crossbar interconnects the CPUs, L2 cache, and external memory.
The CNF75xx and CNF74xx use MIPS-compatible CPU cores for data-plane packet processing, control-plane processing, and Layer 3 functions such as radio resource management. Fusion-M has Cavium’s newest cnMIPS64-III—the same 64-bit CPU found in 28nm Octeon III (see MPR 2/13/12, “Cavium Octeon III Sizzles at 100Gbps”). It supports the MIPS Release 5 architecture and is faster than previous Cavium implementations.
Leading DSP Count
For the Fusion designs, Cavium implemented complex 32-bit VLIW DSPs based on the ConnX Baseband Engine, a DSP core that uses the configurable Xtensa architecture from Cadence. Just as it customizes its MIPS CPUs, Cavium employs a custom ConnX design for its DSPs. The DSPs perform physical-layer functions such as FFT/IFFT and modulation, which can be QPSK, QAM-16, QAM-64, or QAM-256. Higher-order modulation schemes provide greater data rates but need more DSP performance.
The new CNF75xx and CNF74xx enhance two of the DSPs (cnMBP2 and cnSBP2, shown in Figure 1) relative to the previous-generation CNF7130, and they drop the cnGCP, which performs control functions. Each cnMBP2 has a dedicated 256KB of instruction memory and 128KB of data memory. A 128-bit bus provides single-cycle access to this memory, making accesses faster and more deterministic than if the DSPs used caches. The cnSBP2 has 64KB of instruction memory and 256KB of data memory.
The CNF75xx integrates 16 cnMBP2 cores and 2 cnSBP2 cores. The cnMBP2s handle symbol processing (OFDMA algorithms such as channel estimation, equalization, and demodulation) for both the uplink and downlink. The cnSBP2s are optimized for soft-bit processing tasks such as data interleaving.
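To make the link between modulation order and data rate concrete, the sketch below computes the number of bits carried by each symbol, which is the base-2 logarithm of the constellation size. The dictionary and function names are illustrative, not part of any Cavium software.

```python
# Constellation sizes for the modulation schemes named in the article.
MODULATION_ORDERS = {"QPSK": 4, "QAM-16": 16, "QAM-64": 64, "QAM-256": 256}


def bits_per_symbol(order: int) -> int:
    """Return log2(order) for a power-of-two constellation size."""
    if order < 2 or order & (order - 1):
        raise ValueError(f"constellation size must be a power of two: {order}")
    return order.bit_length() - 1


if __name__ == "__main__":
    for name, order in MODULATION_ORDERS.items():
        print(f"{name}: {bits_per_symbol(order)} bits per symbol")
```

Moving from QPSK to QAM-256 raises the payload from 2 to 8 bits per symbol, a fourfold increase in raw rate at the same symbol rate, which is why the denser constellations demand more demodulation work from the DSPs.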
The PHY includes several scheduling and coding accelerators. The scheduler works with the CPUs to schedule resource blocks to users in one TTI (transmission time interval), which consists of 1ms at 180kHz. Note that LTE schedules packets in the frequency and time domains, allocating subcarriers on every TTI. Other PHY accelerators perform turbo coding, Viterbi coding, sniffer acceleration, downlink encoding for LTE and 3G, and transmit and receive chip-rate acceleration. Sniffer acceleration is useful for determining interference from adjacent channels and for self-organizing networks (SONs). The PHY also includes a DFE (digital front end) for filtering, DPD (digital predistortion), and crest-factor reduction.
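As a rough illustration of per-TTI scheduling, the sketch below divides the resource blocks of one LTE carrier among users in a single 1ms TTI. The 180kHz block width follows from LTE's 12 subcarriers at 15kHz spacing, and the block counts per channel bandwidth come from the LTE standard. The round-robin policy and all names are assumptions for illustration only; Cavium has not disclosed the algorithm its scheduler uses.

```python
SUBCARRIER_SPACING_KHZ = 15
SUBCARRIERS_PER_RB = 12
RB_BANDWIDTH_KHZ = SUBCARRIER_SPACING_KHZ * SUBCARRIERS_PER_RB  # 180 kHz

# Resource blocks per LTE channel bandwidth (MHz), per the LTE standard.
RBS_PER_CHANNEL = {1.4: 6, 3: 15, 5: 25, 10: 50, 15: 75, 20: 100}


def allocate_round_robin(users, channel_mhz=20):
    """Assign every resource block in one TTI to a user, in rotation.

    Returns a dict mapping each user to the list of block indices it
    receives. Real schedulers also weigh channel quality and QoS; this
    hypothetical policy ignores both.
    """
    n_rbs = RBS_PER_CHANNEL[channel_mhz]
    allocation = {user: [] for user in users}
    if not users:
        return allocation
    for rb in range(n_rbs):
        allocation[users[rb % len(users)]].append(rb)
    return allocation


if __name__ == "__main__":
    result = allocate_round_robin(["ue1", "ue2", "ue3"])
    for user, blocks in result.items():
        print(user, len(blocks), "resource blocks")
```

Because the allocation is recomputed every TTI, the scheduler must finish this kind of decision within each 1ms interval, which is the motivation for hardware assistance.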
For Layer 2, the CNF75xx calculates checksums, parses packets, and automates retransmission events. It supports IPSec and over-the-air cryptography with acceleration for major algorithms including ZUC, SNOW 3G, Kasumi, and AES. On-chip packet-processing engines handle TCP and voice-over-LTE (VoLTE). Both devices integrate an application-acceleration manager for scheduling work to the CPUs and DSPs as well as for assisting with hierarchical QoS. These new components offload traffic management for two-rate three-color marking, shaping, and DWRR (deficit-weighted round robin) scheduling with six hierarchy levels.
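For readers unfamiliar with DWRR, the following is a minimal single-level software sketch of the technique. It is not Cavium's implementation: the chip performs this in hardware across six hierarchy levels, and the queue names, quanta, and function signature here are hypothetical.

```python
from collections import deque


def dwrr(queues, quanta, rounds):
    """Run deficit-weighted round robin for a fixed number of rounds.

    queues: dict mapping queue name -> deque of packet sizes in bytes
    quanta: dict mapping queue name -> bytes of credit added per round
    Returns the list of (queue name, packet size) in transmission order.
    """
    deficits = {name: 0 for name in queues}
    sent = []
    for _ in range(rounds):
        for name, queue in queues.items():
            if not queue:
                deficits[name] = 0  # idle queues do not bank credit
                continue
            deficits[name] += quanta[name]
            # Send head-of-line packets while the accumulated credit covers them.
            while queue and queue[0] <= deficits[name]:
                size = queue.popleft()
                deficits[name] -= size
                sent.append((name, size))
            if not queue:
                deficits[name] = 0
    return sent


if __name__ == "__main__":
    queues = {"voice": deque([200, 200, 200]), "data": deque([1500, 1500])}
    quanta = {"voice": 400, "data": 1000}
    print(dwrr(queues, quanta, rounds=2))
```

The quantum acts as the weight: a queue with a larger quantum earns credit faster and so receives a larger share of the link, while a large packet simply waits until enough credit has accumulated.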
The CNF75xx/74xx chips have multiple 12.5Gbps serial interfaces, which can provide the radio interface, backhaul, and interprocessor communications. The radio interface can use six CPRI or JESD204B lanes. For backhaul, the processors increase the data rate and port count to 4x10GbE. Like the earlier Fusion device, they support time stamping, IEEE 1588, and SyncE. The CNF75xx also adds two lanes of sRIO v2.1, which often connects multiple processors in large multisector base stations.
System Design Made Simple
Because the CNF75xx is highly integrated, the base-station hardware design is relatively simple. A typical design will add Ethernet PHYs, a Wi-Fi controller, and RFICs to the processor. One Ethernet port connects to the backhaul network and another provides a management interface. The external RF transceiver chip performs up/down-conversion for receive and transmit channels. Cavium works with Analog Devices and Asahi Kasei (AKM), which provide compatible RF components.
The CNF75xx integrates two PCIe Gen3 controllers for connecting Wi-Fi controllers, and two 64-bit DDR4 channels with ECC protection link to external memory for code and for packet buffering. Customers can use the sRIO interface to connect multiple devices and further scale the performance for more users and greater bandwidth in macro base stations. To attach other peripherals, the CNF75xx/74xx chips have a USB controller and miscellaneous I/O ports (SPI, eMMC, CompactFlash, UART, and GPIO).
Cavium Reaches for Radio Clouds
The intent of the C-RAN was to centralize baseband processing for multiple base stations, eliminating the need to perform this processing in each cell. In the initial design, a remote radio head (RRH) replaces the base station and is supported by central baseband units (BBUs), which connect using the Common Public Radio Interface (CPRI). Central baseband processing can save resources by flexibly allocating them wherever they are most needed: stadiums on game day, transit hubs during rush hour, and so on. The challenge with this approach is the high bandwidth required between a simple RRH and the central BBU, called fronthaul. This bandwidth requires a high-quality fiber connection, which is often unavailable. Compression technologies can reduce the bandwidth requirements; the Small Cell Forum is looking at repartitioning the RRH and BBU to further reduce the fronthaul data rate.
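The scale of the fronthaul problem can be estimated with a short calculation. The sketch below assumes commonly cited CPRI parameters for a 20MHz LTE carrier: a 30.72Msps sample rate, 15-bit I and Q samples, one control word per 15 data words, and 8b/10b line coding. These figures are assumptions and do not come from Cavium; actual deployments may use other sample widths or line codes.

```python
def cpri_line_rate_mbps(antennas, sample_rate_msps=30.72, bits_per_component=15):
    """Estimate the CPRI line rate in Mbps for uncompressed I/Q samples.

    All default parameters are assumed typical values for one 20 MHz
    LTE carrier, not figures disclosed for any Cavium product.
    """
    # Each sample has an I and a Q component, per antenna.
    payload = sample_rate_msps * 2 * bits_per_component * antennas
    with_control = payload * 16 / 15   # control-word overhead
    return with_control * 10 / 8       # 8b/10b line-coding overhead


if __name__ == "__main__":
    for n in (1, 2, 8):
        print(f"{n} antenna(s): {cpri_line_rate_mbps(n):.1f} Mbps")
```

Under these assumptions, a two-antenna sector needs roughly 2.5Gbps of fronthaul regardless of how much user traffic it carries, and the requirement grows linearly with antenna count. That fixed cost is what compression and repartitioning aim to reduce.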
For the repartitioned RAN, Cavium offers the derivative CNF74xx processor, which supports MAC-PHY splits. It provides the physical-layer functions (such as modulation, coding, FFT, and MIMO support) for LTE-Advanced, plus eight CPUs for smart RRUs (remote radio units). The CNF74xx allows the PHY and some selective portion of the time-critical Layer 2 function to reside in the RRU, and it allows the intelligent RRU to include the Layer 1 PHY, Layer 2 scheduler, and radio link controller. The upside of increasing RRU capability is a lower fronthaul rate, but the downside is less opportunity for resource reuse at the central BBU. Octeon Fusion-M enables customers to partition a minimum portion of the scheduler at the RRU and the remainder centrally, thereby optimizing the fronthaul rate and BBU reuse.
Figure 2 shows a C-RAN configuration using the CNF74xx in the RRU and Cavium’s ThunderX (see MPR 6/9/14, “ThunderX Rattles Server Market”) in the central BBU. Multiple RRUs or small cells connect to a central server through an Ethernet switch or router. In this case, Ethernet replaces CPRI at lower data rates, enabling use of copper as well as fiber media.
Figure 2. Cloud radio access network (C-RAN) using Cavium products. The CNF7xxx enables full or partial base stations at the remote radio sites, and ThunderX supports central baseband processing.
Using the CNF74xx, one customer could place the scheduler and MAC at the RRU, whereas another might also add the RLC (radio link control) to the RRU. The remainder of the baseband function would execute on the central ThunderX server (BBU). OEMs can use ThunderX to implement virtual base stations centrally. The ThunderX server supports multiple C-RAN controllers, which can each map to one or more RRUs. Cavium is currently performing proof-of-concept trials at a Tier One operator in the U.S. However, C-RAN systems built around Octeon Fusion-M and ThunderX are unlikely to ship before 2017.
Octeon Fusion-M: In a Class of Its Own
Table 1 contrasts Cavium’s CNF75xx with the largest-capacity integrated base-station processors from Freescale and Texas Instruments. All of them support 4G standards (LTE and LTE-Advanced) plus multimode operation with existing 3G user equipment. The CNF75xx, however, handles three times as many users as the competing products.
Table 1. Comparison of integrated base-station processors. Fusion-M leads in bandwidth and number of users, but its availability lags by more than a year. (Source: vendors, except *The Linley Group estimate)
Supporting 12 LTE sectors, 3,600 users, and 4x10GbE backhaul, the CNF75xx is truly in a class by itself. Using carrier aggregation, it can deliver 1,800Mbps of bandwidth on the downlink and 900Mbps on the uplink. For backhaul, it offers twice as many 10GbE ports as the B4860. TI lags with only four GbE ports. The CNF75xx’s 10GbE fronthaul is important for C-RAN deployments. We expect Cavium to announce scaled-down and lower-price versions of Fusion-M that will compete more directly with the Qonverge B4860 (see MPR 3/19/12, “Freescale’s Qonverge Goes Macro”) and KeyStone II TCI6636 (see MPR 4/2/12, “TI Boosts Base-Station Processors”).
To support the larger number of users, the CNF75xx integrates four times as many CPUs and more than twice as many DSPs as its competitors. Cavium is also targeting the fastest clock rate in this group for its MIPS CPUs. If it hits that target, the CPUs will easily beat TI’s slower Cortex-A15 cores and will run neck-and-neck with Freescale’s four Power e6500 cores, which run at 1.8GHz but can execute more instructions per clock cycle.
The CNF75xx’s DSPs operate at 800MHz compared with the more-powerful 1.2GHz DSPs from Freescale and TI. Cavium has disclosed little about the design of these cores, and independent benchmarks are unavailable. One caveat is that all the chips in this group are in production except the CNF75xx, which is scheduled to sample in 3Q15. We estimate production will start in 2Q16. Additionally, we expect the CNF75xx to consume more power than its competitors.
Cavium Poised to Gain Share
Octeon Fusion’s powerful combination of high-performance hardware and production-ready software was an instant hit for small-cell base stations. In addition to winning business at Korea Telecom and SK Telecom, Cavium is in trials with more than 20 operators. In 2014, the company claims to have shipped around 100,000 units into these and other operators. It is the leading supplier to the nascent LTE picocell market (excluding residential femtocells) and will need to replicate its success in the enterprise market.
The second-generation Fusion-M CNF75xx enhances the first-generation Octeon Fusion in several ways, doubling the number of DSPs and CPUs and boosting the clock rate to support more traffic capacity and more users, respectively. The new device changes the architecture by implementing a terabyte crossbar that interconnects the non-CPU blocks. The CNF75xx targets macro base stations that have traditionally used discrete components. Cavium can take the lead as a segment of the macro market adopts integrated base-station processors. Several leading OEMs, however, continue to develop ASICs for their macro base stations.
The company is using its entire portfolio to target C-RAN designs. It can pursue the traditional C-RAN by allowing customers to centrally stack base stations using its CNF75xx or Octeon processors. For the emerging smart remote radio units, Cavium is the first vendor to offer a solution: its CNF74xx can combine with its ThunderX server chip for baseband processing. The company has not disclosed what software it will offer for ThunderX, but it must offer C-RAN software stacks—even if only as a proof of concept. In this segment, it will compete against Intel, which is the leading C-RAN proponent and is using Xeon processors for the BBU.
Cavium’s greatest strengths are the breadth of its solutions, from small cells to large macro base stations, and the ability to redeploy its Octeon technology across multiple product lines. Consequently, customers can easily port their software across a range of base-station platforms. In fairness, Freescale has a similar strategy, its products were available earlier, and it has more base-station market share. Cavium’s new Fusion-M products stand out by supporting more sectors and users in an integrated processor.
Price and Availability
Cavium plans to sample the Fusion-M CNF75xx and CNF74xx in 3Q15. We expect production to start in 2Q16. The company has not disclosed pricing, but we estimate $500 for the full-featured product. For more information, access www.cavium.com/newsevents-Cavium-Introduces-OCTEON-Fusion-M.html.