The demands of multimedia are pushing hardware to extremes, requiring advanced architectures and support for multimedia single-instruction, multiple-data (SIMD) instructions. DSP and graphics support also are part of the mix. Yet ARM's Cortex-A9 and MIPS32's 74K 32-bit cores both break the 1-GHz barrier.
Chips based on these architectures will wind up in high-volume applications such as residential gateways with Voice over IP (VoIP) support, digital TV applications like set-top boxes, gaming, and automotive infotainment.
ARM Cortex-A9 Core
The Cortex-A9 targets the top end of ARM's product line (Fig. 1). It can utilize ARM's multicore architecture, which has been used with the ARM11 (see "ARMv7 Makes A Move To Multicore" at www.electronicdesign.com, ED Online 16156). Up to four Cortex-A9 cores can be used with this approach.
The core supports the Thumb-2 instruction set and the Jazelle Java hardware acceleration implementation. As with most core designs, custom instructions can be added. Standard options such as SIMD support can be added as well. In this case, ARM's Neon advanced SIMD support takes advantage of the DSP enhancements available in the Cortex-A9. The core is also often combined with the Mali graphics processing unit.
The Cortex-A9 architecture is designed to maximize parallel instruction execution. Its eight-stage pipeline can handle out-of-order instruction flow using a six-entry queue. The instruction dispatch stage can forward up to four instructions per clock cycle. Processing unit pipelines execute independently.
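To make the idea of out-of-order issue concrete, here is a minimal Python sketch of a generic textbook model. It is not ARM's actual microarchitecture: the names Instr and schedule, the latencies, and the sample program are all hypothetical, and mapping the article's six-entry queue and four-per-cycle dispatch onto the window and width parameters is an assumption made purely for illustration. The model tracks only true (read-after-write) dependencies and ignores execution-unit limits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    name: str          # label used in the result
    dest: str          # register written
    srcs: tuple = ()   # registers read
    latency: int = 1   # cycles until the result is available


def schedule(program, window=6, width=4):
    """Return {instruction name: issue cycle} for a toy out-of-order model.

    window: how many of the oldest un-issued instructions are visible
    width:  how many instructions may issue in one cycle
    """
    # For each instruction, find the earlier instructions that produce
    # its source registers (true dependencies only).
    last_writer = {}
    producers = []
    for i, ins in enumerate(program):
        producers.append([last_writer[r] for r in ins.srcs if r in last_writer])
        last_writer[ins.dest] = i

    done_at = {}       # instruction index -> cycle its result is ready
    issue_cycle = {}   # instruction name -> cycle it issued
    pending = list(range(len(program)))
    cycle = 0
    while pending:
        issued_now = []
        for i in pending[:window]:
            if len(issued_now) == width:
                break
            # Ready only if every producer has issued and finished.
            if all(p in done_at and done_at[p] <= cycle for p in producers[i]):
                issued_now.append(i)
        for i in issued_now:
            issue_cycle[program[i].name] = cycle
            done_at[i] = cycle + program[i].latency
            pending.remove(i)
        cycle += 1
    return issue_cycle


# Hypothetical four-instruction program: "add" waits on a slow load,
# while "mul" and "sub" are independent of it.
program = [
    Instr("load", "r1", ("r0",), latency=3),
    Instr("add", "r2", ("r1",)),
    Instr("mul", "r3", ("r4",)),
    Instr("sub", "r5", ("r3",)),
]
print(schedule(program))
```

In this sketch, "mul" and "sub" issue ahead of "add", which is still waiting for the load. A strictly in-order pipeline would have held both of them behind the stalled instruction.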
Two AMBA 3 AXI external bus interfaces are used to handle this level of throughput. The use of ECC RAM indicates the need for high reliability given the Cortex-A9's target applications.
Debugging support is very important in a multicore solution. ARM addresses this with its CoreSight debug and trace capability, which spans the entire system-on-a-chip, including multiple ARM processors, DSPs, and intelligent peripherals.
ARM has a range of PrimeCell components, such as the interrupt and cache controllers, which can be combined to form a system.
MIPS32 74K Core
MIPS has a similar complement of peripherals that can be tied to its MIPS32 74K core, including instruction extension with its CorExtend support (Fig. 2). The MIPS32 74K uses a 17-stage pipeline, also with out-of-order dispatch. Likewise, its multiple execution units operate in parallel. Its stall-free ALU is linked into the DSP-style support of the multiply/divide unit instead of having a completely separate execution unit.
An advanced prediction unit enables all of this parallel, out-of-order processing to occur. Most 32-bit solutions in this range, like the Cortex-A9, implement this approach in some fashion. In MIPS's case, though, the 74K uses dual, independent eight-entry instruction queues. MIPS keeps three branch history tables to handle prediction more efficiently. Also, a return stack is maintained in hardware.
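The article does not say how the 74K's branch history tables or return stack are organized, so the following Python sketch shows only the common textbook versions of both structures: a table of 2-bit saturating counters and a small return-address stack. The class names, table size, stack depth, and indexing scheme are assumptions for illustration, not MIPS's design.

```python
class BranchHistoryTable:
    """Table of 2-bit saturating counters indexed by low program-counter bits."""

    def __init__(self, entries=256):
        self.entries = entries
        # Counter values 0-1 predict "not taken", 2-3 predict "taken".
        # Every entry starts at 1 (weakly not taken).
        self.counters = [1] * entries

    def _index(self, pc):
        # Assumes 4-byte instructions, so the two low bits carry no information.
        return (pc >> 2) % self.entries

    def predict(self, pc):
        return self.counters[self._index(pc)] >= 2

    def update(self, pc, taken):
        i = self._index(pc)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)


class ReturnStack:
    """Fixed-depth stack that predicts the target of a function return."""

    def __init__(self, depth=8):
        self.depth = depth
        self.stack = []

    def push(self, return_addr):
        if len(self.stack) == self.depth:
            self.stack.pop(0)  # overflow: discard the oldest entry
        self.stack.append(return_addr)

    def pop(self):
        return self.stack.pop() if self.stack else None


def count_mispredictions(bht, pc, outcomes):
    """Run a sequence of actual branch outcomes and count wrong predictions."""
    misses = 0
    for taken in outcomes:
        if bht.predict(pc) != taken:
            misses += 1
        bht.update(pc, taken)
    return misses


# A loop branch that is taken nine times and then falls through.
loop = [True] * 9 + [False]
bht = BranchHistoryTable()
print(count_mispredictions(bht, 0x100, loop))  # first pass through the loop
print(count_mispredictions(bht, 0x100, loop))  # second pass, counters warmed up
```

The point of the two-bit counter is that a single loop exit does not flip the prediction, so the second pass through the same loop mispredicts only on the exit itself.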
The MIPS architecture's inclusion of shadow registers permits zero-overhead context switching. The DSP support adds three additional pairs of accumulator registers. Low-power operation is accomplished using a range of approaches, including the use of fine-grain clock gating. Each major block can be clocked independently as well. For example, the dual-pipeline, asymmetric dual-issue floating-point unit can run at its own clock frequency.
The MIPS JTAG debugging architecture provides cross-CPU breakpoint support. The debug controller is chainable for multi-CPU management. Virtualization support is key for the efficient handling of virtual-machine managers (VMMs). VMM support isn't exclusive to high-end 64-bit x86 platforms.
Head To Head
The MIPS32 74K and Cortex-A9 are essentially the same in terms of performance and target audience. They're rated at 1.8 DMIPS/MHz and 2.0 DMIPS/MHz, respectively. Both can be configured with caches of different sizes and complexity, and they can be surrounded with a range of standard peripherals. This silicon ecosystem often makes the difference when creating a system design.
Both cores utilize a TSMC 65-nm generic process. MIPS is the first company out of the chute with implementations in the gigahertz range. The Cortex-A9 will likely start out at about 500 MHz, climbing quickly to 1 GHz, given the demand for higher performance.
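The per-megahertz ratings and the clock speeds quoted above can be combined into rough Dhrystone throughput estimates. The short Python sketch below does that arithmetic. The helper names are hypothetical, and the results are simple products of the quoted figures rather than measured benchmark numbers.

```python
# Ratings quoted in the article, in DMIPS per MHz.
RATINGS = {"MIPS32 74K": 1.8, "Cortex-A9": 2.0}


def dmips(dmips_per_mhz, clock_mhz):
    """Estimated Dhrystone MIPS at a given clock frequency."""
    return dmips_per_mhz * clock_mhz


def clock_to_match(target_dmips, dmips_per_mhz):
    """Clock frequency in MHz needed to reach a target DMIPS figure."""
    return target_dmips / dmips_per_mhz


for core, clock_mhz in [("MIPS32 74K", 1000), ("Cortex-A9", 500), ("Cortex-A9", 1000)]:
    estimate = dmips(RATINGS[core], clock_mhz)
    print(f"{core} at {clock_mhz} MHz: about {estimate:.0f} DMIPS")
```

By this estimate, a 74K at 1 GHz works out to roughly 1800 DMIPS, while a Cortex-A9 at 500 MHz comes to about 1000 DMIPS and would need to run near 900 MHz to match it.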
The two cores are close in size as well. The 74K core is about 1.7 mm², while the Cortex-A9 is about 1.5 mm². The difference is minor, and the final size of a chip will vary significantly based upon other factors such as cache size, peripherals, and the number and width of buses employed. Designers have a wide range of options with both architectures.
The companies take a similar third-party approach to software. They also include their own C/C++ tools. ARM has its RealView compiler suite, while the MIPS Software Toolkit combines open-source tools that have been optimized for the MIPS platforms.