The transistor count keeps following Moore’s Law. But unless you’re at the low end of the 32-bit spectrum or below, multicore is the only route to more powerful platforms. It addresses the clock-speed and power issues that now play a much bigger part in designing new chips. Without multicore, the drive for more processing power would come to a grinding halt.
This has led to a plethora of multicore designs, like Freescale’s QorIQ P4080. This eight-core Power Architecture chip is designed for communications applications (see “Multicore And More”). It’s chock-full of high-speed serializers-deserializers (SERDES) for communication chores using interfaces such as Ethernet, XAUI, and Serial RapidIO. It also incorporates pattern matching and encryption support that runs at 10-Gbit/s speeds.
Intel’s Core i7 and Xeon Nehalem-EX are where mainstream computing goes for higher core counts (see “Match Multicore With Multiprogramming”). Each of the Nehalem-EX’s eight cores supports Intel’s Hyper-Threading (HT) technology, which doubles the number of active threads on each core. The Core i7 also incorporates 8 Mbytes of Smart Cache, a three-channel memory architecture, and Turbo Boost.
The Intel and Freescale chips can operate in symmetric multiprocessing (SMP) or asymmetric multiprocessing (AMP) modes with shared memory support. They easily support applications that can be multithreaded, but their core counts pale in comparison with other architectures like Nvidia’s T10, which has 240 cores (see “SIMT Architecture Delivers Double-Precision Teraflops”).
It’s All About The Software
Nvidia opened its GPU to be a general-purpose computing platform. Using parallel programming frameworks like Nvidia’s CUDA and the Khronos Group’s OpenCL, programmers can access the underlying hardware in a relatively generic fashion. OpenCL runs on a range of multicore platforms, including x86-based systems. Intel Threading Building Blocks is a comparable environment that takes advantage of Intel platforms, though it is a cross-platform solution as well (see “Parallel Programming Is Here To Stay”).
Freescale pushes the software framework to the application level with its VortiQa network processing software. It takes advantage of the QorIQ multicore chips and is tuned to deliver communication services such as VPN connections and intrusion prevention systems (IPS) as well as the usual firewall and NAT support.
Matlab from the MathWorks highlights the range of programmer interaction with multicore that many frameworks must contend with. At the highest level, Matlab delivers multicore support transparently. This is very effective because Matlab applications typically rely heavily on array manipulation, and large datasets are common.
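Programmers needing more control can use programming statements like parfor, which spreads the iterations of a loop across multiple workers. Developers also need to deal with new concepts such as distributed arrays and single-program multiple-data (SPMD) support. SPMD differs from Nvidia’s single-instruction multiple-thread (SIMT) model and the more generic single-instruction multiple-data (SIMD) approach in that each worker runs the whole program on its own portion of the data rather than sharing a single instruction stream.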
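To make the two styles concrete, here is a minimal sketch in Python rather than Matlab. The helper names parfor, spmd, and partial_sum are hypothetical and chosen only for illustration; they are not Matlab's API. The sketch also uses a thread pool to keep it short, on the assumption that the pattern matters more than the speedup. In CPython, threads do not run pure-Python code on several cores at once, so a process pool would be the usual substitute for compute-bound work.

```python
from concurrent.futures import ThreadPoolExecutor


def parfor(body, n, workers=4):
    """Parfor-style loop (hypothetical helper): run body(i) for each i in
    range(n) on a pool of workers. Results come back in index order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(body, range(n)))


def spmd(program, data, workers=4):
    """SPMD-style run (hypothetical helper): every worker executes the same
    program, each on its own contiguous slice of the data."""
    chunk = (len(data) + workers - 1) // workers  # ceiling division
    slices = [data[r * chunk:(r + 1) * chunk] for r in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each call receives a worker rank and that worker's local slice.
        return list(pool.map(program, range(workers), slices))


def partial_sum(rank, local):
    # The same function runs on every worker; only rank and local data differ.
    return sum(local)


# Loop-level parallelism: iterations are independent of one another.
squares = parfor(lambda i: i * i, 8)

# SPMD: four workers each sum a quarter of the data, then results are combined.
partials = spmd(partial_sum, list(range(100)), workers=4)
total = sum(partials)
```

In the loop-level version the programmer only promises that iterations are independent. In the SPMD version the programmer also decides how the data is split and how the partial results are combined, which is where distributed arrays come in.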
Things get more complex when job scheduling and explicit interfacing to the Message Passing Interface (MPI) come into play. Yet the range of options narrows when a specific environment like Matlab is considered. Unfortunately, multicore doesn’t keep pace with Moore’s Law if parallelization doesn’t work.
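One common way to put numbers on that last point is Amdahl's law, which the text above does not name but which models the same limit: if only a fraction of a program's runtime can be spread across cores, the serial remainder caps the speedup. The sketch below is illustrative only. The function name amdahl_speedup is hypothetical, the 90% parallel fraction is an assumed figure, and the core counts of 8 and 240 are borrowed from the chips mentioned earlier.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Ideal speedup under Amdahl's law when only parallel_fraction of the
    runtime scales with the number of cores."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Assumed workload: 90% of the runtime parallelizes perfectly.
for cores in (2, 8, 240):
    print(cores, round(amdahl_speedup(0.90, cores), 2))
```

Under that assumption, 8 cores yield a speedup of roughly 4.7 and 240 cores still fall short of 10, because the 10% serial portion dominates once the parallel part has been divided up.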