Take Advantage Of Multicore Platforms

Multicore is everywhere these days, from laptops to servers. Getting started programming a multicore PC is as easy as downloading Intel’s Threading Building Blocks (see “Threads Make The Move To Open Source”). It’s just one of the many frameworks designed to take advantage of the multicore hardware that is readily available.

Even C++ is making it easier to handle multithreaded programming. The new C++ standard in the works (C++0x is the unofficial name) adds a new thread library patterned after Java. It will finally define a memory model for multithreaded code. The lack of one has been the bane of portability, since each implementation behaves differently. Even thread-local storage will be part of the plan.
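As a rough illustration of what the standard thread library looks like in use, here is a minimal sketch. It assumes the library in the form that was eventually standardized as C++11 (`std::thread` and the `thread_local` keyword). The function name `parallel_sum` and the counter `chunks_done` are hypothetical, invented for this example only.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Thread-local storage: every thread gets its own independent copy.
thread_local int chunks_done = 0;

// Hypothetical helper: sums `data` by splitting it across `workers` threads.
long long parallel_sum(const std::vector<int>& data, unsigned workers) {
    assert(workers > 0);
    std::vector<long long> partial(workers, 0);  // one result slot per thread
    std::vector<std::thread> pool;
    const std::size_t chunk = (data.size() + workers - 1) / workers;

    for (unsigned w = 0; w < workers; ++w) {
        pool.emplace_back([&data, &partial, chunk, w] {
            // Clamp the chunk boundaries so short inputs are handled safely.
            const std::size_t begin = std::min(data.size(), w * chunk);
            const std::size_t end = std::min(data.size(), begin + chunk);
            partial[w] = std::accumulate(data.begin() + begin,
                                         data.begin() + end, 0LL);
            ++chunks_done;  // modifies only this worker's copy
        });
    }
    for (auto& t : pool) t.join();  // wait for every worker to finish

    return std::accumulate(partial.begin(), partial.end(), 0LL);
}
```

A caller might pass `std::thread::hardware_concurrency()` as the worker count to match the number of available cores, checking for a return value of zero first. Each worker writes only to its own slot in `partial`, so the sketch needs no locks.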

Of course, you can always stick with Java for multithreaded work. Java has supported threads from the start. As with other platforms like Windows, the hardware keeps getting better to the point where almost anything will run Java, and lots of cores make it easy to handle lots of Java.

Or give Scala a try (see “If Your Programming Language Doesn’t Work, Give Scala A Try”). It is one of many programming languages that are designed to make parallel programming easier, and parallel programs are the way to take advantage of multicore hardware.

On the hardware side, there are even low-cost multicore options for embedded applications like XMOS’s XS1-L1 and quad-core XS-G4 (see “Interconnects Matter”). Each core runs eight threads. These devices are programmed using C, C++, and XC, a C variant that takes advantage of the hardware.

FPGA LETS YOU ROLL YOUR OWN

Multicore systems and clusters are off-the-shelf items, but they aren’t the only way to go with multicore. FPGAs offer a custom alternative. High-capacity FPGAs can handle multiple soft-core processors with two added advantages: they’re customizable, and they can be surrounded by any logic a designer can dream up.

Every FPGA vendor has its own soft-core processors in its portfolio, and they aren’t the only options. Designing your own works too. But there are standard soft cores from the lowly 8051 to 32-bit standards such as Freescale’s ColdFire (see “Cold, Dense, And Gratis MCU Core Targets FPGAs”) and Arm’s Cortex-M1 (see “FPGAs Pushing MCUs As The Platform Of Choice”). The Cortex-M1 is compatible with the Cortex-M0 (see “32-Bit Architecture Changes The Power Game For Micros”).

Working with FPGAs is getting easier and less expensive with platforms like Altium’s NanoBoard NB3000 (Fig. 1). The entire system, including the Altium Designer development software, only costs $400. It can handle FPGAs such as the Xilinx Spartan-3, the Altera Cyclone II, and the Lattice Semiconductor LatticeECP2. Just pop in the appropriate module.

The FPGAs have access to a 240-by-320 thin-film transistor (TFT) LCD panel, a high-quality stereo subsystem, USB 2.0, and four-channel 8-bit digital-to-analog converters (DACs) and analog-to-digital converters (ADCs). They also offer 1.5 Mbytes of SRAM, 64 Mbytes of SDRAM, 16 Mbytes of flash, and 4 Mbytes of serial peripheral interface (SPI) serial flash. This provides plenty of support for multicore designs, as well as for simpler projects that don’t have multiple cores in the mix.

USE THAT VIDEO CARD

Graphics processing units (GPUs) like Nvidia’s GeForce GTX 295 (Fig. 2) are additional sources of computing power. This dual-chip platform sports 480 cores that are accessible via CUDA, a parallel programming platform from Nvidia (see “Match Multicore With Multiprogramming”). The platform also supports OpenCL, the Khronos Group’s open, royalty-free standard for cross-platform parallel programming.

Nvidia’s platform has a single-instruction, multiple-thread (SIMT) architecture. It is the same architecture used to implement the Tesla T10 S1070 (see “SIMT Architecture Delivers Double-Precision Teraflops”). The T10 is designed only as a compute engine, but the GeForce boards can do double duty, performing their graphics tasks and rendering images to one or more screens while also handling application computing chores. These video adapters are silently running in many PCs, just waiting to crank through large arrays of data.
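The SIMT idea, many threads running the same instructions on different data elements, can be sketched in plain C++. This is not CUDA or OpenCL code, and it does not reproduce how GPU hardware schedules threads in lockstep groups. It is only a CPU stand-in for the programming model, and the names `saxpy_kernel` and `launch_saxpy` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// The "kernel": every thread runs this same code and picks its
// data element by index.
void saxpy_kernel(std::size_t i, float a,
                  const std::vector<float>& x, std::vector<float>& y) {
    y[i] = a * x[i] + y[i];
}

// Hypothetical CPU stand-in for a GPU launch: `lanes` threads each walk
// the arrays with a stride of `lanes`, so no two threads share an index.
void launch_saxpy(unsigned lanes, float a,
                  const std::vector<float>& x, std::vector<float>& y) {
    assert(lanes > 0 && x.size() == y.size());
    std::vector<std::thread> pool;
    for (unsigned lane = 0; lane < lanes; ++lane) {
        pool.emplace_back([&x, &y, a, lane, lanes] {
            for (std::size_t i = lane; i < x.size(); i += lanes) {
                saxpy_kernel(i, a, x, y);
            }
        });
    }
    for (auto& t : pool) t.join();  // wait until every lane is done
}
```

On a GPU the equivalent kernel would be launched across hundreds of hardware threads at once, which is why this style suits large arrays where each element can be processed independently.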

Multicore is simply getting more ubiquitous. Frameworks and programming language enhancements can help take advantage of the hardware, but programmers need to use these features to utilize the improvement in computational power. Platforms such as FPGAs and GPUs are giving developers even more choices.

ALTIUM • www.altium.com
KHRONOS OPENCL • www.khronos.org/opencl
NVIDIA • www.nvidia.com
THREAD BUILDING BLOCKS • www.threadbuildingblocks.com