What Will You Do With 1 TFLOP Of Double-Precision Power?

July 24, 2008

Don’t look now, but you may have a supercomputer on your desk. It’s hiding in your video card. While it won’t make your word processor faster, it may improve the transcoding speed when you’re moving movies to your mobile Internet device.

William G. Wong

Intel and AMD have been pushing multicore in the 64-bit x86 realm, though both offer only four-core chips at this point. Intel’s 80-core Polaris is designed to push the envelope, but AMD and NVidia have other ideas, at least when it comes to stream computing.

Multicore has flourished in graphics processing units (GPUs). Until a few years ago, GPUs were effectively black-box systems designed to improve gaming and deliver fast updates for CAD packages and medical applications. The closest a programmer got to the GPU was the video device driver.

That was then. Now, NVidia and AMD/ATI not only have opened up their precious GPUs, they also have delivered an impressive collection of software and application programming interfaces (APIs). We’re now into third-generation boards, some of which specifically target areas such as stream computing.

NVidia’s Tesla C1060, designed for parallel computing, lacks a video output (Fig. 1). Still, the board often will be used for video preprocessing chores such as image analysis and ray tracing, with another video card providing rendering services.

The C1060 uses the same single-instruction, multiple-thread (SIMT) architecture as NVidia’s GeForce video adapters, just with many more cores, and packs 4 Gbytes of memory for its 240 cores. In addition, the C1060 does double-precision arithmetic, while the GeForce products are single-precision, for now.
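To give a feel for what SIMT means in practice, here is a minimal CPU-side sketch in plain C. It assumes a small group of threads ("lanes") that all step through the same instruction together, with a per-lane flag deciding which lanes commit a result on each side of a branch. The names `LANES` and `simt_abs` and the group width of 8 are hypothetical choices for illustration. This models the programming idea only, not how either vendor's hardware is actually built.

```c
#include <stddef.h>

#define LANES 8  /* hypothetical group width, chosen only for illustration */

/* Computes out[lane] = |in[lane]| the way a SIMT group might:
   every lane follows the same instruction stream, and a per-lane
   flag controls which lanes commit results on each branch path. */
static void simt_abs(const int *in, int *out)
{
    int active[LANES];
    size_t lane;

    /* Step 1: all lanes evaluate the same branch condition. */
    for (lane = 0; lane < LANES; ++lane)
        active[lane] = in[lane] < 0;

    /* Step 2: the "then" path is issued once for the whole group;
       only lanes whose flag is set write a result. */
    for (lane = 0; lane < LANES; ++lane)
        if (active[lane])
            out[lane] = -in[lane];

    /* Step 3: the "else" path is issued once for the remaining lanes. */
    for (lane = 0; lane < LANES; ++lane)
        if (!active[lane])
            out[lane] = in[lane];
}
```

The point of the sketch is that a branch costs the group both paths when its lanes disagree, which is one reason some applications map onto GPUs better than others.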

AMD’s FireStream 9250 is based on the company’s double-precision RV770 chip, which also is found in AMD’s Radeon HD 4850 (Fig. 2). It has 160 cores that normally are used as shaders when tasked with graphics chores.

SOFTWARE SUPPORT

These latest boards target high-performance computing applications, though the software used to create applications is equally applicable to GPUs in video boards. While the video boards may have to perform double duty by running a parallel application and displaying a windowed desktop, the amount of performance available is often sufficient to handle both.

The first step was to provide runtime libraries that delivered array manipulation services. Yet the real power came when programmers were able to write applications that ran on the GPU. NVidia’s Compute Unified Device Architecture (CUDA) and AMD’s FireStream software development kit (SDK) can do this, and they’re available as free downloads. A forthcoming version of CUDA will even generate code that runs on non-GPU platforms such as multicore x86 processors.

The C code used with these GPU tools is augmented to explicitly annotate the parallel aspects of the programs. Developers will need to try this approach on their own code, because not all applications can benefit from the tools and GPUs.
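As a rough illustration of that programming style, the sketch below emulates a one-element-per-thread kernel in plain C so it can run on any CPU. The names `thread_ctx`, `saxpy_kernel`, and `launch_saxpy` are hypothetical. In CUDA C, the kernel function would instead carry the `__global__` qualifier, the index fields would come from the built-in `blockIdx`, `blockDim`, and `threadIdx` variables, and the two loops in the launcher would be replaced by a `<<<blocks, threads>>>` launch that runs the threads in parallel on the GPU.

```c
#include <stddef.h>

/* Hypothetical stand-in for the per-thread indices a GPU runtime supplies. */
typedef struct {
    size_t block_idx;   /* which block this thread belongs to */
    size_t block_dim;   /* number of threads in each block */
    size_t thread_idx;  /* position of this thread inside its block */
} thread_ctx;

/* Kernel body: each logical thread updates exactly one element,
   y[i] = a * x[i] + y[i]. */
static void saxpy_kernel(thread_ctx ctx, size_t n, float a,
                         const float *x, float *y)
{
    size_t i = ctx.block_idx * ctx.block_dim + ctx.thread_idx;
    if (i < n)  /* guard: the last block may extend past the array */
        y[i] = a * x[i] + y[i];
}

/* CPU emulation of a kernel launch: the loops visit every
   (block, thread) pair serially, where a GPU would run them in parallel. */
static void launch_saxpy(size_t n, float a, const float *x, float *y,
                         size_t threads_per_block)
{
    size_t blocks, b, t;

    if (threads_per_block == 0)
        return;  /* nothing sensible to launch */

    /* Round up so every element is covered by some thread. */
    blocks = (n + threads_per_block - 1) / threads_per_block;

    for (b = 0; b < blocks; ++b) {
        for (t = 0; t < threads_per_block; ++t) {
            thread_ctx ctx = { b, threads_per_block, t };
            saxpy_kernel(ctx, n, a, x, y);
        }
    }
}
```

Calling `launch_saxpy(n, a, x, y, 4)` on two arrays of length `n` updates `y` in place. The key design point is that the kernel contains no loop over the data: the parallelism is expressed by the launch, which is why loops whose iterations depend on one another are harder to move to a GPU.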
