Electronic Design
Heterogeneous System Architecture Changes CPU/GPU Software


AMD’s Heterogeneous System Architecture (HSA) hardware framework provides Heterogeneous Uniform Memory Access (hUMA) to CPUs and GPUs (Fig. 1). It radically changes the way CPUs and GPUs will interact at the software level.

 

Figure 1. AMD’s Heterogeneous System Architecture (HSA) allows CPU and GPU cores to use the same virtual memory address space so data does not have to be copied for use by different types of cores. This greatly simplifies programming and considerably improves speed.

In the past, GPUs were implemented as separate entities with their own memory and a communication channel to the host processor. The host would use the channel to move data and GPU code into the GPU memory. Initially, the GPU was used only for driving displays. The software was a closed system that only the GPU vendor could access.

Eventually the GPU vendors opened up the GPU for computational chores because the number of cores and the GPU architecture could sometimes improve speed by as much as two orders of magnitude. Not all applications show this much improvement, but many provide significant advantages over CPUs.

In Operation

GPUs have moved from their display-only chores into computation-only applications or mixed environments where the GPU handles display and computation chores at the same time, much like how a CPU handles multitasking.

Programming a GPU can get tricky because of its architecture, which synchronizes a number of cores in a more advanced single-instruction, multiple-data (SIMD) configuration. Higher-level programming frameworks like Nvidia’s CUDA (see “Is Your Personal Computer A CUDA-Enabled Speed Merchant?” at electronicdesign.com) and OpenCL (see “OpenCL 2.0, OpenGL 4.4 Officially Released” at electronicdesign.com) have made the job significantly easier by simplifying the movement of data between the CPU and GPU memory.

Moving data from one memory to another carries costs beyond the copy itself, including address translation: a pointer that is valid in CPU memory has no meaning in a separate GPU memory. Both the copying and the translation issues disappear when HSA is used because the CPU and GPU share the virtual address space.
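A toy model can make the difference concrete. The sketch below is not HSA or OpenCL code. The names `scale_kernel`, `run_with_copies`, and `run_shared` are hypothetical, and ordinary Python buffers stand in for host and device memory. It only counts the bytes that have to move in each model.

```python
def scale_kernel(buf, factor):
    """Stand-in for a GPU kernel: scales every byte in place (mod 256)."""
    for i in range(len(buf)):
        buf[i] = (buf[i] * factor) % 256


def run_with_copies(host_data, factor):
    """Discrete-memory model: copy in, compute, copy back."""
    bytes_copied = 0
    device_data = bytearray(host_data)   # host -> device copy
    bytes_copied += len(device_data)
    scale_kernel(device_data, factor)
    host_data[:] = device_data           # device -> host copy
    bytes_copied += len(device_data)
    return bytes_copied


def run_shared(host_data, factor):
    """Shared-address-space model: the kernel works on the same buffer."""
    view = memoryview(host_data)         # a reference to the data, not a copy
    scale_kernel(view, factor)
    return 0                             # no bytes were copied


a = bytearray(range(8))
b = bytearray(range(8))
copied = run_with_copies(a, 3)
shared = run_shared(b, 3)
print(copied, shared, a == b)
```

Both calls leave the same result in the buffer. Only the first one pays for moving the data twice, a cost that grows with the size of the data set.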

HSA supports existing software development frameworks like OpenCL that are currently used on GPUs and CPUs, which makes migration to HSA platforms easier. There are several ways to migrate. The simplest is to run existing code unchanged, using the CPU and GPU as in the past. That approach works, but it doesn’t take advantage of HSA. Recompiling to take advantage of HSA could significantly improve performance.

Software development becomes more interesting when compilers and operating systems have native HSA support. Popular C/C++ compilers like gcc and LLVM will support HSA. Toolchains for other languages, including Java, will also support it.

The OpenJDK Sumatra Project is designed to put Java on top of HSA, generating a combination of CPU and GPU code depending upon the application (Fig. 2). HSAIL (HSA Intermediate Language) is a virtual instruction set for the GPU. Its byte code is designed to mimic GPU functionality at a generic level, much as the Java Virtual Machine (JVM) does for CPUs. The HSAIL Finalizer generates native GPU code from the byte code emitted by compilers like gcc or LLVM. This maintains Java’s portability while allowing a compatible Java application to run on a range of CPU/GPU combinations.
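The two-stage flow, portable code first and target-specific code second, can be illustrated with a toy example. Everything in it is hypothetical: the two-instruction program is not HSAIL, and `finalize` is only a stand-in for the idea of a finalizer that lowers one portable program for different targets. The target names "scalar" and "wide" are made up for the sketch.

```python
# Portable program: (opcode, constant) pairs, identical for every target.
PORTABLE_PROGRAM = [("mul", 2), ("add", 1)]


def apply_op(op, k, x):
    """Execute one portable instruction on one value."""
    if op == "mul":
        return x * k
    if op == "add":
        return x + k
    raise ValueError(f"unknown opcode: {op}")


def finalize(program, target):
    """Lower the portable program to a callable for one made-up target."""
    if target == "scalar":
        # One element at a time, all instructions per element.
        def run(data):
            out = []
            for x in data:
                for op, k in program:
                    x = apply_op(op, k, x)
                out.append(x)
            return out
    elif target == "wide":
        # One instruction at a time, applied across all elements.
        def run(data):
            out = list(data)
            for op, k in program:
                out = [apply_op(op, k, x) for x in out]
            return out
    else:
        raise ValueError(f"unknown target: {target}")
    return run


scalar_result = finalize(PORTABLE_PROGRAM, "scalar")([1, 2, 3])
wide_result = finalize(PORTABLE_PROGRAM, "wide")([1, 2, 3])
print(scalar_result, wide_result)
```

The portable program never changes. Only the lowering step knows about the target, which is why the same application can move between different CPU/GPU combinations.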

 

Figure 2. Java will eventually have seamless HSA support via the Sumatra project, which will generate code for CPU and GPU cores as necessary.

HSA also includes hQ (heterogeneous queuing), which handles CPU and GPU task management. It enables tasks running on each platform to invoke and interact with tasks running on the other platform. Also, the AMD CodeXL tool suite provides GPU debugging as well as CPU and GPU profiling. It is currently available as a Microsoft Visual Studio plug-in and as a standalone application running under Windows or Linux.
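The pattern of each side handing work to the other can be sketched with two plain queues. This is a single-threaded simulation, not the hQ interface: `enqueue`, `drain`, and the three task functions are hypothetical names, and the "cpu" and "gpu" queues are ordinary Python deques.

```python
from collections import deque

queues = {"cpu": deque(), "gpu": deque()}
log = []


def enqueue(target, task, *args):
    """Place a task on the queue of the named side."""
    queues[target].append((task, args))


def cpu_start(values):
    log.append("cpu: start")
    enqueue("gpu", gpu_square, values)   # CPU task dispatches GPU work


def gpu_square(values):
    squared = [v * v for v in values]
    log.append("gpu: squared")
    enqueue("cpu", cpu_sum, squared)     # GPU task dispatches CPU work


def cpu_sum(values):
    log.append(f"cpu: sum={sum(values)}")


def drain():
    """Run queued tasks, alternating sides, until both queues are empty."""
    while queues["cpu"] or queues["gpu"]:
        for name in ("cpu", "gpu"):
            if queues[name]:
                task, args = queues[name].popleft()
                task(*args)


enqueue("cpu", cpu_start, [1, 2, 3])
drain()
print(log)
```

The point of the sketch is symmetry: neither side is only a master or only a worker, because a task on either queue can create work for the other.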

On The Market

AMD’s Kaveri desktop APU will be the first platform to include HSA support. Kaveri chips will be available in 2014. Bolt, a C++ template library modeled on the Standard Template Library (STL), is optimized for HSA heterogeneous computing platforms. Bolt will let C++ programmers utilize an HSA APU without resorting to the more complex OpenCL approach.

HSA is not specific to a CPU or GPU architecture. Not all vendors that will be building CPU/GPU SoCs will adopt it, but some versions will be built around Arm CPUs and other GPU architectures. The HSA Foundation that was formed to manage the architecture includes major chip vendors such as AMD, Arm, Texas Instruments, Samsung, and Qualcomm.
