Wireless Systems Design

C-Code Algorithms Infiltrate Hardware

This algorithm-to-tapeout synthesis tool helps designers weigh tradeoffs between software and hardware implementations in wireless designs.

Few areas of embedded design are more challenging than the development of mobile wireless products. In this arena, designers must carefully balance overall performance against power consumption and time-to-market pressures. Tools that assist in this delicate balancing act are among the resources most sought after by wireless developers.

Building on research that was originally conducted at Hewlett-Packard Labs, a new company known as Synfora now claims to have the first true "algorithm-to-tapeout" synthesis technology. Called Program In Chip Out, or PICO, this technology promises to greatly reduce design risks by enabling the early exploration of architectural design alternatives. Specifically, PICO creates efficient hardware from compute-intensive, algorithmic C descriptions. This early exploration of "what-if" performance and power scenarios allows wireless designers to find the optimal mix of hardware and software implementations.

Tradeoff analysis between system performance and power consumption is especially critical in the design of algorithm-intensive applications. Such applications abound in wireless systems, ranging from digital-baseband processing to MPEG-4 coding and decoding. Although compute-intensive algorithmic applications are originally written as C software, performance and power issues often mandate a hardware solution.

The company's first commercialization of PICO technology is called PICO Express. This tool takes compute-intensive blocks of algorithmic code (typically representing items like Viterbi decoders and data filters) and accelerates them in hardware. PICO Express automatically generates the register-transfer-level (RTL) code that's needed to build hardware accelerators. This hardware interfaces directly with the host processor (typically, but not necessarily, an ARM core). In doing so, it implements the compute-intensive functions of the overall algorithm.
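To make the idea concrete, a compute-intensive block of the "data filter" variety might look something like the fixed-point FIR kernel below. This is an illustrative sketch in plain C, not Synfora code and not written in any PICO-specific input style; the function name fir_filter and its parameters are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example kernel: a fixed-point FIR filter written as a
 * sliding dot product.
 *   out[n] = sum over k of coeff[k] * in[n + k],  n = 0 .. nout-1
 * The caller must supply nout + ntaps - 1 input samples.
 * Assumes the accumulated sum fits in 32 bits for the data in use. */
void fir_filter(const int16_t *in, const int16_t *coeff,
                int32_t *out, size_t nout, size_t ntaps)
{
    assert(ntaps > 0);
    for (size_t n = 0; n < nout; n++) {
        int32_t acc = 0;
        /* Inner multiply-accumulate loop: the compute-intensive part. */
        for (size_t k = 0; k < ntaps; k++) {
            acc += (int32_t)in[n + k] * coeff[k];
        }
        out[n] = acc;
    }
}
```

The nested loops with a regular access pattern and a multiply-accumulate core are what make a block like this a natural candidate for hardware acceleration.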

Designers have full discretion in selecting which blocks of code to move to hardware. Once those blocks are selected, PICO Express can perform an analysis of the selected code, called a space walk, to examine power and performance tradeoffs. For example, the power consumption of the selected algorithmic block can be checked across a range of clock frequencies and throughput targets. The tool can then output a variety of RTL implementations from which the designer makes the final selection.
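How PICO Express conducts its space walk internally is not described here. As a rough illustration of the underlying idea, though, a set of candidate implementations can be reduced to those worth a designer's attention by discarding every candidate that another one beats on both throughput and power (a Pareto filter). The design_point type, its fields, and the function names below are hypothetical, and any figures fed into it would be the designer's own estimates.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record for one candidate implementation. */
typedef struct {
    double clock_mhz;   /* clock frequency of this candidate */
    double throughput;  /* higher is better (e.g. samples per second) */
    double power_mw;    /* lower is better */
} design_point;

/* True if a is at least as good as b on both axes and strictly
 * better on at least one. */
static bool dominates(const design_point *a, const design_point *b)
{
    bool no_worse = a->throughput >= b->throughput &&
                    a->power_mw <= b->power_mw;
    bool better = a->throughput > b->throughput ||
                  a->power_mw < b->power_mw;
    return no_worse && better;
}

/* Sets keep[i] to true for every candidate that no other candidate
 * dominates. Returns the number of candidates kept. */
size_t pareto_front(const design_point *pts, size_t n, bool *keep)
{
    assert(pts != NULL && keep != NULL);
    size_t kept = 0;
    for (size_t i = 0; i < n; i++) {
        keep[i] = true;
        for (size_t j = 0; j < n; j++) {
            if (j != i && dominates(&pts[j], &pts[i])) {
                keep[i] = false;
                break;
            }
        }
        if (keep[i]) {
            kept++;
        }
    }
    return kept;
}
```

The surviving points form the tradeoff curve; choosing among them remains a judgment call about how much power a given throughput is worth.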

Along with the RTL description, PICO Express provides a synthesis script, test bench, and software-driver code. It therefore enables the integration of the RTL into the existing system-on-a-chip (SoC). The tool provides checking and validation of the generated RTL code, including bit-accurate C simulation to detect overflows. A program verifier ensures that the design creates a highly efficient hardware accelerator. One of the main testing activities is a perturbation test, which validates that the structures added will not cause functional or timing failures.
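The idea behind a bit-accurate C simulation can be sketched as follows: run the arithmetic in a wide type, then check every intermediate result against the narrower width that the hardware register would actually have. This is a generic illustration rather than the tool's own mechanism; fits_signed, first_overflow, and the acc_bits parameter are names invented for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if value fits in a signed two's-complement field of `bits` bits. */
static bool fits_signed(int64_t value, unsigned bits)
{
    assert(bits >= 1 && bits <= 63);
    int64_t max = ((int64_t)1 << (bits - 1)) - 1;
    int64_t min = -max - 1;
    return value >= min && value <= max;
}

/* Accumulates x[k] * c[k] in a wide 64-bit variable and checks each
 * partial sum against a hypothetical hardware accumulator that is
 * acc_bits wide. Returns the index of the first term whose partial
 * sum overflows, or -1 if the whole sum fits. */
long first_overflow(const int16_t *x, const int16_t *c,
                    size_t n, unsigned acc_bits)
{
    int64_t acc = 0;
    for (size_t k = 0; k < n; k++) {
        acc += (int64_t)x[k] * c[k];
        if (!fits_signed(acc, acc_bits)) {
            return (long)k;
        }
    }
    return -1;
}
```

Running such a check over representative test vectors shows whether a chosen accumulator width is safe before any hardware is built.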

Once PICO Express generates the RTL driver code, it can be compiled into executable code. The tool then produces the fully verified hardware accelerator, which is built on what the company calls the Pipeline of Processor Arrays (PPA) architecture (see figure). Included in this architecture is the bus interface between the PPA and the primary processor core.

The interface between the accelerator and the core is critical because bus bandwidth is a very limited resource. PICO Express analyzes the accelerator's data accesses over the bus and determines how best to cache data so that the same values are not repeatedly loaded and stored. This effort can significantly reduce traffic to the host processor. Furthermore, the tool creates a streaming interface pipeline, which allows processed data to be passed between several accelerators without ever going back to the system bus or processor.
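The payoff from holding data locally can be seen in a small model. The sketch below counts simulated bus reads for a filter that fetches every operand from memory, then for one that keeps a sliding window of samples in a local buffer. It is a simplified illustration, not the caching scheme PICO Express generates; bus_model, bus_read, fir_naive, fir_buffered, and MAX_TAPS are all hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_TAPS 64  /* assumed upper bound on the local window size */

/* Toy bus model: memory plus a counter of reads issued over the bus. */
typedef struct {
    const int16_t *mem;
    size_t reads;
} bus_model;

static int16_t bus_read(bus_model *bus, size_t addr)
{
    bus->reads++;
    return bus->mem[addr];
}

/* Fetches every operand over the bus: nout * ntaps reads. */
void fir_naive(bus_model *bus, const int16_t *coeff,
               int32_t *out, size_t nout, size_t ntaps)
{
    for (size_t n = 0; n < nout; n++) {
        int32_t acc = 0;
        for (size_t k = 0; k < ntaps; k++) {
            acc += (int32_t)bus_read(bus, n + k) * coeff[k];
        }
        out[n] = acc;
    }
}

/* Keeps the last ntaps samples in a local window, so each sample
 * crosses the bus once: nout + ntaps - 1 reads. */
void fir_buffered(bus_model *bus, const int16_t *coeff,
                  int32_t *out, size_t nout, size_t ntaps)
{
    int16_t window[MAX_TAPS];
    assert(ntaps >= 1 && ntaps <= MAX_TAPS);
    if (nout == 0) {
        return;
    }
    /* Preload all but the newest sample of the first window. */
    for (size_t k = 0; k + 1 < ntaps; k++) {
        window[k] = bus_read(bus, k);
    }
    for (size_t n = 0; n < nout; n++) {
        window[ntaps - 1] = bus_read(bus, n + ntaps - 1);
        int32_t acc = 0;
        for (size_t k = 0; k < ntaps; k++) {
            acc += (int32_t)window[k] * coeff[k];
        }
        out[n] = acc;
        /* Slide the window by one sample for the next output. */
        for (size_t k = 0; k + 1 < ntaps; k++) {
            window[k] = window[k + 1];
        }
    }
}
```

Both versions compute the same outputs; only the number of bus reads differs, and the gap widens as the number of taps grows.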

Gate counts for the resulting hardware accelerator depend greatly upon the function of the original algorithmic block. Gate-count values can range from 35 to 1000 kgates. Typically, a series of accelerators will be needed for more demanding designs, such as a CDMA modem.

PICO Express is available immediately. It is priced at $125,000 for a design-project license. The first customer silicon using PICO Express is expected this month.

Synfora, Inc.
2465 Latham St., Suite 100, Mountain View, CA 94040; (650) 314-0500, FAX: (650) 314-0501, www.synfora.com.
