Coprocessor Synthesis Offloads Software Tasks
For experienced wireless designers, the migration path from software to hardware is a familiar one. Newer technologies are typically implemented first in software for the sake of development efficiency and expediency. Examples include Java, multimedia (MPEG-4), and 3G base-station baseband processing. Implementing such technologies in software first allows them to be proven on existing platforms and introduced into the market as quickly as possible. Once a technology is shown to be commercially viable, the software implementation can be moved to a hardware platform for increased performance and lower power consumption. Usually, such a platform takes the form of an FPGA, a coprocessor (accelerator chip), or a programmable DSP. Eventually, most implementations end up as customized ASICs.
A new product is now entering this traditional design flow. Called Cascade, it comes from Scottish startup CriticalBlue. The company is taking a different approach by offering a solution between the two extremes of a software-only implementation and customized hardware. Through a process dubbed "coprocessor synthesis," the Cascade toolset uses the existing application software to synthesize a new hardware coprocessor. That coprocessor accelerates specific, selectable software tasks.
This approach represents a departure from today's synthesis processes. Generally, they start from a high-level description of the hardware. Instead, Cascade starts from the compiled object code of the software application.
By analyzing application software on the host microprocessor, Cascade allows the designer to automatically synthesize a hardware coprocessor. The key to this method is that the optimized coprocessor architecture and its associated microcode are derived directly from the application software. The selected software tasks can then be offloaded from the host processor to the coprocessor. This offload frees up execution time on the host and boosts system performance.
This approach offers a number of benefits. Eliminating the need to design new hardware by hand, for example, can cut months and even years from the total design effort. It also allows more of the design to remain in software, where changes and modifications are easier to make. This capability is particularly important for derivative product designs, in which significant amounts of embedded code already exist.
The Cascade design flow is non-intrusive in that it complements standard development approaches. To realize this benefit, the tool suite starts with compiled object code that has been produced by any standard embedded-software tool. It outputs cycle-accurate C-models plus synthesized HDL (see figure).
Essentially, a designer inputs application code into the Cascade tools along with performance estimates that were obtained from a third-party profiling tool. The tool suite analyzes these inputs before outputting a customized coprocessor design. Initially, this design is output as a model.
The designer has full control over which applications are to be offloaded from the host processor. Using the coprocessor model, designers can figure out which tasks would be best offloaded.
Tradeoff analysis can be performed rapidly using the cycle-accurate C-models. Once the tasks to be offloaded are identified, Cascade synthesizes the appropriate coprocessor architecture in RTL form, as either Verilog or VHDL. The resulting coprocessors typically consume between 20,000 and 100,000 gates.
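To make the tradeoff analysis concrete, the following is a minimal sketch of the kind of arithmetic a designer might do with profiling data and model estimates. It is not part of Cascade and does not reflect its interface. The task names, the cycle counts, and the helpers `system_speedup` and `rank_candidates` are all hypothetical. The sketch also assumes, conservatively, that the host waits while the coprocessor runs, so it ignores any gain from running the two in parallel.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    host_cycles: int    # cycles measured by a profiler on the host (made-up values)
    coproc_cycles: int  # cycles estimated by a coprocessor model (made-up values)


def system_speedup(tasks, offloaded):
    """Overall speedup when the tasks named in `offloaded` move to the coprocessor.

    Assumes strictly serial execution: the host idles during coprocessor work.
    """
    before = sum(t.host_cycles for t in tasks)
    after = sum(
        t.coproc_cycles if t.name in offloaded else t.host_cycles
        for t in tasks
    )
    return before / after


def rank_candidates(tasks):
    """Order tasks by cycles saved if offloaded, largest saving first."""
    return sorted(tasks, key=lambda t: t.host_cycles - t.coproc_cycles, reverse=True)


# Hypothetical profile of a small baseband application.
tasks = [
    Task("fir_filter", 600_000, 100_000),
    Task("viterbi", 300_000, 150_000),
    Task("control", 100_000, 100_000),
]

if __name__ == "__main__":
    for t in rank_candidates(tasks):
        print(t.name, t.host_cycles - t.coproc_cycles)
    print(system_speedup(tasks, {"fir_filter"}))
```

With these invented numbers, offloading only `fir_filter` halves the total cycle count, while `control` gains nothing and would stay on the host. Real figures would come from the profiler and the cycle-accurate models.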
The Cascade tool suite is currently in beta testing. The production version is scheduled for release in the first quarter of next year. Pricing will follow a project-based, annual-subscription model, with fees ranging from $35,000 to $200,000.
CriticalBlue The Scottish Microelectronics Centre, W. Mains Rd., Edinburgh, EH9 3JF, U.K.; +44 (0)131 650 7459, www.criticalblue.com.