Design work can be approached in a number of different ways. But when it comes to standards-based designs for systems centering on wireless connectivity or multimedia, it makes more sense than ever to consider starting at least some parts of the process from the algorithmic level.
With most communications protocols embodied largely in algorithmic functions, this is the ideal place from which to begin defining system implementation and partitioning. Indeed, EDA vendors who traffic in high-level synthesis and verification tools agree.
“The extreme high-level design that we see involves algorithmic tools from vendors like the MathWorks,” says Brett Cline, director of marketing at Forte Design Systems. “If you think about the key algorithm for your next HDTV design, the algorithm can process the data, but you have to decide on the format for implementation. These decisions can pertain to hardware or software or both.”
According to Cline, that’s a major reason for embarking on algorithmic-level work. Designers have found that doing this kind of partitioning and exploration of implementation options helps them demonstrate that their algorithms will process data properly.
According to Ken Karnofsky, marketing director for signal processing and communications at the MathWorks, algorithmic-level design is seeing an upsurge in interest in areas beyond communications and multimedia. For one, DSP designers are algorithm-focused and use high-level tools to implement their computationally intensive algorithms in hardware or software so they perform quickly and efficiently.
But algorithmic design also sees considerable use in the design of feedback-control systems, according to Karnofsky. “In feedback control, where it’s more electromechanical, as well as in automotive applications, algorithmic-level tools are used to model the environment in which the algorithm is used,” he says.
For the control system world, the MathWorks’ tools are extensively used in the design and implementation of control algorithms employed in microprocessors. They’re also used to model the dynamics of the system in which that processor is embedded.
“In an engine control system, our Simulink environment can be used to model how the sensors and systems react to that algorithm. It starts with more simulations and linking to requirements that they may have. This is especially the case in safety-critical applications, whether transportation or medical,” says Karnofsky.
The notion of using algorithmic tools to model the surrounding parts of the system is interesting, because it marries disciplines that are otherwise typically disparate (Fig. 1). “Whether it’s a system-on-a-chip (SoC) design or a board-level system, there is almost always a mixture of analog and digital, software and hardware,” says Karnofsky.
Those broad segments of any given design are typically tackled in silos, with different tool chains and disciplines. “There are issues with design flaws or missed opportunities because those tools aren’t brought together,” says Karnofsky. “Thus, there is a lot of interest in taking those algorithms done in Matlab or Simulink to model the surrounding parts of the system.”
Consider the design of communications basestations in which power amplifiers are a large part of the cost. To hold down those costs, designers might be tempted to try lower-cost amplifiers. The tradeoff is that the characteristics of a lower-cost amplifier might not be as linear as those of a more costly one.
As a result, the workaround is to compensate with DSP filtering techniques. Creating the algorithm for that filtering requires a highly accurate model of the amplifier’s RF characteristics. So here’s a case where algorithmic-level modeling can be combined with RF/analog tools to iron out a sticky design issue.
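To make the idea concrete, here is a minimal, purely illustrative sketch of that predistortion workaround: a memoryless amplifier model with a cubic compression term, and a polynomial predistorter that pre-applies the approximate inverse. The function names and coefficients are our own assumptions, not drawn from any real amplifier model.

```cpp
#include <cmath>
#include <cassert>

// Hypothetical memoryless amplifier model: mild cubic gain compression.
// The 0.10 coefficient is illustrative, not from any real device.
double amplifier(double x) {
    return x - 0.10 * x * x * x;   // compression grows with amplitude
}

// Polynomial predistorter: pre-applies the approximate inverse of the
// compression so that amplifier(predistort(x)) tracks x more closely.
double predistort(double x) {
    return x + 0.10 * x * x * x;   // first-order inverse of the cubic term
}

// Linearity error of the bare chain vs. the predistorted chain at input x.
double bare_error(double x)      { return std::fabs(amplifier(x) - x); }
double corrected_error(double x) { return std::fabs(amplifier(predistort(x)) - x); }
```

In a real flow, the amplifier model would come from RF/analog characterization rather than a one-line polynomial, which is exactly why accurate cross-domain models matter here.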
“In the transportation world, real-world testing is very expensive. Plus, you actually have to build at least one of the systems under test,” Karnofsky points out. It can thus be very difficult to pin down design flaws related to the design specifications without very early simulation work such as that afforded by algorithmic flows.
The same is true with communication designs, where the modeling of interaction between analog and digital components at the register transfer level (RTL) can make finding timing issues very complex. Earlier simulation can help weed out the root causes of those timing issues.
Yet another use case for algorithmic design that crosses disciplinary boundaries is in audio systems, such as algorithms for surround-sound systems. Designers may need to conduct listening tests for proof of concept of their algorithm. Furthermore, they may want to do a series of such tests in a single day. Doing so at the algorithmic level will prove a lot more feasible than if the functionality were expressed at a lower level, such as in C code.
Taking On Baseband Design
Algorithmic modeling can play an important role not only in the RF realm of communications-system design, but also in the baseband arena. In fact, Agilent EEsof is readying shipment of its SystemVue 2008, a tool aimed squarely at architectural exploration for baseband implementation.
“We call it an ESL (electronic system level) platform because, taken at face value, the platform fits within that description,” says Frank Ditore, Agilent EEsof’s product marketing manager. “But it differs from other ESL platforms in that we’re not trying to do high-level synthesis from a high-level description to RTL or C. Rather, it’s a more refined implementation flow for detailed algorithm modeling that marries RF and baseband together.”
SystemVue 2008 is intended for abstract modeling, both textual- and block-diagram-based. It will enable native creation of models in C or Matlab code within a textual debug interface. Those models can then be linked together in a schematic format for simulation of the entire system. Floating-point descriptions of the blocks are refined into fixed-point representations, and RTL is generated.
“We’re not calling this a synthesis product, although our future roadmap has us looking at synthesis technologies,” says Ditore. “With SystemVue 2008, a user describes, for example, an FIR (finite impulse response) filter and it generates RTL for that filter. You can specify a clock rate for the filter. The RTL understands that there’s a clock associated and it generates cycle-accurate RTL. The simulations that you run on generic fixed-point models that we provide versus the RTL are identical. We verify the RTL models against Mentor Graphics’ ModelSim.”
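The floating-point-to-fixed-point refinement described here can be sketched in a few lines. The following is an illustrative example of ours, not SystemVue's generated code: a reference FIR filter in doubles alongside a Q15 fixed-point version whose outputs should track the reference closely.

```cpp
#include <vector>
#include <cstdint>
#include <cmath>
#include <cassert>

// Floating-point reference FIR: y[n] = sum_k h[k] * x[n-k].
std::vector<double> fir_float(const std::vector<double>& h,
                              const std::vector<double>& x) {
    std::vector<double> y(x.size(), 0.0);
    for (size_t n = 0; n < x.size(); ++n)
        for (size_t k = 0; k < h.size() && k <= n; ++k)
            y[n] += h[k] * x[n - k];
    return y;
}

// Q15 fixed-point refinement of the same filter: coefficients and samples
// are scaled by 2^15; the accumulator is widened to 64 bits to avoid
// overflow, then scaled back to double for comparison.
std::vector<double> fir_q15(const std::vector<double>& h,
                            const std::vector<double>& x) {
    auto q15 = [](double v) { return static_cast<int32_t>(std::lround(v * 32768.0)); };
    std::vector<double> y(x.size(), 0.0);
    for (size_t n = 0; n < x.size(); ++n) {
        int64_t acc = 0;
        for (size_t k = 0; k < h.size() && k <= n; ++k)
            acc += static_cast<int64_t>(q15(h[k])) * q15(x[n - k]);
        y[n] = static_cast<double>(acc) / (32768.0 * 32768.0);
    }
    return y;
}
```

Comparing the two models over the same stimulus is the essence of the bit-true verification step, whether the refined model is C, fixed-point blocks, or generated RTL.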
The tool marries RF and baseband, even though its refined RTL-generation capability is more of a baseband flow. Part of Agilent EEsof’s strategy as it rolls out SystemVue 2008 is to offer support to designers working on commercial wireless technologies and employing emerging broadband standards. Thus, Agilent will offer a collection of add-on libraries in the form of verification and exploration models.
These are simulation models of the LTE and WiMAX physical layers that designers can use for vector comparison. The exploration libraries build on top of the verification libraries by offering Matlab source code for physical-layer (PHY) building blocks.
“If you need to get into LTE (Long-Term Evolution) chip design without competency in those standards, these libraries give you an algorithmic head start,” says Ditore.
The models run natively within the environment and with Agilent EEsof’s model polymorphic interface. Thus, the designer can move seamlessly between simulation-only models, C models, and other formats. All model types for a given PHY, regardless of abstraction, can be associated with a single top-level symbol.
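In ordinary C++ terms, a polymorphic model interface behaves roughly like an abstract base class: the testbench binds to a single top-level "symbol," and implementations at different abstraction levels can be swapped in behind it. The sketch below is an analogy of ours, not Agilent's actual mechanism; all names are hypothetical.

```cpp
#include <cmath>
#include <cassert>

// One abstract block "symbol" that the surrounding testbench sees.
struct GainBlock {
    virtual double process(double x) = 0;
    virtual ~GainBlock() = default;
};

// Simulation-only floating-point model of a 2x gain stage.
struct FloatGain : GainBlock {
    double process(double x) override { return 2.0 * x; }
};

// Bit-accurate refinement: the same gain applied in Q15 arithmetic.
struct FixedGain : GainBlock {
    double process(double x) override {
        long xi = std::lround(x * 32768.0);          // quantize to Q15
        return static_cast<double>(2 * xi) / 32768.0; // gain, back to double
    }
};

// The testbench is written once against the symbol; the abstraction
// behind it can change without touching this code.
double run(GainBlock& block, double x) { return block.process(x); }
```

The point of the analogy: moving "seamlessly between simulation-only models, C models, and other formats" amounts to substituting implementations behind a stable interface.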
SystemVue 2008 allows for quick exploration of various baseband topologies and models, weighing the pros and cons of each approach before the design team commits to a hardware implementation path. It facilitates this by adeptly handling Matlab code, C models, and RTL all within a single graphical environment.
The Refinement Process
After beginning at the algorithmic level in Matlab code, designers will typically create a C++ or SystemC model for simulation. Those simulation models must be continually refined during the verification process to meet the quality-of-results (QoR) requirements of their synthesis tool.
“For us, a key application is the verification of this refined C or SystemC model versus the original C model they created,” says Tom Sandoval, CEO of Calypto Design Systems.
After a tentative start over the past few years, designers are now becoming more sophisticated in their use of high-level synthesis (HLS) tools, says Sandoval. “These tools now have specific data types and interfaces. It’s not just throwing your C code at the synthesis tool anymore. We’re now starting to connect blocks together and getting to an SoC description at the higher level. The question is: What do we need to do to refine that C model to get the best QoR?”
The refinement is still being done at block level as opposed to a full-chip refinement. However, it’s now possible to start building up the design hierarchically. Indeed, HLS tools are at the back end of what is becoming a fully realized ESL tool chain (Fig. 2). Such flows can go beyond verification of the algorithm itself, building up system-level models to verify the interfaces between blocks.
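A crude stand-in for that verification step is sweeping the original and refined models over the same stimulus and bounding the mismatch. The example below is illustrative only, with names and constants of our own; tools like Calypto's perform formal sequential equivalence checking rather than simulation sweeps.

```cpp
#include <cmath>
#include <cassert>

// Original untimed algorithm: scale by 1.5 and saturate, in floating point.
double orig_model(double x) {
    double y = 1.5 * x;
    return std::fmin(std::fmax(y, -1.0), 1.0);
}

// Refined model after HLS-oriented rework: the same function in
// saturating Q15 integer arithmetic (an illustrative refinement).
double refined_model(double x) {
    long xi = std::lround(x * 32768.0);
    long yi = (xi * 3) / 2;            // 1.5x in integer math
    if (yi >  32767) yi =  32767;      // saturate to the Q15 range
    if (yi < -32768) yi = -32768;
    return static_cast<double>(yi) / 32768.0;
}

// Minimal equivalence sweep: compare both models across the input range
// and report the worst-case mismatch.
double max_mismatch() {
    double worst = 0.0;
    for (double x = -1.0; x <= 1.0; x += 1.0 / 256.0)
        worst = std::fmax(worst, std::fabs(orig_model(x) - refined_model(x)));
    return worst;
}
```

If the refinement introduces a mismatch larger than the quantization budget, the refined model, not the original, is what needs fixing before synthesis.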
HLS Gaining Favor
As for HLS itself, the process of synthesizing RTL from higher-level descriptions of various flavors, primarily C/C++ and/or SystemC, is maturing quickly. Early use of HLS methodologies involved small functional blocks of 50,000 gates or less. More recently, designers have applied HLS to blocks with hundreds of thousands of gates, and at a broader spectrum of systems houses as the technology continues to proliferate.
The growth in HLS adoption is borne out by a recent survey, conducted by Mentor Graphics, of more than 800 EDA users. According to Shawn McCloud, Mentor’s HLS product line director, adoption is well ahead in Japan compared with the rest of the world, with 23% of Japanese respondents already using HLS versus 14% elsewhere. However, 23% of non-Japanese tool users say they plan to adopt HLS in 2009.
According to data from Gary Smith EDA, Mentor’s Catapult C, which synthesizes optimized RTL from ANSI C++, enjoys market-share leadership in algorithmic synthesis. With a growing infrastructure behind it, Catapult C is seeing continual upgrades and improvements. In the 2007b release, Mentor added closed-loop power analysis and optimization capabilities.
By leveraging the original C++ testbench that accompanies the design description, Catapult C uses the design’s switching information to generate highly accurate power estimations at RTL or gate level with any power-estimation tool, such as those from Atrenta, Sequence Design, or Synopsys. Through microarchitecture optimizations, the tool can deliver power savings of up to 30%, the company claims.
With the 2008 release of Catapult C, Mentor raised the bar, adding a number of compelling features. For one, distributed pipeline control can now be implemented, enabling users to configure each block in their designs as independent streaming engines while retaining classic handshaking between blocks (Fig. 3).
Previous versions of the tool implemented pipelining through a top-level pipelining controller, which mandated flushing the pipeline in the absence of data on the input. Distributed control eliminates this requirement, allowing for autonomously pipelined blocks with extremely high performance.
Catapult C reads in pure ANSI C++. But when it comes to building virtual platforms for verification, most designers prefer to work with SystemC. SystemC is the only language that can serve as a wrapper for ANSI C++ even as it facilitates the specification of block-to-block communications and explicit concurrency. It also lets designers specify communication interfaces.
To enable generation of SystemC models, Mentor integrated Catapult C with its Vista Model Builder, which leverages the Open SystemC Initiative’s recently ratified TLM 2.0 standard. Using the ANSI C++ algorithm, the Vista Model Builder creates a compartmentalized SystemC model and automatically generates the model’s transaction-level interface. The resulting model can later be annotated with timing information and, eventually, power-consumption data. Models can be created at a coarse level of detail or at finer levels at the expense of greater simulation time.
Synfora improved its PICO Extreme and PICO Extreme FPGA algorithmic synthesis tools to achieve higher performance and smaller area than the previous generation. PICO Extreme, an optimizing compiler that transforms a sequential, untimed C algorithm into highly efficient RTL, was enhanced so designers can create and analyze hardware designs more effectively. Moreover, the new releases include QoR improvements in terms of area, throughput, timing, and timing correlation, as well as improved user feedback.
In the latest version of PICO Extreme, advances in scheduling algorithms enable the compiler to optimize registers in a design. Benchmarks reveal area improvements of 5% to 20%. Sophisticated analysis of variable loop bounds, combined with a new approach to handling early exits from loops, provides performance boosts in the range of 10% to 30% on complex designs.
Achieving high productivity on complex designs requires synthesis tools to provide sophisticated analysis, feedback, and debugging capabilities so that users can better understand the performance and area bottlenecks in the design. The enhancements to PICO Extreme 08.03, including those to the reporting and feedback capabilities, improve the ability to analyze throughput bottlenecks, provide greater visualization and reporting of the hardware cost, and allow automatic detection and feedback on potential deadlock scenarios.
Creating Virtual Platforms
Moving from the algorithmic phase into a virtual platform for simulation often involves hardware acceleration. Vendors such as EVE have struck up partnerships with numerous ESL companies to improve their handling of models at various levels of abstraction.
“For one, Sony asked us to work with them. We ended up developing platforms where we would take charge of simulating the RTL blocks, while other blocks would be simulated in their environment with their tools,” says Lauro Rizzati, general manager of EVE-USA.
EVE’s ZeBu FPGA-based emulation systems require synthesizable RTL, which is compiled and mapped onto the emulators’ multiple FPGAs. But EVE does offer a means of interfacing higher-level models through transactors based on the SCE-MI standard.
Called ZEMI-3, the transaction-level modeling methodology lets designers quickly create custom transactors for protocols that don’t fall neatly within the bounds of standard types. In doing so, ZEMI-3 technology raises the level of abstraction for hardware debugging. Custom transactors compiled with ZEMI-3 work with various transaction-level environments, such as those from CoWare and Synopsys.
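At its simplest, a transactor translates one untimed transaction into the cycle-by-cycle signal activity the emulated design sees, and reassembles responses going the other way. This toy byte-to-bitstream pair is illustrative only, with names of our own invention; real SCE-MI transactors are far more involved.

```cpp
#include <vector>
#include <cstdint>
#include <cassert>

// Driver side of a toy transactor: turn one high-level transaction
// (a byte write) into the serial bit stream the design would see,
// one bit per emulation clock cycle, MSB first.
std::vector<int> to_bits(uint8_t byte) {
    std::vector<int> bits;
    for (int i = 7; i >= 0; --i)
        bits.push_back((byte >> i) & 1);
    return bits;
}

// Monitor side: reassemble the observed bit stream back into a
// transaction for the high-level testbench.
uint8_t from_bits(const std::vector<int>& bits) {
    uint8_t byte = 0;
    for (int b : bits)
        byte = static_cast<uint8_t>((byte << 1) | (b & 1));
    return byte;
}
```

Raising abstraction this way is what lets the testbench exchange whole transactions with the emulator instead of wiggling individual signals, which is where the debugging and performance leverage comes from.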
EVE developed the ZEMI-3 technology from the compilation technology it acquired with Tharas Systems. “We optimized their behavioral synthesis for developing transactors,” says Rizzati. “It’s no longer general behavioral synthesis but targets transactors only.” Meanwhile, EVE continues to develop and offer off-the-shelf transactors, such as an AXI Master/Slave transactor and a PCIe Gen 2.0 16x transactor.