Hardware/Software Co-Design Comes Of Age

There once was a time when system design was completely serial. Entire hardware platforms were designed, prototyped, debugged, and virtually completed before any software development began. Of course, such methodologies corresponded to the days of much broader market windows. The very idea of such a quaint approach is enough to make one snicker.

Today, it’s quite different. Those market windows have narrowed to a sliver. Hardware development typically lags far behind software, but no one can afford to wait for hardware prototypes to begin shaking out the system’s drivers, operating system, and bus protocols. It’s become imperative for the software-development process to begin as early as possible so the software and hardware can be verified together.

But how is this to be done when silicon is essentially unavailable, or at best difficult to gain access to? Additionally, in this early stage of a system design cycle, the final specifications are a moving target. There’s the problem of setting up a testbench for the device. Often, the information on which you’ll base debug and performance analysis is incomplete. And on top of all of that, the hardware platform itself includes heterogeneous multicores with complex interconnect, memory hierarchy, and multiple dependent software stacks.

Fortunately for design teams, the means by which software can be effectively verified for a substantially non-existent hardware platform have come a long way. To be sure, one part of the solution has been an increased reliance on so-called platform systems-on-a-chip (SoCs), a genericized hardware architecture that draws from an established, known-good IP portfolio. Such methodologies can help stack the deck in favor of the design team when it comes to software development.

But for those intrepid souls looking to build on a customized hardware platform, hardware/software co-design generally entails assembling a highly abstract model of the hardware platform. There are a number of approaches to this methodology. Additionally, emulation technology has improved substantially in recent years, making it easier to run more clock cycles’ worth of simulation with slightly more detailed models. And, standardization activity of late has helped the industry get on the same page with its modeling.

HIGH-LEVEL DECISIONS An increasingly popular way to attack the co-design problem is to start at as high a level of abstraction as possible. In this way, issues of hardware/software partitioning and which implementation vehicle is best for the algorithms you’re starting from can be handled concurrently.

The MathWorks’ concept of model-based design takes just this sort of approach, providing a means to automatically generate and verify production code for embedded processors. To that end, the company’s family of products comprises a system-level flow that begins with design, simulation, and validation of system models in Matlab and Simulink (Fig. 1).

With the Real-Time Workshop and Real-Time Workshop Embedded Coder, design teams can automatically generate production code for embedded processors. These tools provide the C-code generation foundation, and the targets provide target-specific code-generation extensions.

Meanwhile, the MathWorks’ Link products, such as Link for Code Composer Studio or Link for Tasking, enable engineers to test, debug, and verify the embedded software directly against the original executable specification. On the hardware side, using Link for ModelSim, engineers can verify the HDL code to be implemented on an FPGA or as an ASIC against the original executable specification.

MODEL MAKING Hardware/software co-design has become one of the primary applications of electronic system-level tools and methodologies. Several EDA companies have developed fairly sophisticated flows for creating platform models for a given architecture.

Carbon Design Systems provides a catalog of tools for generating cycle-accurate models directly from the “golden” RTL representation of a design. This RTL can be written in Verilog, in VHDL, or in a mixed-language style. So-called “Carbonized” models are optimized for high performance, running significantly faster in simulation than the RTL itself.

For model validation, Carbon’s flow allows users to verify individual hardware models by debugging them against the original HDL testbench. Models created with Carbon’s Model Studio can be easily integrated into existing reference test suites to verify the cycle accuracy of interfaces. All of the industry-standard simulators can be used, including Mentor’s Questa, Synopsys’ VCS, and Cadence’s NC-Sim.
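To make the idea of verifying cycle accuracy concrete, here is a minimal Python sketch that compares two per-cycle signal traces and reports the first divergence. It is not Carbon's tooling or API. The function name `first_mismatch`, the trace format (one dictionary of signal values per clock cycle), and the sample signals are all assumptions made for illustration.

```python
def first_mismatch(reference, candidate):
    """Return (cycle, signal, expected, actual) for the first divergence
    between two per-cycle traces, or None if the traces agree."""
    for cycle, (ref, cand) in enumerate(zip(reference, candidate)):
        for signal in sorted(ref):
            if cand.get(signal) != ref[signal]:
                return (cycle, signal, ref[signal], cand.get(signal))
    if len(reference) != len(candidate):
        # One trace ended early; report the cycle where they stop overlapping.
        shorter = min(len(reference), len(candidate))
        return (shorter, "<trace length>", len(reference), len(candidate))
    return None


# Hypothetical traces: one from the RTL testbench, one from the generated model.
rtl_trace = [
    {"valid": 0, "data": 0},
    {"valid": 1, "data": 7},
    {"valid": 1, "data": 9},
]
model_trace = [
    {"valid": 0, "data": 0},
    {"valid": 1, "data": 7},
    {"valid": 1, "data": 8},  # diverges from the RTL at cycle 2
]

result = first_mismatch(rtl_trace, model_trace)
print(result)
```

A model that is truly cycle-accurate at its interface would produce `None` here for every test in the reference suite.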

When it’s time to assemble individual models into a platform, Carbon’s models lend themselves to integration with system-level integration environments such as CoWare’s Platform Architect and others. They’re also easily integrated with legacy RTL IP at cycle level. At this point in Carbon’s flow, users can replace early system models with accurate cycle models compiled from the actual RTL.

After assembling the platform, Carbonized models allow analysis and exploration of the hardware architecture itself. They permit the use of actual RTL to drive the architectural analysis and the use of existing system-modeling environments to validate assumptions regarding the architecture. Users can explore the performance parameters of the design implementation and verify that tradeoffs made using high-level system models were worthwhile.

Finally, the flow addresses firmware validation by enabling software debug before silicon is available. Firmware validation in the Carbon flow takes advantage of the accuracy of the Carbonized cycle-level models. It also makes for virtual platforms that are easily deployable to multiple design teams as well as to third-party software developers.

VIRTUAL PLATFORMS Creating a virtual platform from hardware models addresses several software-development issues. A virtual system gives designers much more control over the system compared to a potentially unstable first-pass silicon prototype. Thus, they operate within a far more deterministic scenario. Plus, they gain the ability to tinker with the platform at will, adding or subtracting functional blocks and/or changing their speeds to determine the effect on the architecture and system performance.

Virtual hardware offers good visibility in terms of memory, processor registers, and device states. When you synchronize the processors, you can synchronize everything at once. Virtualization also affords much more control over system execution. When debugging requires a global system stop, all processors stop simultaneously with no “skid” effect. When one processor is stepped through instructions, others can be made to sit and wait. Cores can be slowed or stopped entirely, communication latencies can be increased, and timing disturbances from breakpoints disappear.
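The kind of control described above can be sketched in a few lines of Python. This is a toy model, not any vendor's simulator: the `Core` and `VirtualPlatform` classes, their methods, and the breakpoint scheme are hypothetical names invented for illustration. Because one scheduler advances every core, a global stop leaves all cores at a well-defined point with no skid, and one core can be stepped while the others wait.

```python
from dataclasses import dataclass, field


@dataclass
class Core:
    """A toy processor model: each step executes one 'instruction'."""
    name: str
    pc: int = 0

    def step(self) -> None:
        self.pc += 1


@dataclass
class VirtualPlatform:
    """Hypothetical scheduler that advances all cores in lockstep."""
    cores: list
    breakpoints: dict = field(default_factory=dict)  # core name -> pc

    def run(self, max_steps: int) -> int:
        """Advance every core one step at a time and stop all of them
        together as soon as any core reaches its breakpoint.
        Returns the number of lockstep rounds executed."""
        for executed in range(max_steps):
            if any(self.breakpoints.get(c.name) == c.pc for c in self.cores):
                return executed
            for core in self.cores:
                core.step()
        return max_steps

    def step_only(self, name: str) -> None:
        """Single-step one core while the others sit and wait."""
        for core in self.cores:
            if core.name == name:
                core.step()


platform = VirtualPlatform(
    cores=[Core("cpu0"), Core("dsp0")],
    breakpoints={"cpu0": 5},
)
steps = platform.run(max_steps=100)
pcs_at_stop = {c.name: c.pc for c in platform.cores}

platform.step_only("cpu0")
pcs_after_step = {c.name: c.pc for c in platform.cores}

print(steps, pcs_at_stop, pcs_after_step)
```

Running the same script again yields the same stop point every time, which is the deterministic behavior that real silicon with independent clocks cannot guarantee.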

One option for platform creation comes from CoWare in the form of what the company calls its ESL 2.0 toolset. With CoWare’s tools, SoC designers can debug and benchmark the platform-level performance of their IP and subsystem RTL at a cycle-accurate level of abstraction (Fig. 2).

CoWare claims a 30% to 50% cycle time reduction through the use of virtual platform technology. The technology accelerates the edit-compile-debug cycle. A virtual platform is deterministic and provides full visibility and controllability of the entire platform, including the processor, buses, peripherals, and environment.

Yet another advantage of virtual platforms for co-design purposes is the removal of dependency on accessibility of the hardware. Just like a software package, the virtual platform is accessible—worldwide—in a matter of minutes, and thousands of units will have identical deterministic behavior.

Virtual platforms also help software developers sidestep rework that comes as a consequence of an evolving hardware specification. CoWare’s virtual platforms can be created with the same technology used by hardware architects and development teams. As a result, as changes are made, they can be immediately provided to the software developers.

BUILDING AN INFRASTRUCTURE The virtual-platform route has its advantages, but there are also barriers to success. Building a virtual platform can be a laborious process that must be undertaken in parallel with the design process itself. Then there are the issues with interoperability of hardware models among various commercial flows.

Imperas, a startup in the virtual platform arena, made a splash earlier this year with a major technology donation that carries the promise of an open-source infrastructure for virtual platforms. Finding a lack of a broad debugging infrastructure, the company made three technology components freely available through its Open Virtual Platforms (OVP) Web site at www.ovpworld.org as well as at SourceForge.

The first component comprises C-language modeling application-programming interfaces (APIs) for processor, peripheral, and platform modeling that let designers build a platform-verification infrastructure as well as create behavioral and processor models. The second element is an open-source library of models written to the APIs. The models can be obtained as either pre-compiled object code or as source-code files. At the outset, the library comprised processor models of ARM, MIPS, and OpenRISC OR1K devices, with others to follow.

At last month’s 45th Design Automation Conference, Imperas announced a partnership with Tensilica that would allow fast functional, instruction-accurate models of Tensilica’s Xtensa and Diamond Standard processors to run on OVP-based virtual platforms. Specifically, wrapper files enabling integration of the Tensilica processor models are now available for free download at www.ovpworld.org. These models will run with Tensilica’s TurboXim fast functional simulator, which simulates at speeds 40 to 80 times faster than a traditional instruction-set simulator.

Last but not least of the three components is a free OVP reference simulator that runs processor models at up to 500 MIPS. Known as OVPsim, the simulator comes with a GDB interface for the designer’s debugger of choice. OVPsim can be called from within other simulators through a C/C++/SystemC wrapper. It also can encapsulate existing instruction-set simulator (ISS) processor models (Fig. 3).

HARDWARE GETS INTO THE ACT In some situations, hardware can be brought to bear on hardware/software co-design in the form of emulation. An emulation-based approach can greatly aid examination of hardware-software interactions because it runs many times faster than simulation, supports hardware monitors and traces, and lets engineers zoom in on problems.

Further, emulation systems are readily linked into virtual platforms, facilitating full system verification. For example, EVE’s ZeBu series of emulation systems can be used in an ESL emulation scenario when a tool such as CoWare’s Platform Architect is connected to a ZeBu system by means of a bus transactor (Fig. 4).
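The role of a bus transactor can be illustrated with a short Python sketch: it takes an untimed, transaction-level request from the software side and expands it into the cycle-by-cycle pin activity the hardware side expects. This is not the SCE-MI API or EVE's implementation. The `WriteTransaction` class, the `to_pin_cycles` function, and the four-signal bus protocol are invented here purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WriteTransaction:
    """One untimed, transaction-level burst write."""
    address: int
    words: tuple


def to_pin_cycles(txn: WriteTransaction, word_bytes: int = 4) -> list:
    """Expand a transaction into per-cycle pin values for a made-up bus:
    one data beat per word, followed by a single idle cycle."""
    cycles = []
    for beat, word in enumerate(txn.words):
        cycles.append({
            "sel": 1,
            "write": 1,
            "addr": txn.address + beat * word_bytes,
            "wdata": word,
        })
    cycles.append({"sel": 0, "write": 0, "addr": 0, "wdata": 0})  # bus idle
    return cycles


burst = WriteTransaction(address=0x1000, words=(0xAA, 0xBB))
pin_cycles = to_pin_cycles(burst)
for number, pins in enumerate(pin_cycles):
    print(number, pins)
```

The software side issues one call per burst rather than one per clock edge, so far fewer messages cross the link between the workstation and the emulator.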

EVE’s ZeBu emulators enable design teams to execute embedded software above 1 MHz, considered by many design teams to be the minimum acceptable speed to validate software. With ZeBu technology, design teams can debug software by connecting the debugger to the processor in design-under-test (DUT) mode via a JTAG transactor and then stop/start and step through the debugger without losing the connection.
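A quick back-of-the-envelope calculation shows why execution speed matters so much for software validation. In the sketch below, only the 1-MHz figure comes from the text above. The one-billion-cycle workload and the 1-kHz RTL simulation rate are hypothetical values chosen for illustration, not measurements of any tool.

```python
def wall_clock_seconds(cycles: int, clock_hz: float) -> float:
    """Time needed to execute `cycles` clock cycles at `clock_hz` cycles/second."""
    return cycles / clock_hz


WORKLOAD_CYCLES = 1_000_000_000  # assumption: cycles consumed by some software test
EMULATION_HZ = 1_000_000         # 1 MHz, the threshold cited in the article
RTL_SIM_HZ = 1_000               # assumption: a slow software RTL simulation

emulation_s = wall_clock_seconds(WORKLOAD_CYCLES, EMULATION_HZ)
rtl_sim_s = wall_clock_seconds(WORKLOAD_CYCLES, RTL_SIM_HZ)

print(f"emulation: {emulation_s / 60:.1f} minutes")
print(f"RTL simulation: {rtl_sim_s / 86_400:.1f} days")
```

Under these assumed numbers, the same workload takes roughly 17 minutes at 1 MHz but more than 11 days at 1 kHz, which is the difference between an interactive debug session and an impractical one.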

“This cannot be done via a physical JTAG cable/pod,” says Lauro Rizzatti, vice president of marketing and general manager of EVE-USA. The ZeBu family of products also gives design teams full native visibility into all registers and memories, without compiling probes or applying instrumentation to the DUT.

All of these developments permit design teams to execute software, run the debugger, stop in the event of an error, look inside the hardware, and zoom into the problem, which could be a software bug or a hardware bug. By maintaining visibility into the software debugger and into hardware registers/memories and switching between the two, errors are often tracked down in a relatively short time.

EVOLVING STANDARDS The transactors that underlie EVE’s ZeBu technology are adapted from the Standard Co-Emulation Modeling Interface (SCE-MI), which improves high-speed, transaction-level verification between different hardware and software simulation and emulation systems. The SCE-MI standard has been created under the aegis of Accellera, an industry body that drives EDA-related standardization.

Recently, Accellera’s board of directors approved version 2.0 of the SCE-MI standard, which the group says is a step forward in the effort to provide a convenient means of connecting and migrating transactor models between simulation, emulation, and rapid-prototyping environments.

“The Accellera SCE-MI interface attempts to make the underlying hardware/software interfaces between model execution tools, such as simulators, emulators, and testbenches, as uniform as possible,” says Brian Bailey, chairman of Accellera’s Interface Technical Committee (ITC), which drew up the SCE-MI 2.0 specification. “Version 2.0 also draws on the SystemVerilog direct programming interface (DPI) standard, ensuring that no additional standards are created where adequate ones already exist.”

In addition to a new use model built on a subset of the SystemVerilog DPI, SCE-MI 2.0 is also now compatible with the Open SystemC Initiative’s (OSCI’s) transaction-level modeling (TLM) definition. SCE-MI 2.0 maintains backward compatibility with the previous version 1.1. It also adds a streaming interface with data shaping to optimize emulation speed, increased options for model migration from simulation to emulation and vice versa, and simpler transactor modeling.