Engineers have always relied on hardware to speed up simulation. Initially, it was considered a luxury used only by elite verification teams. But with 90-nm processes soon to go mainstream and system-on-a-chip (SoC) gate counts closing in on 100 million, hardware-assisted verification is now a necessity. Certainly, those engineers trying to functionally verify a 50-Mgate design know that a software-only simulation of the full chip, running at 5 Hz or less, is a losing proposition, especially given time-to-market pressures. So unless you've got several weeks or months to spare for software simulation, today's ever-larger SoCs need hardware, whether to accelerate software simulation, to run emulation, or both. Moreover, leading-edge trends in verification, such as the use of transactions and assertions, are steadily creeping into the mix.
Broadly speaking, there are two approaches to hardware-assisted verification. One, targeted primarily at design teams, involves systems based on FPGAs. The other concerns systems based on custom processor architectures. These are larger systems aimed at verification engineers. Both have their places in the verification process, and both come with pros and cons.
According to Gartner Dataquest, the market for "design team acceleration and emulation" systems will grow rapidly over the next few years. Gartner's latest figures show that market, which primarily consists of FPGA-based systems, topping out at $119 million in 2008. The market for "verification team acceleration and emulation," comprising the larger, costlier custom-processor-based systems, will likely retrench a bit in coming years. That's because, according to Gary Smith, Gartner's chief EDA analyst, "power users are pushing verification back down to the design team."
One study by Smith showed that verification, outside of typical test planning and regression testing, can waste as much as 25% of a verification team's resources. "That's because the verification team spends a lot of time chasing 'don't cares,'" says Smith. "The design team knows the difference between a 'don't care' and what really needs to be verified."
The rapid growth in complexity at the register-transfer level (RTL) is driving many design teams to rely on simulation acceleration. Another factor is the growing amount of embedded software content and the need to validate that software before engineering samples of the silicon are ready. Use of hardware acceleration and emulation can mean dramatic advances in hardware/software co-verification long before silicon is done (Fig. 1).
FPGA-based systems have the advantage of being relatively inexpensive compared with machines based on custom architectures. They're generally smaller and more readily deployed, and they can fit on desktops or at the side of a workbench. Hence, they tend to be aimed at design teams. Some vendors, such as EVE, claim speeds of 20 to 30 MHz.
FPGA-based systems also have limitations and drawbacks. The design being simulated must be mapped to the system's array of FPGAs. Anyone who has attempted this task manually knows that it's quite time-consuming and can ultimately frustrate designers.
Only two vendors in the hardware market, Cadence and Tharas Systems, currently use custom-processor architectures. Historically, such systems have offered greater capacities than FPGA-based systems, though today's larger FPGAs are beginning to close that gap. Custom architectures generally offer slower runtimes than FPGA-based systems. But they're highly scalable in capacity and offer much faster and easier compilation.
"If you want to take the FPGA prototyping approach, it's going to affect your design, because you need to design up front for partitioning," says Ran Avinun, product marketing director for Cadence's Incisive verification platform. "If you don't partition up front, and most traditional ASIC developers won't want to do this, you get to the point where you have a database that is a combination of multiple RTL blocks and models and, as much as possible, you want to map this automatically to the hardware."
"We push people to look at their designs and understand up front what it's going to take to verify it," says Duaine Pryor, principal engineer and architect in Mentor Graphics' Emulation Division. "If you understand up front that verification will require booting an OS or running a second of real-time instructions, then you'll set up your verification methodology so that things will proceed smoothly into acceleration. And you'll get the maximum benefits out of acceleration."
Rather than manually mapping the design to the FPGAs, it's better to use commercial mapping software, such as Synplicity's Certify, which will perform automatic mapping. Certify includes automatic I/O-pin multiplexing so FPGA pins can be shared, circumventing the common problem of running out of I/Os. It also provides various mechanisms for debug-logic insertion.
Some emulation vendors offer proprietary mapping software that's specifically geared to their systems' architecture. EVE has spent a year developing an integrated compiler that's built to take advantage of its ZeBu-XL system architecture.
A tough issue facing vendors and users of FPGA-based systems is how to get more of the testbench into the hardware. "You still have a dependency on the event-driven simulator to pick up the testbench," says David Rinehart, director of marketing at Aldec. "Customers are demanding at least a tenfold performance gain over their current systems."
Verisity advocates an event-based architecture for emulation. "Traditionally, emulation systems have used cycle-based algorithms," says Steven Wang, senior vice president and general manager of Verisity's platform division. "A cycle-based type of algorithm preschedules all the different types of sequences all the way through. On every cycle, it goes through the same sequence whether there are events or not. The problem with that is that the simulation world is an event-based one. If you're not using event-based emulation, there are a lot of mismatches between simulation and the emulation world."
Verisity's approach to its hardware platform is predicated on its FPGA-based reconfigurable computing architecture. Coupled with an event-based algorithm, that architecture allows expansion into support for higher levels of abstraction within the hardware box. "We have plans to put a lot of the behavioral components of the testbench inside the box," says Wang. These include checkers, bus-functional models, and assertions, among other items. Dedicated processors programmed into the emulator's FPGAs handle behavioral components.
Also in the works is a dedicated processor for handling constructs associated with Verisity's e verification language, as are processors for SystemVerilog constructs. "By 2005, we'll have some constructs of SystemVerilog mapped," says Wang. The effort will begin with the designer subset of SystemVerilog, followed by SystemVerilog Assertions.
According to Aldec's Rinehart, some designers gain that tenfold boost in simulation performance by using either C or SystemC models for their testbench. "That allows us to use a transaction-level model of the testbench and run it against the design itself," Rinehart says.
Aldec's Riviera simulator supports mixed simulation of Verilog, VHDL, and SystemC by virtue of a direct kernel connection between the HDL compilers and the C/C++ compiler. Engineers can have intellectual-property cores in either Verilog or VHDL and their testbench in SystemC. They can then load synthesizable RTL into the hardware embedded simulator (HES) board in Aldec's Riviera-IPT desktop FPGA-based system and execute the SystemC testbench against their design in hardware (Fig. 2).
Moreover, Aldec's Riviera-IPT system lends itself to hardware/software co-verification. In this case, an off-the-shelf ARM CM-720T or -920T Integrator board is connected to the HES accelerator as a daughterboard. The combination is used with ARM's RealView debugging software. "Designers use our board to accelerate the processor in their RTL simulation," says Rinehart.
Many users of hardware-assisted verification are looking for both software simulation and emulation, while some go back and forth between the two modes. Between them is the gray area of the synthesizable testbench. The more of the design and testbench that can be synthesized, the better. A design with a synthesizable testbench will run from 100 to 10,000 times faster in hardware than a design with a behavioral testbench.
Unfortunately for most designers, using a synthesizable testbench means a methodology change that most don't want to make. "It's more difficult to write an efficient testbench for verification that is also synthesizable," says Cadence's Ran Avinun. "It's also more difficult to reuse such a testbench."
"The problem with synthesizing is debugging," says Verisity's Wang. "How do you preserve the ability to debug your design? It's the same problem with behavioral, because if you synthesize that down into some lower-level construct, it's hard to preserve the debugging."
This is why, in some quarters, the emphasis is on getting as much of the behavioral testbench as possible into the acceleration hardware. But unless the testbench is of a sufficiently high level of abstraction, the effort can suffer from slowed communication with the simulator kernel. Mentor Graphics tries to maintain a high level of abstraction in the testbench itself through use of SystemC or other C variants. The goal is to keep communication between the design in the emulator and the segment still running on the workstation at that same high level of abstraction.
Mentor's VStation TBX acceleration system is intended as a single environment that spans from simulation to emulation. It's based on the SystemVerilog DPI standard, which means that users' investments in their testbenches, whether behavioral, behavioral with C functions, or transaction-based, will be protected. The system's behavioral compiler recognizes SystemVerilog DPI calls and automatically generates all of the infrastructure needed to communicate between the workstation and the emulator (Fig. 3).
On top of that, the behavioral compiler allows designers to write transactors or testbench components in behavioral Verilog, which are then compiled into the emulator and run at emulation speed. As a result, VStation TBX permits the running of true system-level verification suites, such as an embedded operating-system boot or application startup. There's no need to maintain separate testbenches for simulation and emulation.
A significant difference between FPGA-based hardware systems and custom-architecture systems is in debug facilities. Tharas Systems' Hammer 100 accelerator, a custom-architecture system, offers 100% visibility into all signals on the design being simulated at all times. Visibility extends across the entire duration of the test, with no recompilation required. There's also no reconstruction of signals based on intermediate nodes.
A similar capability is present in Cadence's recent Incisive Palladium II acceleration/emulation system. A feature called FullVision provides the ability to view any individual node in a design without specifying probes. In addition, designs don't require recompilation to add probes, which can be an issue with FPGA-based platforms.
The Palladium II platform also exemplifies the compilation advantages inherent in custom-processor architectures. While some FPGA architectures may require dozens of workstations to compile a design, compilation on Palladium II can be accomplished with a single workstation without manual intervention. Clock-tree analysis for avoidance of timing violations is performed automatically.
Tharas Systems' Hammer 100 accelerator also offers the inherent compilation advantages of a custom architecture. It can compile a 15-Mgate design in under an hour on a single workstation.
Custom architectures can take great advantage of transaction-based acceleration. In Cadence's Palladium II platform, designs are partitioned into two different models. The behavioral portion runs on the workstation in C or SystemC while bus-functional models, which require more hardware resources, are mapped into the accelerator. Thus, most of the activity is on the hardware side of the equation. The same testbench and transactors can be reused in simulation and acceleration.
Tharas Systems' Rich Curtin, vice president of marketing, sees a significant upswing among Hammer users in adopting transaction-based verification. "We've seen transactions used more in networking applications, which involve lots of data passed back and forth between the stimulus generator and the design under test (DUT) mapped into the accelerator. But as customers begin writing complex and reusable testbenches, we'll see TBV getting into the mainstream," says Curtin.
Use of transactions is by no means limited to custom architectures. EVE's ZeBu-XL, an FPGA-based platform, features a testbench-to-DUT interface that's optimized to support transactions. "We even do that when you apply the testbench on a cycle basis," says Lauro Rizzatti, vice president of marketing and general manager of EVE-USA. In fact, when used with hardware, transactions can even replace cumbersome and difficult-to-reuse in-circuit emulation test beds (see "Transactions Will Melt Away Yesterday's ICE," p. 52).
Assertions are also becoming an important component of hardware-assisted verification. They allow designers to put built-in triggers into their code. Failures in assertions can tell verification engineers which block is the problem and enable them to quickly isolate the bugs. In Cadence's Palladium II, assertions are ported directly to the hardware, where they can be run at full speed. Users can either stop their verification session when an assertion failure occurs or continue while recording the failure.
"We're bringing more methodologies used in simulation tools into the hardware-based environment," says Cadence's Avinun. "Assertions is one of those new methodologies being used today mainly in simulation, but we think it will start to be deployed more in acceleration and emulation in the future."