Master Of Many Trades: Designing An Emulator Calls On Diverse Skills

Jan. 6, 2011
Looking for a fully rounded EDA-user experience? Try designing an emulation system! EVE's Lauro Rizzatti explains how such a design endeavor calls for mastery of myriad facets of design at an extremely high level.

Four Key Aspects

The EDA industry has, over the decades, become a case study in specialization. Each of the different elements of system design has seen a complex evolution of technology specific to its problem. This complexity is partly masked by the large EDA players that sell many tools. Even those companies, however, are made up of discrete, highly specialized divisions. Smaller startups tend to concentrate on narrow problems, looking for success by focusing more intensely than incumbents.

The emulation segment is different: because it must draw on widely varying disciplines to succeed, a single emulator tends to reflect the EDA industry as a whole. From a user’s standpoint, an emulator becomes a system-on-a-chip (SoC) design; the design is simply implemented on a different platform. For a user to have this experience, the emulator architect must draw from four critical areas (see the figure):

  • Hardware development of the type done by prototyping and FPGA companies
  • RTL compilation of the type done by EDA and FPGA companies
  • Run-time environment of the type created by software development and debugging companies
  • Transactor creation, which is basically an IP development process

Creating The Hardware
Building an FPGA platform, whether for prototyping or for emulation, magnifies the complexity of the normal FPGA hardware design process. For the development of a specific design, the architect can make calculated tradeoffs to achieve the gates, clocks, and I/Os required for its purpose.

Developing a general platform that can accommodate a wide range of designs with billions of gates and a multitude of clocks (many of them derived) and I/Os is much harder. The need for broad adaptability transforms the designer’s quest for the best FPGA into a multi-FPGA problem.

Prototype board companies face a challenge similar to that of emulation companies in developing a general platform, but with different performance/flexibility tradeoff requirements. A prototyping board targets higher performance, typically with less emphasis on debugging and bring-up time.

Emulation systems, in contrast, need to be able to handle multiple revisions of a design with minimal hand-crafting by the user. These systems sacrifice some level of performance for the sake of flexibility and generality. Regardless, emulators must still ensure that sufficient signals, gates, clocks, I/Os, and routing options are available to handle a wide range of designs at a non-trivial performance point.

Getting A Design Onto The Hardware
For any given hardware platform, you need tools to implement a particular design. Of course, in a successful system, the tools and platform are developed in tandem. But the tools themselves represent an enormous investment. At EVE, the compiler constitutes the vast bulk of R&D expenditure. Why? Consider what is required.


A successful compiler brings together a diverse set of technologies: synthesis, partitioning, timing analysis, clock mapping, and routing, just to stick to the main tasks. First, the RTL description of the design must be synthesized into a gate-level netlist. Since a primary requirement for an emulator is that design turns can be done quickly, synthesis quality of results can be sacrificed in the interest of getting a result more quickly. Because that is not the tradeoff typically made by synthesis tool vendors, it requires more original development work.
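The ordering of these compiler stages can be sketched as a simple pipeline. Everything below is illustrative: the function names, the toy one-gate-per-module "synthesis," and the round-robin split are invented for this sketch and do not describe EVE's (or any vendor's) actual tool.

```python
# Hypothetical sketch of an emulation compile flow. Each stage is a toy
# stand-in for a large piece of technology discussed in the article.

def synthesize(rtl_modules):
    """Toy synthesis: treat each RTL module as a single gate."""
    return [f"gate_{m}" for m in rtl_modules]

def partition(netlist, n_fpgas=2):
    """Toy partitioning: round-robin the netlist across the FPGA array."""
    parts = [[] for _ in range(n_fpgas)]
    for i, gate in enumerate(netlist):
        parts[i % n_fpgas].append(gate)
    return parts

def place_and_route(part):
    """Stand-in for the per-FPGA back-end step."""
    return {"fpga_image": part}

def compile_for_emulation(rtl_modules):
    netlist = synthesize(rtl_modules)       # RTL -> gate-level netlist
    partitions = partition(netlist)         # split across the FPGA array
    # (clock mapping and timing analysis would run here)
    return [place_and_route(p) for p in partitions]

images = compile_for_emulation(["cpu", "dma", "uart", "usb"])
```

The point of the sketch is the stage ordering: synthesis must precede partitioning, and clock mapping and timing analysis sit between partitioning and the per-FPGA place-and-route.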

Once the design has been synthesized, the netlist must be partitioned across the array of FPGAs implementing the design. This partitioning technology is not widespread and depends heavily on how the FPGAs are interconnected. This tool, then, is often developed from scratch.
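A minimal flavor of the partitioning problem can be shown with a greedy heuristic: place each gate on the FPGA that already holds the most gates it shares a net with, so fewer signals have to cross chip boundaries. This is a deliberately naive sketch; a production partitioner must also model the board's specific FPGA-to-FPGA interconnect, and all names here are invented.

```python
# Toy greedy netlist partitioner (illustrative only). A "net" is a set
# of gate names that must be electrically connected.

def greedy_partition(gates, nets, n_fpgas, capacity):
    """Assign each gate to an FPGA, preferring the FPGA that already
    holds the most of its net-neighbors, subject to a capacity limit."""
    placement = {}
    for gate in gates:
        # Count already-placed neighbors on each FPGA.
        affinity = [0] * n_fpgas
        for net in nets:
            if gate in net:
                for other in net:
                    if other in placement:
                        affinity[placement[other]] += 1
        # Pick the highest-affinity FPGA that still has room.
        for f in sorted(range(n_fpgas), key=lambda f: -affinity[f]):
            if sum(1 for v in placement.values() if v == f) < capacity:
                placement[gate] = f
                break
    return placement

place = greedy_partition(
    gates=["a", "b", "c", "d"],
    nets=[{"a", "b"}, {"c", "d"}],
    n_fpgas=2, capacity=2)
```

With two two-gate nets and two FPGAs of capacity two, the heuristic keeps each net on a single chip, cutting zero inter-FPGA signals. Real partitioners use far stronger algorithms, but the objective, minimizing cut signals under capacity constraints, is the same.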

The need to map clocks efficiently raises an even greater challenge. Modern designs can use hundreds of thousands of derived clocks distributed over hundreds of FPGAs. Designers reduce power consumption by using complex clock-gating strategies. A disproportionate amount of effort goes into the compiler’s ability to manage these clocks well.
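One widely used compiler technique for taming gated clocks (not necessarily the one EVE uses) is to rewrite a gated clock into a single master clock plus a clock-enable on the flip-flop, so all logic can run from one FPGA clock tree. The data structures below are hypothetical, chosen only to show the transformation.

```python
# Illustrative clock-gating conversion: a flop clocked by
# (master AND gate_en) is re-clocked from the master, and the gating
# condition is folded into the flop's enable input.

def convert_clock_gating(flops):
    for flop in flops:
        clk = flop["clock"]
        if isinstance(clk, dict) and clk["type"] == "gated":
            flop["clock"] = clk["master"]    # re-clock from master clock
            # New enable = old enable (default: constant 1) AND the gate.
            flop["enable"] = ("and",
                              flop.get("enable", ("const", 1)),
                              clk["gate"])
    return flops

flops = [{"name": "q0",
          "clock": {"type": "gated", "master": "clk", "gate": "gclk_en"}}]
converted = convert_clock_gating(flops)
```

The behavior is preserved, the flop still only captures data when the gate is asserted, but the FPGA now sees one clean clock, which is exactly the kind of rewriting a compiler must apply at scale across hundreds of FPGAs.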

Getting all of this working is not only a matter of synthesis, but also of timing analysis. Just because the design fits doesn’t mean it works—or works fast enough. Custom timing analysis must be built into the compiler to provide a meaningful result.

Once all of the above is done, the FPGAs must be placed-and-routed. It is true that FPGA vendors have provided FPGA compilers for years. But the user of an emulator should not see a flow tied to a specific FPGA technology. The user should simply be looking at a “generic” set of logic resources. The details of which family of which vendor’s FPGA happen to provide that logic should be invisible. So it’s not enough to pass the FPGA vendor’s tools along to the user. The tools must be encapsulated in an environment that appears generic.

This compiler ends up requiring leading-edge synthesis, partitioning, timing analysis, clock mapping, and place-and-route technologies to be usable. It is no wonder that it takes so much attention from the R&D team.

Running The Hardware
The third critical area in emulator development is the run-time environment, which involves two different components. The first is intimately bound up with the operation of the design itself. It includes the software operating system and any real-time issues or virtualization layers that might be required to make the hardware and processing elements available to the applications that will ultimately be visible to the system user.

The second component exists outside the design itself, providing a kind of meta-view for tracking and debugging purposes. Not only must each bit of logic in the FPGAs handle its assigned task, it must also be made observable so that the user can be sure that the logic task is being performed correctly. The notion of observation implies not only simple access to the signal, but also access to the run-time environment that turns the observed data into meaningful information.


At a higher level, execution must be controllable: the user should be able to start, stop, go back, loop, and single-step—all the standard debugging moves. But with an emulator, the user may be managing billions of gates’ worth of functionality on hundreds of devices over stretches of millions (if not billions) of verification cycles. A more comprehensive debugger is unlikely to be found anywhere else.
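The run-control moves above can be sketched as a tiny controller. Note the hedge in `rewind`: "going back" on real hardware is typically implemented as restoring a saved checkpoint and replaying forward deterministically, not as literally reversing the clock. All class and method names here are invented for illustration.

```python
# Minimal sketch of emulator run control. A real run-time environment
# would also snapshot design state and stream waveform data off the
# hardware; here a checkpoint is just a saved cycle count.

class RunControl:
    def __init__(self):
        self.cycle = 0
        self.checkpoints = {}

    def step(self, n=1):
        self.cycle += n                      # advance the emulated clock

    def checkpoint(self, name):
        self.checkpoints[name] = self.cycle  # snapshot at this cycle

    def rewind(self, name):
        # "Going back" = restore a checkpoint, then replay forward.
        self.cycle = self.checkpoints[name]

rc = RunControl()
rc.step(1_000_000)        # run a million cycles
rc.checkpoint("before_bug")
rc.step(500)              # overshoot the failure
rc.rewind("before_bug")   # jump back to the saved point
```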

Getting The Hardware To Talk To Software
Emulators can act as accelerators within a larger verification environment that includes a host running software on a virtual platform. While the hardware in the emulator can provide acceleration of functionality on a precise cycle-by-cycle basis, the virtual platform can accelerate software that doesn’t require that level of accuracy.

But the two have to talk to each other. Transactors placed on either side of the SCE-MI interface that connects the host to the emulator ensure that communication occurs quickly enough that it doesn’t cancel out the benefits of acceleration. A transactor is a piece of high-level verification IP that abstracts the details of a function into a set of transactions. Obviously, the specifics of each transactor—what transactions are available and how they operate—will differ by function. So, a practical emulation environment must make transactors available for a wide range of uses.
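The host half of a transactor can be pictured as an expander: it takes one abstract transaction ("send this message") and produces the cycle-level pin activity the emulated design expects. The interface below, an imaginary 8-bit parallel port with a `valid` strobe, is invented for illustration and is not the SCE-MI API.

```python
# Sketch of a host-side transactor for a hypothetical 8-bit port.
# One high-level call expands into per-cycle pin values; in a real
# system these would be shipped across the SCE-MI link to hardware.

def send_message(msg: bytes):
    """Expand one transaction into per-cycle {valid, data} pin values."""
    cycles = []
    for byte in msg:
        cycles.append({"valid": 1, "data": byte})  # one byte per cycle
    cycles.append({"valid": 0, "data": 0})         # deassert when done
    return cycles

wiggles = send_message(b"hi")
```

The abstraction is the payoff: the host reasons about one `send_message` call while the hardware sees three cycles of pin activity, which is why transaction-level links keep communication overhead from canceling the emulator's speed advantage.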

Verification IP transactors can be obtained “off the shelf.” Building them requires the kind of IP design and management skills that an IP company needs. But emulator users may also require custom transactors. This requirement necessitates the development of a tool for taking any abstract behavior and building a transactor for use between the host and emulator. Such a tool is unlikely to be found elsewhere.

To assemble a leading-edge emulation system, the designer draws from wildly disparate realms of technology—not as separate products on their own, but as contributors to a single emulation product. A big EDA player that has a very successful synthesis engine and a less successful timing analyzer can still be a successful company. But for an emulator to be competitive, both of those technologies, and all of the others noted above, have to perform. There is little room for error. If you don’t do well in all areas, then you don’t do well at all.

Lauro Rizzatti is general manager of EVE-USA. He has more than 30 years of experience in EDA and ATE, where he held responsibilities in top management, product marketing, technical marketing, and engineering. He can be reached at [email protected].
