Emulation and Verification in the Evolving Chip Design Market
This video is part of the TechXchange: Addressing Chip Verification Challenges.
What you’ll learn:
- How Siemens is taking on emulation and verification from chip design to software development.
- What’s included in the Veloce CS family of hardware-assisted verification tools.
- Why you need to emulate a chip with 40+ billion gates.
System-on-chip (SoC) designs continue to grow in complexity. Large monolithic chips are possible now, and chiplets and multichip packaging technologies allow even larger collections of transistors to be squeezed into a single package. A single chiplet can include billions of transistors.
Such large chip designs need to be emulated before silicon is fabricated to make sure the final result actually works properly. Almost all aspects of a chip, from signal timing to thermal design, must be verified ahead of time. Software also has to be tested before the design goes to fabrication.
I talked with Jean-Marie Brunet, VP and GM of Hardware-Assisted Verification at Siemens, about the trends in design and chip verification (watch video above). A second video drills down into the company’s new announcement about the family of Veloce CS hardware-assisted verification tools (watch video below).
Jean-Marie Brunet presents Siemens’s Veloce CS hardware-assisted verification tools.
The Veloce CS family consists of three members that target emulation, enterprise prototyping, and software prototyping (Fig. 1). The Veloce Strato CS is designed for low-level emulation to verify the chip design. The Veloce Primo CS is faster, but it doesn’t provide the same highly accurate, low-level emulation. The Veloce proFPGA CS is the fastest in terms of software execution. However, it targets software development rather than chip verification.
The CrystalX chip is incorporated in the Veloce Strato CS emulator (Fig. 2). It was designed by Siemens from the ground up to deliver fast, accurate emulation. It includes advanced debug capabilities and is designed for scalability. A 256-blade Veloce Strato CS system can emulate a chip with over 40 billion gates, which is on par with the largest SoCs on the market today.
The AMD VP1902 Adaptive SoC is used in the other two Veloce CS platforms. This off-the-shelf Versal FPGA is implemented with chiplet technology using a two-by-two layout of super logic regions (SLRs). The arrangement provides good routability within the FPGA while reducing overall latency.
Siemens adopted a stackable blade system to scale its solutions to handle very large chip designs (Fig. 3). The Veloce Strato CS series has a blade with CrystalX chips that can stack 16 high in a module incorporating a network switch. Sixteen of these stacks provide 256 blades capable of emulating SoCs with over 40 billion gates.
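To put those totals in perspective, the short Python sketch below works out the average capacity per blade from the two figures quoted above (256 blades, just over 40 billion gates). The function name and the assumption that gates are spread evenly across blades are illustrative only; actual per-blade capacity depends on the design and how it is partitioned.

```python
# Figures quoted in the article for the largest Veloce Strato CS configuration.
TOTAL_BLADES = 256
TOTAL_GATES = 40_000_000_000  # "over 40 billion", so treat this as a lower bound


def average_gates_per_blade(total_gates: int, blades: int) -> float:
    """Average gates per blade, assuming an even spread (an assumption)."""
    if blades <= 0:
        raise ValueError("blades must be positive")
    return total_gates / blades


if __name__ == "__main__":
    per_blade = average_gates_per_blade(TOTAL_GATES, TOTAL_BLADES)
    print(f"Roughly {per_blade / 1e6:.2f} million gates per blade")
```

Under that even-spread assumption, each blade handles on the order of 156 million gates.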
The Veloce Primo CS takes the same approach, but six stacks are suitable for handling a 40-billion-gate design. Emulated software execution is faster on this smaller system; however, it doesn’t offer the level of detail available from the Veloce Strato CS with its CrystalX chips.
Although the Veloce Strato CS and Veloce Primo CS offer different levels of emulation accuracy, they share a common architecture from a development standpoint (Fig. 4). They use a common compiler and runtime software so that designs can be moved from one to another with ease. No recompilation is necessary.
The systems are air-cooled and designed to be energy efficient, drawing about 10 kW per billion gates. A common software architecture also improves total cost of ownership.
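As a rough illustration of what the power figure means at scale, the following Python sketch multiplies the quoted rate by a design size. It assumes power scales linearly with gate count, which is a simplification made here and not a claim from Siemens; the function name is hypothetical.

```python
KW_PER_BILLION_GATES = 10.0  # approximate figure quoted in the article


def estimated_power_kw(gates: float,
                       kw_per_billion: float = KW_PER_BILLION_GATES) -> float:
    """Estimate power draw in kW, assuming linear scaling with gate count."""
    if gates < 0:
        raise ValueError("gates must be non-negative")
    return gates / 1e9 * kw_per_billion


if __name__ == "__main__":
    for gates in (1e9, 10e9, 40e9):
        print(f"{gates / 1e9:.0f} billion gates: "
              f"about {estimated_power_kw(gates):.0f} kW")
```

By this estimate, a 40-billion-gate configuration would draw on the order of 400 kW.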
The Veloce proFPGA CS targets software development. This family starts with a single-FPGA desktop board and can scale to a full rack that handles 4-billion-gate designs. The FPGA’s I/Os are all exposed, allowing for connection to physical hardware. The overall system design is optimized for multi-FPGA configurations, which start with a six-FPGA blade. Ten proFPGA blades fit in a rack, and three racks provide 180 FPGAs.
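The scaling arithmetic for the proFPGA configuration can be checked with a few lines of Python. The constants come from the figures above; the function name is made up for this example.

```python
FPGAS_PER_BLADE = 6   # six-FPGA blade, per the article
BLADES_PER_RACK = 10  # ten blades per rack, per the article


def fpga_count(racks: int) -> int:
    """Total FPGAs for a given number of fully populated racks."""
    if racks < 0:
        raise ValueError("racks must be non-negative")
    return racks * BLADES_PER_RACK * FPGAS_PER_BLADE


if __name__ == "__main__":
    for racks in (1, 2, 3):
        print(f"{racks} rack(s): {fpga_count(racks)} FPGAs")
```

A single rack therefore holds 60 FPGAs, and three racks reach the 180 quoted above.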
The Veloce operating system for prototyping (VPS) is designed to handle large chip designs. It can compile RTL without manual modifications, and it provides the automated multi-FPGA partitioning needed to manage those large designs. The software features timing-driven performance optimization and supports at-speed debug. Systems are able to run at speeds in excess of 100 MHz.
Overall, the Veloce CS family provides significant improvements in size and speed while reducing power by a factor of two and doubling system density. The software and hardware support multi-user, heterogeneous workloads for more efficient system utilization.