It is amazing and counterintuitive but nonetheless true that production engineers can create semiconductor devices whose internal complexity far exceeds the still-anemic ability of their brethren to functionally verify them in simulation. Even more provocatively, this skewed reality will persist as long as photolithographic technology continues to outpace the capability of design tools. In other words, we're stuck for as far as we can reliably see. The issue, of course, is that a digital circuit with a billion transistors can easily find itself in a state of confusion, unable to march forward from an unanticipated vector that scrambles its brains like one of Asimov's nanites, just with a lot less fuss. And gigascale circuits inevitably contain oodles of undiscovered, unexplored, and uncovered vectors.
What to do?
One approach is the popular simulate-the-heck-out-of-the-design methodology, where otherwise perfectly intelligent product managers invest in computer farms comprising thousands of processors, each concentrated on a small piece of the new electronic design. With luck, after weeks of preparation to determine block assignments, and days of letting the simulation run, they might get a few thousand reliable clock ticks, hardly enough to get through the boot-up sequence. That's the consequence of the exponential growth in transistor density: Even running on a bunch of multi-gigahertz cores, it still takes practically forever to make sure that nothing bad, or at least nothing unrecoverable, will happen in the real world.
The obvious advantage of behavioral simulation is ubiquitous observability and infinite flexibility. That's great if you already know where the problems are. But if you are trying to verify a design, especially one integrated with new IP, third-party IP, or retargeted IP, there are simply too many places to look. Sure, the verification tools can implement museum-quality mathematical statements of how things ought to be, including hundreds of thousands of states that differentiate good behavior from bad, but the gallery can be filled with legumes as easily as Picassos. Moreover, a device in the field is going to see hundreds of trillions of values, not hundreds of thousands, representing a coverage chasm nine orders of magnitude wide. That's a whole bunch of peanuts.
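For the arithmetically inclined, here is a quick back-of-envelope check of that chasm, using round figures consistent with the estimates above (illustrative numbers only, nothing measured):

```python
import math

# Illustrative round figures matching the estimates in the text, not measurements.
states_checked_pre_silicon = 1e5   # "hundreds of thousands" of characterized states
values_seen_in_field = 1e14        # "hundreds of trillions" of values in operation

gap = values_seen_in_field / states_checked_pre_silicon
print(f"coverage gap: {gap:.0e}, or {math.log10(gap):.0f} orders of magnitude")
# -> coverage gap: 1e+09, or 9 orders of magnitude
```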
Enter emulation and "hardware accelerators." Again, in the absence of sensible alternatives, and once the manager has a modicum of confidence that the design is "getting close" based on simulated results, lots of complicated circuits find themselves incubated inside of expensive hardware-acceleration systems. These systems build a sort of virtual reality that dramatically increases the dynamic range of the exercise stress test but still suffers from serious impediments. First and foremost, the most complicated designs often don't even fit inside the programmable gate arrays that form the heart chambers of hardware emulation. Second, tool support is typically better suited to fertilize the aforementioned vegetation. Designs that are perfectly synthesizable—not to mention ordinarily compact—can become unwieldy, or even impossible to implement. Finally, there are the prosaic problems of power, performance, and price. So, are we predicting the death of emulation-based verification? No, but like simulation, it is far from a perfect solution.
Rather than spending months in simulation (or millions of dollars on emulation), device manufacturers are increasingly "biting the bullet" and taping out much earlier in the design flow than prudence, or permission, would have allowed until recently. This is not a defeatist approach to design implementation. Rather, the best among us are pragmatically facing the reality of raw competition, where tens of millions of dollars are put at risk by delayed, or occasionally missed, market entry. In this environment, design purists have lost their ideological sway to the democracy of the market, where suddenly a re-spin (or five) is no longer an emblem of shame or concession. It is just good business.
Proactive plans for silicon validation and debug suggest device instrumentation strategies that offer simulator-like observability to facilitate discovery and diagnosis. Harvesting the benefit of early tapeout only makes sense if you can actually see the results of the many states enabled by at-speed, in-system operation. Moreover, application software can be co-developed, even with imperfect silicon, especially if the hardware platform is flexible enough to accommodate changes.
Conventional access methods based on physical effects, like e-beam probing or corrective FIB edits, are slow and rely on expensive hardware and cumbersome procedures. Recognizing this, many manufacturers are already placing fixed instruments inside their devices to aid in analysis and debug. These approaches have proven moderately successful but suffer from the burdens of manual insertion, an inability to perform "what-if" analyses, and inadequate post-silicon software support. We've seen examples where the debug hardware accumulates like junk DNA in our bodies, because designers are reluctant to pull blocks that close on timing, even if they don't exactly know why they are there!
The industry has identified "reconfigurable infrastructure fabrics" as an especially promising approach. The best of these are automatically inserted at RTL, are design-flow compatible, use standard cells, maintain the package size, and, most importantly, have excellent post-silicon software support. Because these fabrics approximately replicate the pre-silicon design environment, design and debug engineers avoid the need to learn new tools (or religions) and can seamlessly compare device behavior to simulations—where they have simulations—or can apply assertions that ensure proper function or detect misbehavior.
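To make that last point concrete, here is a minimal sketch of what "applying an assertion" to silicon can look like in software: a temporal rule of the kind normally checked in simulation, replayed over a trace captured by an on-chip instrument. The signal names, trace format, and request/acknowledge rule are all hypothetical, chosen only to show the flavor:

```python
from typing import Iterable

# Hypothetical trace: one dict of sampled signal values per captured clock cycle.
Trace = Iterable[dict]

def check_req_ack(trace: Trace, max_latency: int = 4) -> list[str]:
    """Assert that every 'req' pulse is answered by 'ack' within max_latency cycles.

    The same rule could be checked in simulation; here it runs over data
    captured from the silicon itself.
    """
    violations = []
    pending_since = None  # cycle at which an unanswered request was seen
    for cycle, sample in enumerate(trace):
        if sample.get("req") and pending_since is None:
            pending_since = cycle
        if sample.get("ack") and pending_since is not None:
            pending_since = None
        if pending_since is not None and cycle - pending_since > max_latency:
            violations.append(f"req at cycle {pending_since} not acked by cycle {cycle}")
            pending_since = None  # report once, then re-arm
    return violations

# Example: a captured window where the second request is never acknowledged.
captured = [
    {"req": 1, "ack": 0}, {"req": 0, "ack": 1},   # good handshake
    {"req": 1, "ack": 0}, {"req": 0, "ack": 0},
    {"req": 0, "ack": 0}, {"req": 0, "ack": 0},
    {"req": 0, "ack": 0}, {"req": 0, "ack": 0},   # ack never arrives
]
print(check_req_ack(captured))  # -> ['req at cycle 2 not acked by cycle 7']
```

Run the identical check against a simulation trace of the same window and you have exactly the pre-silicon/post-silicon comparison the fabric is meant to enable.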
Reconfigurable infrastructure fabrics can perform a variety of validation and in-field services. In addition to classic logic and transaction analysis, small reconfigurable blocks can be used for soft repairs, performance monitoring, power management, and even secure computing. A variety of architectures have been proposed, and the first are finding early adoption and commercial traction.
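For the performance-monitoring case, the software side can be as modest as sampling a pair of counters exposed by such a block. Everything below, the register offsets, the stall metric, and the fake access layer standing in for real hardware, is hypothetical; it only sketches the shape of the interaction:

```python
import time

# Hypothetical register offsets for an in-fabric performance-monitor block.
CYCLE_COUNT_REG = 0x00   # free-running clock-cycle counter
STALL_COUNT_REG = 0x04   # cycles the monitored interface spent stalled

class FakeMonitor:
    """Stand-in for the real access path (JTAG, PCIe BAR, ...) so the sketch runs anywhere."""
    _t0 = time.monotonic()

    def read_reg(self, offset: int) -> int:
        # Pretend the device runs at 1 GHz and stalls about 7% of the time.
        cycles = int((time.monotonic() - self._t0) * 1e9)
        return cycles if offset == CYCLE_COUNT_REG else int(cycles * 0.07)

def stall_ratio(dev, sample_seconds: float = 0.1) -> float:
    """Sample both counters twice and report the fraction of cycles spent stalled."""
    c0, s0 = dev.read_reg(CYCLE_COUNT_REG), dev.read_reg(STALL_COUNT_REG)
    time.sleep(sample_seconds)
    c1, s1 = dev.read_reg(CYCLE_COUNT_REG), dev.read_reg(STALL_COUNT_REG)
    return (s1 - s0) / max(1, c1 - c0)

print(f"stall ratio: {stall_ratio(FakeMonitor()):.1%}")  # roughly 7% with the fake device
```

Swap the fake for the device's real access mechanism and the same few lines become a live, in-field performance probe.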
Now, instead of enduring the chronic fatigue of the Sisyphean race to catch up with relentless advances in hardware implementation, new design paradigms leverage those same fabrication techniques and enable manufacturers to "be inside" their devices with new cost-effective, time-effective, and market-responsive methodologies. The old adage is true: If you can't beat 'em, join 'em!