In a perfect world, fabrication of silicon ICs would be a perfectly predictable process. Not only would every chip be absolutely identical, but there would be no variations from wafer to wafer, or lot to lot. In such a paradise, all chips would meet their predicted design parameters. They would all run at the designers' intended speed, no faster and no slower. All would meet their timing specifications. There would be no clock skew, no IR-drop surprises, and happiest of all, no need whatsoever for pessimistic design approaches.
But we don't live in that perfect world. Trains and planes don't run on time. New cars almost never get the mileage claimed by their makers. And silicon fabrication processes vary, sometimes wildly, and in ways that are maddeningly unpredictable. Circuits can vary from predicted physical values in a number of ways, ultimately affecting the transistors themselves, the wires that interconnect them, or both.
Designers have faced the variability of fabrication processes since day one, and by various means, manage to get around it. Primarily, it's through static timing analysis. But a new generation of static timing analysis is upon us, one that uses statistical techniques to overcome the issues inherent in traditional static techniques. In this report, we'll look at where static analysis has been and where it must go to cope with the complexities of nanometer silicon technologies.
Traditional static timing analysis (STA) is, and has been, the method that virtually every digital design team uses to achieve timing signoff. In STA, you must have a timing model for every cell in your design (or at least the signal paths you care about). The analyzer uses those timing models to calculate delays for all cells in a given path. Those delays, in turn, are summed to determine total path delays.
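The core arithmetic of STA is simple to sketch. The following toy example (hypothetical cell names and delay values; real tools read characterized .lib lookup tables) shows how per-cell delays along a path are summed and compared against the clock period:

```python
# Minimal sketch of traditional STA path-delay calculation.
# Cell names and delay values are illustrative, not from any real library.
cell_delays_ns = {"buf1": 0.12, "nand2": 0.09, "inv3": 0.05, "dff_setup": 0.07}

# A path is an ordered list of cells the signal traverses.
path = ["buf1", "nand2", "inv3", "dff_setup"]

# Total path delay is the sum of the per-cell delays along the path.
path_delay = sum(cell_delays_ns[cell] for cell in path)

clock_period_ns = 2.0  # assumed 500-MHz clock target
slack = clock_period_ns - path_delay
print(f"path delay = {path_delay:.2f} ns, slack = {slack:.2f} ns")
```

A real analyzer repeats this over millions of paths, with delays that depend on input slew and output load, but the sum-and-compare structure is the same.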
Process variability comes into play here. As process geometries shrink, variability in silicon (or, more precisely, the ability to account for it) becomes the priority in maintaining the designers' intended performance.
"If you look at SiO2 (silicon dioxide) thicknesses, for example, we're talking about 14 atoms or so in today's high-end processes," says Leon Stok, director of the electronic design business for IBM's Systems and Technology group. "If you're off by one or two atoms, you're suddenly off by 10% or 20%. Before, this wasn't an issue. We think we're seeing the limits of some of the physical phenomena we tend to deal with."
You may intend for your design to run at, say, 500 MHz. But with the various process variability factors involved, even assuming that all of the chips are functional, not all of them will run at your target speed. Some may run at 400 MHz, some at 450 MHz, and some even at 550 MHz.
This is why "corner-based" analysis has been the mainstay for many years. The essence of corner-based analysis is to determine the best and worst cases for various parameters, such as ambient temperatures, supply voltage, and many others. Each of these parameters is referred to as a "corner."
While corner-based analysis continues to be indispensable now and into the foreseeable future, it does have several disadvantages. For one thing, it's slow. At nanometer geometries, the number of corners is exploding. At larger geometries, designers could get away with analyzing worst-case parameters for just a handful of corners. Today, designers find themselves analyzing 64 or more corners over a full range of process variation. That translates into a huge runtime burden.
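The corner explosion is easy to see: each parameter contributes a best and a worst value, and the corner set is their cross product. A short sketch (the parameter names here are illustrative, not a specific foundry's corner definitions):

```python
# Why corner counts explode: corners are the cross product of the
# best/worst values of each parameter. Parameter names are illustrative.
from itertools import product

parameters = {
    "process_nmos": ["slow", "fast"],
    "process_pmos": ["slow", "fast"],
    "voltage": ["0.9V", "1.1V"],
    "temperature": ["-40C", "125C"],
    "interconnect": ["Cmin", "Cmax"],
    "back_bias": ["min", "max"],
}

corners = list(product(*parameters.values()))
print(len(corners))  # 2 values per parameter, 6 parameters: 64 corners
```

Every additional two-valued parameter doubles the count, which is how a handful of corners at larger geometries grows to 64 or more at nanometer nodes.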
And that's just the inter-die, or die-to-die, variation. There's also on-chip variation (OCV) to consider. "OCV effectively adds some pessimism to the design through the analysis to cover a variation that could happen between, say, the clock routes between devices that are spread around the chip," says Robert Jones, senior director of Magma Design Automation's Silicon Signoff Unit.
Consequently, between inter-die and intra-die variations, corner-based analysis is quickly becoming a millstone around designers' necks. Yes, it's slow and cumbersome. But perhaps even worse, it compels design from a pessimistic standpoint.
When designers are forced to consider all of the worst-case corners they've analyzed, they suddenly find that their analysis predicts some of their 500-MHz chips may run at only 350 MHz. Thus, to maximize the yield of chips that will run at 500 MHz, they'll compensate by overdesigning.
"Corner-based design is perceived as leaving a lot of quality on the table," says Andrew Kahng, co-founder and CTO of Blaze DFM. "People are very worried about the return on investment (ROI) of the next generation of silicon technology. As guardbanding increases, obviously you're harvesting less of the potential ROI of that process improvement. At some point, if this isn't better managed, the ROI will just not be there."
Enter statistical static timing analysis (SSTA). While traditional static timing analysis can supply a worst-case number for delays, it can't provide a sense of the distribution of performance versus yield (Fig. 1). Rather than simply determining best- and worst-case corners and attempting to arrive at a single value for delays, statistical timing analyzers propagate probability distributions. Among the inputs to SSTA tools are distributions of parameters, such as transistor channel lengths. The distribution of values represents how channel lengths can actually vary based on silicon data.
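The difference between a single worst-case number and a distribution is worth making concrete. In this Monte Carlo sketch (assumed normal delay distributions with invented means and sigmas, not silicon data), summing each cell's 3-sigma worst case gives a corner-style bound, while sampling the distributions shows where the 3-sigma point of the path actually falls:

```python
# Contrast a corner-style worst-case sum with a statistical view of the
# same path. All distributions and numbers are illustrative assumptions.
import random

random.seed(1)
cells = [(0.10, 0.01), (0.08, 0.008), (0.12, 0.012)]  # (mean, sigma) in ns

# Sample the path delay 10,000 times from the per-cell distributions.
samples = sorted(
    sum(random.gauss(mu, sigma) for mu, sigma in cells)
    for _ in range(10_000)
)

worst_case = sum(mu + 3 * sigma for mu, sigma in cells)  # corner-style bound
p997 = samples[int(0.997 * len(samples))]                # statistical 3-sigma point

print(f"corner-style worst case: {worst_case:.3f} ns")
print(f"99.7th percentile of distribution: {p997:.3f} ns")
```

The statistical percentile comes in below the corner-style bound because independent variations partially cancel: variances add, not sigmas. That gap is exactly the pessimism corner-based analysis bakes in.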
Because they consider probability distributions, SSTA tools accept information about variation and then simultaneously consider the different probabilities of single transistors being at different points in that variation space. "It can do an analysis. It gives you more information about the likelihood of meeting timing, essentially your parametric yield," says Bill Mullen, group director of R&D at Synopsys.
In addition to device variation, there's also interconnect variation. "SSTA tools can take information about the individual wires and relate that to the variation in the parameters for the metal at the different metal layers," says Mullen.
The goal of SSTA is to reduce the sensitivity to variability in global attributes, such as temperature and voltage. Analysis is performed on each type of variability to arrive at a probability density function, or PDF (Fig. 1, again). This function represents a statistical look at how the device will operate across variations in the underlying parameter.
Ultimately, an SSTA tool combines these individual PDFs with those for all of the underlying parameters to achieve an overall distribution for a given node in the circuit (Fig. 2).
"Statistical timing is nothing but adding probability distributions and taking the maximum of them to find the new arrival point of a signal at a gate," says Mustafa Celik, CEO of Extreme Design Automation. "This is one way of doing SSTA. Another is instead of propagating distributions, you can propagate parametrized representations of the arrivals and delays."
SSTA: WHO AND WHY?
Now that we've defined SSTA, the next questions are who uses it today and why. There's no doubt that SSTA is a leading-edge technology. There are certainly designs at 90 nm that can benefit from the application of SSTA. But many industry experts feel that SSTA won't see widespread adoption until the 65-nm node is prevalent, or even until 45 nm gets out of R&D and into circulation.
"You need SSTA less at different silicon geometries," says Eric Filseth, VP of product marketing for digital IC design at Cadence. "At 130 nm, most designs don't vary enough to get huge value out of statistical methods. Our belief is that you'll probably need it at 45 nm. It's clear that people can do 65-nm chips without SSTA."
Regardless of the node at which SSTA sees broad adoption, usage models for it are beginning to take shape. One of the gating factors toward adoption is availability of process parameters. As a result, statistical methods have seen their earliest usage from integrated device manufacturers (IDMs) like IBM and Intel. In such cases, a single part might dominate an entire fab line. An Intel or IBM knows that it can sell any microprocessor it makes at some price. Therefore, it uses bin sorting of parts by speed, and SSTA is applied in an attempt to slide the distribution of speeds as much toward the high side as possible.
A fabless semiconductor house might also make use of SSTA, but it would do so for different reasons. Intel can bin-sort its Pentium chips, but a fabless house doesn't necessarily have that luxury. For many fabless houses, either the chip runs at rated speed with the proper amount of power consumption or it doesn't. In the latter case, it's deemed a failure and can't meet the application's needs. But the fabless house still wants to maximize the number of sellable chips per wafer.
"That's not necessarily the same as pushing the target frequency as fast as possible," says Blaze DFM's Andrew Kahng, "because you might have leakage power-limited yield loss. So statistical design still applies even to those who do not bin or bin crudely. For example, if a graphics company has a chip that can't be sold into the mobile space, maybe it can still be sold into the desktop space. So graphics-chip companies, as well as processor companies, have that flexibility."
Clearly, the IDMs have a distinct advantage in applying statistical methods to timing closure. "For example, an IDM has control of the process and private access to the foundry," says Kahng. "So the path that the statistics, statistical device models, model-to-silicon correlation studies, etc., must go through is at least internal."
For the fabless world, SSTA's adoption will depend on the availability of process data and tools with the ability to consume it.
"Most major foundries have long begun forming strategic partnerships that will have statistics traveling back and forth before too long," says Kahng. "One thing is that the tools need to be able to consume the statistics before there's any point to releasing them."
The transition to SSTA has begun, but it will most likely take the form of an evolution. Most see traditional STA and SSTA as complementary.
"People will continue using, wherever possible, deterministic techniques," says Ravij Maheshwary, senior director of marketing for signoff and power products in Synopsys' Implementation Group. While it doesn't currently offer statistical capabilities, Synopsys intends to evolve its existing timing closure tools—PrimeTime, PrimeTime SI, and Star-RCXT—in a statistical direction.
"There will be a transitional period where we'll use the deterministic STA and we'll use statistical analysis to handle the sensitivity checking," says Magma's Robert Jones. "So we can begin to eliminate some of that sensitivity and get to designs that are more reliable, passing silicon yield on every wafer run."
TOOLS BEGIN APPEARING
Suppose you were interested in exploring adoption of statistical static timing analysis in your flow. Where can you get it? As of this writing, only one vendor, Extreme Design Automation, markets a commercially available standalone SSTA tool. One RTL-to-GDSII tool vendor, Magma Design Automation, offers it in the context of its implementation flow. And one IDM, IBM Corp., provides access to SSTA technology through its design-services operations.
Extreme Design Automation refers to its technology as "variation-aware IC design." According to Extreme's Celik, the company wants to fill the "design-to-manufacturing gap" with its XT statistical timing signoff tool (Fig. 3).
Initially, Extreme sees XT as overlapping or coexisting with Synopsys' PrimeTime.
"In a way, statistical timing will check whether PrimeTime's analysis is correct or not," says Celik. "It will check whether corners are valid or find others that PrimeTime may miss. Or, it can check whether the margins and derating factors used in PrimeTime are safe or pessimistic."
Extreme's XT is a block-based tool. It also can handle path-based analysis. The tool can account for correlation in variations due to reconvergent paths. Perhaps most importantly, XT includes a patented sensitivity-analysis technology that calculates the sensitivities of delays, arrivals, slacks, design slack, and parametric yield with respect to design parameters (cell sizes, wire sizes, and wire spacing), system parameters (VDD and temperature), and process parameters.
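XT's actual sensitivity technology is patented and proprietary, but the general idea of delay sensitivities can be illustrated with finite differences on a toy delay model (the model, parameters, and coefficients below are all invented):

```python
# Illustrative sketch of sensitivity analysis: estimate d(delay)/d(parameter)
# by central finite differences on a toy delay model. This is a generic
# technique, not XT's proprietary method; the model is invented.
def delay_ns(vdd, temp_c, channel_nm):
    # Toy model: delay rises as VDD drops, temperature rises,
    # and channel length grows.
    return 0.5 * (1.0 / vdd) * (1 + 0.001 * temp_c) * (channel_nm / 65.0)

nominal = {"vdd": 1.0, "temp_c": 25.0, "channel_nm": 65.0}

sensitivities = {}
for name, value in nominal.items():
    step = value * 1e-4
    hi = dict(nominal); hi[name] = value + step
    lo = dict(nominal); lo[name] = value - step
    # Central difference around the nominal point.
    sensitivities[name] = (delay_ns(**hi) - delay_ns(**lo)) / (2 * step)

for name, s in sensitivities.items():
    print(f"d(delay)/d({name}) = {s:+.4f} ns per unit")
```

The signs alone are informative: a negative VDD sensitivity says raising the supply speeds the path up, while positive channel-length and temperature sensitivities say those parameters slow it down. A tool with such sensitivities for every path knows exactly which knobs an ECO should turn.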
Using the results of sensitivity analysis, XT performs optimizations and engineering change orders (ECOs), such as resizing of cells. The physical information embodied in the ECOs is fed back into an incremental place-and-route tool. There, the ECOs are implemented as post-layout optimizations that improve parametric yield (Fig. 4).
XT also includes a library-characterization module. "Because we need delay information for delay tables, we characterize and put that information in a modified .lib format," says Celik.
Once the library is characterized, it can be reused for a given process node or technology. The libraries are additionally parametrized, so if a given process matures or drifts in terms of its characteristics, the library never requires recharacterization. Information on the process change can be fed to the timer to compensate.
Celik claims XT's extractor is the industry's first statistical, or variation-aware, extractor. The pattern-based extractor parametrizes R and C values with respect to process parameters. Parametrized extraction makes it straightforward for the tool to handle manufacturing effects like density-based thickness variations or spacing-based variations. Mean error for the extractor is less than 1%, with a sigma of less than 2%.
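What a parametrized R or C might look like can be sketched as follows: rather than a single extracted number, each value carries first-order sensitivities to process parameters such as metal thickness and spacing. The coefficients and functional form here are invented for illustration, not Extreme's actual models:

```python
# Sketch of parametrized extraction: R and C as linear functions of
# normalized process-parameter shifts instead of fixed numbers.
# All nominal values and sensitivity coefficients are invented.
def parametrized_r(delta_thickness=0.0, delta_spacing=0.0):
    r_nominal = 120.0  # ohms, nominal extraction
    # Thinner metal raises resistance; spacing barely affects R here.
    return r_nominal * (1 - 0.8 * delta_thickness + 0.05 * delta_spacing)

def parametrized_c(delta_thickness=0.0, delta_spacing=0.0):
    c_nominal = 2.5e-15  # farads, nominal extraction
    # Thicker metal and tighter spacing both raise coupling capacitance.
    return c_nominal * (1 + 0.4 * delta_thickness - 0.6 * delta_spacing)

# If the process drifts (say, metal comes out 5% thin), the timer can
# re-evaluate the parametrized values rather than re-extract the design.
print(parametrized_r(delta_thickness=-0.05))  # resistance goes up
```

This is also why the parametrized libraries described above need no recharacterization when a process drifts: the drift becomes a new evaluation point, not a new extraction.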
Finally, Extreme's XT is built for capacity and runtime. It can analyze 5 million instances overnight on a single 32-bit CPU. It also can perform full-chip Monte Carlo simulation to calibrate the statistical timer. That simulation can be distributed to a farm of Linux machines.
Extreme's XT tool is the only current option for those interested in a standalone SSTA tool. However, Magma's customers have another option—the company's Quartz SSTA, which adds variation analysis and optimization to the Magma IC-implementation flow. Quartz SSTA is a path-based tool that offers high accuracy for complex circuit topologies. The tool performs block-based optimization, though, bringing an incremental-analysis capability to the table.
For path-based analysis, Quartz SSTA takes advantage of sophisticated filtering algorithms that allow it to locate the most sensitive devices. This boils down to a kind of criticality analysis, which can be very useful in determining the paths that will most seriously impact overall circuit delays.
"We've defined a criticality factor that helps us comprehend both the magnitude—the height of the distribution, where it is relative to the slack value—and the sigma, or the standard deviation or width, of those distribution curves," says Magma's Robert Jones.
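A criticality metric in the spirit Jones describes can be sketched by ranking paths on how likely each one's slack distribution is to go negative, which folds the mean slack and its sigma into one number. The formula and path data below are assumptions for illustration, not Magma's actual definition:

```python
# Illustrative criticality ranking: probability that a path's slack goes
# negative, under a normal assumption. Path data and the metric itself
# are invented, not Magma's proprietary criticality factor.
from statistics import NormalDist

paths = {
    "pathA": (0.30, 0.05),  # (mean slack ns, sigma ns): comfortable margin
    "pathB": (0.10, 0.08),  # small margin, wide spread: most critical
    "pathC": (0.15, 0.02),  # small margin but tight spread
}

def fail_probability(mean_slack, sigma):
    # P(slack < 0) for a normal slack distribution.
    return NormalDist(mean_slack, sigma).cdf(0.0)

ranked = sorted(paths, key=lambda p: fail_probability(*paths[p]), reverse=True)
print(ranked)  # most critical first
```

Note how the ranking differs from sorting on mean slack alone: a path with a slightly better mean but a much wider sigma can still be the most likely to fail, which is precisely the information a deterministic slack number hides.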
The tool also brings flexibility in terms of library characterization. Users can employ their own pre-characterized libraries and develop derating factors that allow them to begin analysis without slogging through heavy-duty characterization.