Deftly optimizing ASIC critical paths, this tool rides atop existing cell-based flows to improve timing while leaving physical design largely undisturbed.
Timing closure has always been difficult to achieve in ASIC design. But as IC process geometries shrink into the nanometer range, timing shortfalls are growing larger. Timing-closure tools that resize and buffer critical paths don't make up the difference, and traditional techniques can lead to months of fruitlessly chasing a timing goal. What's required is a new approach to timing closure, one that goes beyond the currently fashionable melding of logic synthesis and physical design exemplified by physical synthesis.
Missing from that equation is optimization at the transistor level. In its first product, ZenTime, Zenasis Technologies has devised a tool that uses what it terms hybrid optimization technology. The tool operates simultaneously at the transistor, gate, and physical levels to achieve a fundamental improvement in timing. It does so while permitting the design team to use its existing cell-based design flow and without imposing a power, area, or signal-integrity penalty.
ZenTime starts with a synthesized gate-level netlist, with or without placement information, and the design constraints in the form of a PrimeTime SDC file.
First, the tool analyzes the design to identify critical timing regions and paths. It does this via a built-in static timing analysis (STA) engine designed to emulate Synopsys' PrimeTime static timing analyzer. In fact, timing correlation between PrimeTime and ZenTime's STA engine has been found to be high. "The timing engine walks through critical regions of the design, where it finds small cell groups that, if improved, overall timing of all critical paths would improve," says Debashis Bhattacharya, Zenasis' chief technology officer.
The tool identifies key clusters of gate-level logic that can be optimized at the transistor level to achieve the desired timing goals. These clusters are ranked and prioritized by the STA engine.
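As a rough illustration of the analysis step described above, the sketch below computes slack on a toy five-gate netlist and ranks the gates from most to least critical. It is a generic textbook-style static timing pass, not ZenTime's or PrimeTime's engine; the gate names, delays, connectivity, and clock period are all hypothetical, and delays are kept in integer picoseconds to avoid rounding noise.

```python
from collections import defaultdict

# Hypothetical gate delays in picoseconds (illustrative values only).
DELAY = {"u1": 200, "u2": 350, "u3": 150, "u4": 300, "u5": 100}
# Hypothetical connectivity: (driver, load) pairs.
EDGES = [("u1", "u2"), ("u1", "u3"), ("u2", "u4"), ("u3", "u4"), ("u3", "u5")]
ORDER = ["u1", "u2", "u3", "u4", "u5"]  # a topological order of the gates
CLOCK_PERIOD = 800  # ps, assumed timing constraint


def compute_slack(delay, edges, order, period):
    """Return slack (required time minus arrival time) per gate output."""
    fanin, fanout = defaultdict(list), defaultdict(list)
    for src, dst in edges:
        fanin[dst].append(src)
        fanout[src].append(dst)

    # Forward pass: latest arrival time at each gate output.
    arrival = {}
    for g in order:
        arrival[g] = delay[g] + max((arrival[p] for p in fanin[g]), default=0)

    # Backward pass: earliest required time at each gate output.
    required = {}
    for g in reversed(order):
        required[g] = min(
            (required[s] - delay[s] for s in fanout[g]), default=period
        )

    return {g: required[g] - arrival[g] for g in order}


def rank_by_slack(slack):
    """Most critical (lowest slack) first; ties keep netlist order."""
    return sorted(slack, key=lambda g: slack[g])


slack = compute_slack(DELAY, EDGES, ORDER, CLOCK_PERIOD)
ranking = rank_by_slack(slack)
critical = [g for g in ranking if slack[g] < 0]
print(slack)
print("candidates for re-implementation:", critical)
```

In this toy case the gates u1, u2, and u4 share the worst slack, so they form the kind of small group that a cluster-selection step would consider first.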
Once clusters are identified, ZenTime creates new cells by remapping the clusters' function at the transistor level. These new cells, called ZenCells, are custom-crafted in the context of the design and on the fly.
ZenCells are faster than the clusters they replace for several reasons, Bhattacharya notes. "In some cases, completely new functions are created that cannot be found in any fixed-cell library," he explains. ZenCells also incorporate context-specific stack ordering and custom transistor sizing, and they benefit from the transistor topology exploration that the tool performs before settling on an optimal intra-cell topology.
A comparison of a sample path from a customer design before and after optimization shows how ZenCells can break timing bottlenecks in critical paths (Fig. 1). The comparison also shows how new functions are created in ZenCells. In the optimized critical path, the multiplexer and NAND gate were combined into ZenCell 2, with a resulting delay less than that of the multiplexer alone. Additionally, Bhattacharya notes, "No library we've seen to date has this function as a standard cell."
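To make the idea of a merged function concrete, the sketch below checks that a two-gate cascade and a single compound function produce the same output for every input combination. The actual connectivity in Fig. 1 is not reproduced here, so the wiring is an assumption: a 2:1 multiplexer whose output drives one input of a two-input NAND. The function names are hypothetical, and the check says nothing about delay, which depends on the transistor-level implementation.

```python
from itertools import product


def mux2(sel, a, b):
    """2:1 multiplexer: passes a when sel is 0, b when sel is 1."""
    return b if sel else a


def nand2(x, y):
    """Two-input NAND on 0/1 values."""
    return 1 - (x & y)


def cascade(sel, a, b, c):
    """Original structure (assumed): mux output feeds a NAND input."""
    return nand2(mux2(sel, a, b), c)


def merged_cell(sel, a, b, c):
    """Single compound function: NOT(((NOT sel AND a) OR (sel AND b)) AND c)."""
    return 1 - ((((1 - sel) & a) | (sel & b)) & c)


# Exhaustively compare the two forms over all 16 input combinations.
equivalent = all(
    cascade(*bits) == merged_cell(*bits) for bits in product((0, 1), repeat=4)
)
print("functionally equivalent:", equivalent)
```

A four-input function of this shape is the sort of thing a fixed library rarely offers as one cell, which is why building it on the fly can remove a logic level.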
By inserting ZenCells crafted for their specific context, ZenTime can have a dramatic impact on global timing. In another customer example, a block of 30 kgates saw an overall tightening of slack distribution while gaining nearly 70 MHz of performance, from an initial frequency of 562 MHz to a final frequency of 630 MHz (Fig. 2).
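The frequency figures can be restated as cycle time, which is the quantity a timing engine actually works with. The short calculation below uses only the two frequencies quoted above and assumes, as a simplification, that each frequency corresponds directly to the block's minimum clock period.

```python
def period_ns(freq_mhz):
    """Clock period in nanoseconds for a frequency given in MHz."""
    return 1000.0 / freq_mhz


f_before, f_after = 562.0, 630.0  # MHz, as quoted in the text

gain_mhz = f_after - f_before
gain_pct = 100.0 * gain_mhz / f_before
saved_ps = (period_ns(f_before) - period_ns(f_after)) * 1000.0

print(f"gain: {gain_mhz:.0f} MHz ({gain_pct:.1f}%)")
print(
    f"cycle time: {period_ns(f_before):.3f} ns -> {period_ns(f_after):.3f} ns "
    f"({saved_ps:.0f} ps shorter)"
)
```

Under that simplification, the improvement amounts to a gain of about 12% in frequency, or a little under 0.2 ns removed from the cycle.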
INSIDE A ZENCELL
ZenTime performs its hybrid optimization within the ZenCells themselves, gaining speed without sacrificing area or power. In yet another example, taken from a customer design created using Design Compiler, a particular four-cell gate cluster originally consisted of 22 transistors and three logic levels.
After optimization, the resultant single ZenCell contained just 13 transistors and two logic levels. Rise time along the critical path within the cell dropped from 0.26 ns to 0.12 ns. Fall time fell from 0.31 ns to 0.10 ns. The number of nets within the cell dropped from nine to seven.
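For readers who prefer the comparison in relative terms, the sketch below converts the before-and-after figures quoted above into percentage reductions. The numbers come straight from the text; the helper name and the dictionary layout are just illustrative.

```python
def reduction_pct(before, after):
    """Percentage drop going from before to after."""
    return 100.0 * (before - after) / before


# (before, after) pairs as quoted for the four-cell cluster and the ZenCell.
metrics = {
    "transistors": (22, 13),
    "logic levels": (3, 2),
    "rise time (ns)": (0.26, 0.12),
    "fall time (ns)": (0.31, 0.10),
    "internal nets": (9, 7),
}

reductions = {name: reduction_pct(b, a) for name, (b, a) in metrics.items()}

for name, pct in reductions.items():
    print(f"{name}: {pct:.1f}% lower")
```

By this measure the rise time is roughly halved and the fall time is cut by about two-thirds, while the transistor count drops by about 41%.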
Internally, ZenTime's optimization engines look at many possible implementations for a cell cluster, which can be on single or multiple logic levels. "It chooses a partitioning of the function based on what gives the best results after transistor-level implementation," says Bhattacharya. "It's not constrained by what exists in the user's standard-cell library but rather is free to partition the function in whatever way gives the best results. It chooses a new transistor topology and sizes the transistors correctly for that topology."
The process of ZenCell creation is context-driven. In some cases, ZenTime combines existing cells to create new functions, and in others, it modifies existing cells from a library. Unlike the fixed sizes found in library cells, ZenCell transistors are sized in context. The tool can use the proper mix of continuous and discrete transistor sizes to optimize timing. "Once you know the best topology, you can gain more performance by sizing the cells to fit their actual context exactly," says Bhattacharya. Often, wires are eliminated in the process, along with place-and-route variables that can cause problems.
ZenCells are created in a fashion that Zenasis terms "placement-aware." Physical estimators within the tool supply routing estimates, and physical placement drives all optimization: cell clustering, buffering, and transistor sizing. ZenTime's built-in static timing analyzer can be directed to automatically optimize all critical paths in a design, or it can perform targeted optimization at the user's discretion. "ZenCells don't look any different to downstream place-and-route tools," says Bhattacharya. "They look the same and have the same architecture. Downstream tools can't distinguish ZenCells from the original library cells."
SPANNING THE FLOW
ZenTime is applicable at various points in a cell-based ASIC design flow, but it is not intended to replace any of the traditional tools in such a flow. It can be applied following synthesis to optimize small blocks using wire estimates. In such usage, DesignWare library functions can see performance boosts of up to 20%.
It also can be used after placement on million-gate blocks by importing placement and post-route timing annotations. "It'll internally do route estimates and create ZenCells that are bigger than what you'd find in a standard-cell library, but done carefully so as not to create design issues," says Bhattacharya. When placement is imported, ZenTime uses blockage-aware routing estimates to ensure timing accuracy.
Finally, the tool can be used after clock-tree synthesis (CTS), where it won't disturb sequential elements or the clock tree itself. Yet it will help tighten up negative-slack paths introduced during CTS, improving overall performance.
Zenasis also offers an automated cell layout and characterization flow as a service called ZenCell Factory. To do so, it employs a best-in-class suite of tools from leading EDA vendors. Using ZenCell Factory can enable design teams without a cell-based flow in place to take advantage of ZenCells.
ZenTime is now in deployment at beta sites. It will be released for customer shipments in July with term-based licenses priced at $195,000 per year.