Rail-Signoff Analysis Ensures SoC Power Integrity

More than ever, power integrity is vital in the successful creation of today's system-on-a-chip (SoC) designs. That's because excessive rail voltage drop (IR drop) and ground bounce can create timing problems. Also, excessive current can cause electromigration and related thermal effects, leading to chip failures.

The first steps designers must take to prevent these problems are solid power-network planning and implementation. The next step is a good rail-signoff analysis flow to ensure that all power-related issues are resolved. To avoid timing problems and device failure, designers need to analyze an SoC's entire power network to ensure that it provides adequate power integrity.

Obtaining accurate rail analysis requires a good methodology and practical guidelines that expedite the flow. These guidelines include practices such as screening library exchange format (LEF) and design exchange format (DEF) files, creating white-box representations to speed analysis, obtaining toggle-rate information for power analysis, and using electromigration plots to identify IR-drop issues.

This article provides an overview of a suggested rail-signoff analysis flow for static IR-drop analysis. Dynamic IR-drop effects also should be considered in the overall SoC power closure, because sub-130-nm processes leave smaller margins.

Rail-Signoff Overview

IR drop occurs due to the resistive nature of the power routing from the power pads to the cell instances in the design. IR drop for a given instance depends on the current delivered by the power network, which should enable cells to operate at their targeted frequency in that area of the design. Considering these two factors, the IR drop across the power network will vary across the design. With the transition to smaller process geometries, IR drop is even more critical due to increased wire resistance and self-heating introduced by interconnect technology scaling.
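
As a rough illustration of why the drop differs from instance to instance, the sketch below treats one power rail as a chain of equal resistive segments fed from a single pad, with each cell drawing a fixed current. The function name, the segment resistance, and the tap currents are hypothetical values chosen for illustration; a real rail analysis solves the full extracted power network.

```python
def rail_voltages(vdd, seg_res_ohm, tap_currents_a):
    """Return the voltage seen at each tap along a rail fed from one end.

    Each segment carries the current of every tap at or beyond it,
    so the voltage drop accumulates with distance from the pad.
    """
    voltages = []
    v = vdd
    remaining = sum(tap_currents_a)
    for i_tap in tap_currents_a:
        v -= remaining * seg_res_ohm  # drop across the segment feeding this tap
        voltages.append(v)
        remaining -= i_tap            # current that continues downstream
    return voltages


# Hypothetical rail: 1.2-V supply, 0.05 ohm per segment, four cells at 10 mA each
taps = rail_voltages(1.2, 0.05, [0.010, 0.010, 0.010, 0.010])
worst_drop_mv = (1.2 - min(taps)) * 1000.0
print([round(v, 4) for v in taps], round(worst_drop_mv, 3))
```

In this toy case the cell farthest from the pad sees the largest drop, which is the behavior an IR-drop map makes visible across a full design.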

Another problem, electromigration, may occur for several reasons associated with the current flowing through the power rails. Excessive current density over a long period of time, as well as the high power requirements of high-frequency designs, can lead to electromigration unless care is taken in designing the power network. Electromigration analysis is very important since, if left undetected, electromigration can cause performance degradation and chip malfunction over time in products already shipped. Like IR drop, electromigration is a growing problem in finer process geometries.
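
A minimal sketch of the underlying check, assuming the limit is expressed as an average current density through the wire cross-section: the current, wire dimensions, and limit below are hypothetical, and real limits come from the foundry and vary by layer and temperature.

```python
def current_density_ma_per_um2(current_ma, width_um, thickness_um):
    """Average current density through a rectangular wire cross-section."""
    return current_ma / (width_um * thickness_um)


def em_violation(current_ma, width_um, thickness_um, limit_ma_per_um2):
    """True if the wire exceeds the assumed maximum current density."""
    density = current_density_ma_per_um2(current_ma, width_um, thickness_um)
    return density > limit_ma_per_um2


# Hypothetical strap: 8 mA average current, 0.35 um thick, limit of 10 mA/um^2
narrow = em_violation(8.0, 2.0, 0.35, 10.0)  # 2-um-wide strap
wide = em_violation(8.0, 4.0, 0.35, 10.0)    # same current, strap widened to 4 um
print(narrow, wide)
```

Widening the strap halves the current density, which is why many electromigration fixes amount to adding metal or vias.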

The suggested signoff flow analyzes IR drop and electromigration in large SoCs (Fig. 1). It was proven effective in a service project for five complex HDTV SoCs. The primary goal was to keep the flow implementation simple for ease of use across multiple designs and processes. A comprehensive pushbutton flow was not developed, as we considered it unrealistic given the many complexities of today's designs and libraries.

The flow requires only standard DEF and LEF for physical design and library input. Because the flow is standalone for rail signoff, it works with many third-party place-and-route tools. To allow for fast analysis spins late in the design cycle, the flow setup has two parts: reference library preparation and design library preparation (Fig. 2).

With this setup approach, you create all reference libraries only once in an SoC project. Then, you can execute fast iterations on the design-library creation and subsequent analysis each time you make design changes. Note that some library vendors provide Liberty files for only min and max corners. However, it's best to have min, nom, and max corners available so you can analyze more operating-condition (PVT) combinations.

Running The Rail Signoff Flow

To run the first part of the suggested flow, an average power analysis, you should set the variables to maximum V_DD, the worst-case (max) synthesis library, and worst-case (c_worst or max_c) signal-net parasitics. The flow chart in Figure 3 shows the suggested steps in this part of the flow.

When using the flow's clock-domain-based toggle method, you can set all signal nets in the target domain to toggle once every other clock, for an equivalent 50% toggle rate. Some tools, such as the rail analysis tool Astro-Rail, can define a 100-MHz clock at the top level and automatically propagate it through the design hierarchy.

This way, all signal nets in the design take on appropriate toggle rates based on statistical estimations. For example, a simple 20% toggle rate "per unit time" for this 100-MHz clock can be computed using the following equation (in which a toggle consists of two edges per 10-ns clock period):

TR = (0.20 × 2)/10.0 ns = 0.04
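
The same arithmetic can be written as a small helper for trying other clock frequencies and activity levels. This is only a sketch of the equation above; the function and parameter names are made up for illustration, and the result is expressed per nanosecond.

```python
def toggle_rate_per_ns(activity, clock_mhz, edges_per_toggle=2):
    """Toggle rate per nanosecond for a given activity and clock frequency.

    activity is the fraction of clock periods in which the net toggles;
    edges_per_toggle follows the article's convention of two edges per toggle.
    """
    period_ns = 1000.0 / clock_mhz
    return activity * edges_per_toggle / period_ns


tr_20 = toggle_rate_per_ns(0.20, 100.0)  # the 20% example at 100 MHz
tr_50 = toggle_rate_per_ns(0.50, 100.0)  # the 50% clock-domain default
print(tr_20, tr_50)
```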

The analysis tool estimates power consumption for hard macros based on the toggle rates of their input ports. But you can improve accuracy by gathering power information from the macro data sheets and annotating it on the hard macros.

After power/ground net extraction (Fig. 4) come the IR-drop and electromigration analyses that are the heart of rail analysis (Fig. 5). Experience has shown that two different sets of variables provide the best scenarios for these two analyses. While the IR-drop analysis uses worst-case timing conditions (worst parasitics, slow synthesis library, high temperature, and nominal V_DD), the electromigration analysis requires a hybrid condition (worst parasitics, fast synthesis library, high temperature, and highest V_DD). To determine the worst-case impact on device performance due to collapsing supply rails, it's useful to modify the recommended IR-drop operating conditions to use the worst-case, or lowest, V_DD.
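
One way to keep the two condition sets straight is to record them as data that a run script can select from. The sketch below is only a bookkeeping aid: the dictionary keys and value strings are illustrative labels, not settings recognized by any tool.

```python
# Analysis conditions as described in the text; names are illustrative labels only
IR_DROP_CORNER = {
    "parasitics": "worst",
    "library": "slow",
    "temperature": "high",
    "vdd": "nominal",
}

EM_CORNER = {
    "parasitics": "worst",
    "library": "fast",
    "temperature": "high",
    "vdd": "highest",
}


def collapsed_rail_corner():
    """IR-drop corner modified to use the lowest V_DD for worst-case rail collapse."""
    corner = dict(IR_DROP_CORNER)  # copy so the base corner stays unchanged
    corner["vdd"] = "lowest"
    return corner


print(collapsed_rail_corner())
```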

The rail analysis tool can generate plots that show analysis results. PrimeTime-SI can use the instance-based voltage data to determine the delay effects on critical timing paths. Note that your library vendor must support this timing analysis through k-factors for a non-linear delay model (NLDM) library, a voltage-drop-aware scalable polynomial delay model (SPDM) library, or a composite current source (CCS) library.

Power Integrity Signoff Best Practices

When executing a rail signoff analysis such as the one just described, you encounter a wide variety of real-world complications, sometimes not until late in the flow. The following suggestions will help smooth the way to a timely signoff.

Screen technology and library files: One of the first steps in the signoff flow is library preparation, and the necessary input files can differ from the ideal in many ways. For this reason, you can save lots of time by screening the vendor-supplied technology file (.tf) and library LEF used to create the standard-cell Milkyway libraries.

First, check the minimum via enclosure and spacing rules defined in the Milkyway technology (.tf) file. In many cases, you need to modify these rules so via arrays in the design DEF properly map to Milkyway contact arrays. Otherwise, opens will occur in the power/ground network. You can use a Perl script to modify the minimum grid parameters in the .tf file and ensure the proper minimum enclosure rules.

Second, check the LEF files and verify that each power pad has a "USE POWER ;" attribute defined in each power PIN section. Also, each ground pad should have a "USE GROUND ;" attribute defined in the PIN section. The Milkyway database requires these statements to define the pad cells as power or ground pads for Astro-Rail.
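
This check lends itself to a short screening script. The sketch below is one possible approach, not part of the flow described here: it assumes supply pins are named VDD and VSS, that each LEF statement starts on its own line, and that the macro names in the sample (PVDD1, PVSS1) are hypothetical pad cells.

```python
POWER_NAMES = {"VDD"}   # assumed power pin names; adjust per library
GROUND_NAMES = {"VSS"}  # assumed ground pin names; adjust per library


def find_missing_use(lef_text):
    """Return (macro, pin, expected_use) for supply pins lacking the USE attribute."""
    problems = []
    macro = pin = None
    uses = set()
    for raw in lef_text.splitlines():
        tokens = raw.split()
        if not tokens:
            continue
        if tokens[0] == "MACRO" and len(tokens) > 1:
            macro = tokens[1]
        elif tokens[0] == "PIN" and len(tokens) > 1 and macro:
            pin, uses = tokens[1], set()
        elif tokens[0] == "USE" and len(tokens) > 1 and pin:
            uses.add(tokens[1].rstrip(";"))
        elif tokens[0] == "END" and len(tokens) > 1:
            if pin and tokens[1] == pin:
                # Closing a PIN section: check the USE statements collected for it
                if pin in POWER_NAMES:
                    expected = "POWER"
                elif pin in GROUND_NAMES:
                    expected = "GROUND"
                else:
                    expected = None
                if expected and expected not in uses:
                    problems.append((macro, pin, expected))
                pin = None
            elif tokens[1] == macro:
                macro = None
    return problems


sample = """
MACRO PVDD1
  CLASS PAD ;
  PIN VDD
    DIRECTION INOUT ;
    USE POWER ;
  END VDD
END PVDD1
MACRO PVSS1
  CLASS PAD ;
  PIN VSS
    DIRECTION INOUT ;
  END VSS
END PVSS1
"""
missing = find_missing_use(sample)
print(missing)
```

Here the ground pad is reported because its PIN section has no USE statement, so it would need the attribute added before library preparation.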

Third, check the .tf file for information related to electromigration analysis. Search for the keyword "maxCurrDensity" to find the maximum current-density values for metals and vias. If the file doesn't contain these values, you can define them in the run configuration file (labeled as "userFlowVars.scm" in Figure 2) or manually inside Astro-Rail. The library vendor's cell library documentation or user's guide should contain these values.

Next, check the .tf file for temperature coefficient figures. The file should contain one "temperatureCoeff" keyword/value pair for each metal layer and each via (or contact code, or "cut") definition. If the file doesn't contain these values, add the appropriate values as specified in the foundry's design-rule manual and update the library and design technology databases.
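
Both keyword checks can be automated with a plain text search. The sketch below only counts keyword occurrences and makes no attempt to parse the file; the sample fragment is a simplified, hypothetical stand-in for a technology file, so compare the counts against the number of metal and via definitions you expect in the real one.

```python
KEYWORDS = ("maxCurrDensity", "temperatureCoeff")


def keyword_counts(tf_text, keywords=KEYWORDS):
    """Count how many times each keyword appears in the technology file text."""
    return {k: tf_text.count(k) for k in keywords}


def missing_keywords(tf_text, keywords=KEYWORDS):
    """Return the keywords that never appear in the file."""
    counts = keyword_counts(tf_text, keywords)
    return [k for k in keywords if counts[k] == 0]


# Simplified, hypothetical fragment used only to exercise the functions
sample_tf = """
Layer "M1" {
    maxCurrDensity = 1.0
}
"""
counts = keyword_counts(sample_tf)
missing = missing_keywords(sample_tf)
print(counts, missing)
```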

Finally, some standard-cell LEF files include information such as metal resistance and via resistance, but this information doesn't always match the presumed "golden" values in the .tf file. Discrepancies can make the analysis inaccurate. An easy way to keep the LEF information from overwriting the .tf information is to have the flow automatically replace the technology file with the properly modified one, just after the vendor's library LEF is read.

Run initial power analyses in "virtual routing" mode: Take advantage of Astro-Rail's ability to analyze a design lacking detailed routing information. The power analysis can go forward with only initial routing estimates.

Using the "virtual routing" mode for extraction, Astro-Rail completes a quick virtual route prior to running the power analysis. In design projects, this mode has produced power estimates and IR-drop results within 2% to 5% of the results obtained by analyzing an equivalent detail-routed design. So, the "virtual routing" mode is accurate enough to make early decisions on the power grid while reducing the overall run time.

Start with black-box models, sign off with white-box models: White-box models for hard macros contain only the minimum necessary power and ground network information. But their use can greatly increase chip-level analysis run times for large SoCs if the macro has significant metal slotting or metal fill tracks tied to one or more of the supply rails. In these cases, you can save time by running analyses with hard macros instantiated as black boxes until you execute the final signoff runs.

Check for electromigration violations first: Finding the cause of a large IR drop on the power rings can be difficult if you only use an IR-drop analysis, so electromigration plots may be a better place to start. You can fix electromigration violations near the power/ground pads and power rings relatively easily. These fixes tend to clean up the IR-drop map, making it easier to evaluate.

The electromigration analysis also can identify places where the power/ground mesh is insufficient or is missing a connection. A hot or warm spot results from wires that are too thin or spaced too far apart and via arrays that have too few vias. It's important to use the initial electromigration analyses to find general weaknesses in the power grid before focusing on detailed electromigration violations.

You can find and fix electromigration violations near the power/ground pads and power rings relatively easily. Such violations can seriously impact the IR drop. Fixing these violations first cleans up the IR-drop analysis, making it easier to read and evaluate. To find subtle grid weaknesses, it's useful to "stress" the design by lowering the electromigration violation thresholds or even elevating the analysis voltage above V_DD (max).
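
The effect of lowering the violation threshold can be pictured with a simple filter over per-segment results. This is an illustration of the idea only: the segment names, densities, limit, and derating factor are hypothetical, and in practice the threshold is set inside the rail analysis tool.

```python
def em_hot_spots(segments, limit, derate=1.0):
    """Return names of segments whose current density exceeds limit * derate."""
    threshold = limit * derate
    return [name for name, density in segments if density > threshold]


# Hypothetical per-segment current densities, in the same units as the limit
segments = [("ring_top", 9.5), ("strap_12", 7.2), ("via_array_3", 5.1)]

at_limit = em_hot_spots(segments, limit=10.0)              # nominal threshold
stressed = em_hot_spots(segments, limit=10.0, derate=0.7)  # threshold lowered by 30%
print(at_limit, stressed)
```

Nothing is flagged at the nominal limit, but the lowered threshold exposes the ring and strap segments as the weakest parts of this toy grid.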

Start analysis early: Rough statistical toggle-rate estimates supply a useful starting point for developing and evaluating the power grid. Being a bit pessimistic at the early stages helps reveal power-grid weaknesses that would create problems later in the flow. Because changes to the power grid become more difficult after detail-routing, you can save overall time by running early "what if" power analysis.

If you haven't yet added the clock tree to the netlist, be sure to account for its power consumption. The clock tree can consume a surprisingly large portion of a design's dynamic power, especially in designs using 90-nm and smaller technologies. This power consumption depends on the design architecture, the operating conditions, and the process technology.

Working with rough estimates for toggle rates and using either of the flow's toggle methods, Astro-Rail can scale the estimated power to match a specified power value. As an alternative, spreadsheet estimates usually offer acceptable accuracy because you can use information such as energy-per-transition and a mapped gate-level netlist to calculate the power.
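
As a sketch of what scaling to a specified power value means, the example below applies one uniform factor to every estimate so the total matches a target. Uniform scaling is an assumption made for illustration, as are the block names and numbers; the tool's own scaling may distribute power differently.

```python
def scale_power(instance_power_mw, target_total_mw):
    """Scale every estimate by one factor so the total equals the target."""
    total = sum(instance_power_mw.values())
    if total <= 0:
        raise ValueError("estimated total power must be positive")
    factor = target_total_mw / total
    return {name: p * factor for name, p in instance_power_mw.items()}


# Hypothetical block-level estimates (mW) scaled to a 250-mW budget
estimated = {"cpu": 120.0, "dsp": 60.0, "io": 20.0}
scaled = scale_power(estimated, 250.0)
print(scaled)
```

The relative distribution between blocks is preserved, so the shape of the resulting IR-drop map still reflects the original toggle-rate estimates.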

You can use the spreadsheet power estimates in the same way as the rough toggle-rate estimates, or you can have the rail analysis tool propagate the toggle rates through the logic cones. In fact, running the flow using clock-domain-based switching activity invokes the latter method of power estimation. The statistical propagation feature (using Power Compiler technology) analyzes the combinatorial logic functionality to determine the toggle rates for each net. It can achieve more realistic power estimates even if your toggle-rate estimates are somewhat pessimistic.
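
To give a feel for statistical propagation, the sketch below uses the textbook transition-density model for two-input gates with independent inputs. It is not the tool's algorithm, which the article does not describe; the function names and input values are illustrative. Each signal is a pair of (probability of being high, toggles per clock period).

```python
def and2(a, b):
    """(probability, density) at an AND2 output, assuming independent inputs."""
    (pa, da), (pb, db) = a, b
    # The output follows input a only while b is high, and vice versa
    return pa * pb, pb * da + pa * db


def or2(a, b):
    """(probability, density) at an OR2 output, assuming independent inputs."""
    (pa, da), (pb, db) = a, b
    # The output follows input a only while b is low, and vice versa
    return 1 - (1 - pa) * (1 - pb), (1 - pb) * da + (1 - pa) * db


x = (0.5, 0.2)              # high half the time, 0.2 toggles per period
stage1 = and2(x, x)         # first AND gate in the cone
stage2 = and2(stage1, x)    # a second AND gate fed by the first
either = or2(x, x)
print(stage1, stage2, either)
```

In this example the toggle density falls from 0.2 at the inputs to about 0.15 after two AND stages, which shows how propagation through logic can temper a pessimistic input estimate.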

Undoubtedly, the best suggestion for both power-integrity planning and analysis is to start as early as possible. By starting early and integrating these methodologies into the overall design flow, you avoid many problems that become difficult to correct later in the flow. The practices described here can give you a head start on implementing a rail analysis flow that's essential to ensure the performance of today's SoCs.