For the EDA industry, 2008 was dominated not so much by technology breakthroughs but rather by corporate intrigue. Cadence Design Systems’ attempt to take over Mentor Graphics had the electronics industry at large holding its collective breath for a few months. In the wake of the effort’s failure, one need only look at the subsequent ouster of most of Cadence’s senior management, including CEO Mike Fister, to know how profound the failure was.
SOME POSITIVE RESULTS
But, of course, not all mergers and acquisitions in EDA are as lacking in synergy as a Cadence-Mentor merger would have been. Consider Mentor’s acquisition of Sierra Design Automation in the summer of 2007. Mentor already had leading-edge design-for-manufacturability (DFM) technology. Sierra brought process-aware place-and-route technology. The result was 2008’s most impressive EDA product: Mentor’s Olympus-SoC place-and-route system.
Timing analysis and optimization chew up some 70% of place-and-route runtime. At process nodes below 45 nm, that 70% translates into very long runtimes. There are simply too many design corners in today’s large systems-on-a-chip (SoCs).
Olympus-SoC attacks the bottlenecks that bloat runtimes on two key fronts. For one, multicore architectures have supplanted clock-speed boosts as the way to increase CPU computational power, and Olympus-SoC is built to take full advantage of multicore processors. But beyond simply multithreading the timing-analysis and optimization engines, Mentor has achieved true parallelization of these engines.
Place-and-route engines have been coarsely parallelized before. But in Olympus-SoC, Mentor has achieved a feat that has long eluded the EDA industry. The massive interdependencies between timing analysis and optimization make parallelizing these processes extremely difficult.
Timing analysis is a sequential animal: the delay calculation for any given node depends heavily on the results for the nodes before and after it in the signal path. There is also a high risk of race conditions in the timing engine when threads operate on out-of-sync data.
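The sequential nature of delay calculation can be seen in a minimal sketch of arrival-time propagation over a timing graph. This is an illustrative model, not Mentor's implementation; the node names and delay values are made up.

```python
from collections import defaultdict, deque

def arrival_times(edges, delays, source):
    """Propagate worst-case arrival times in topological order.
    edges:  {node: [successor nodes]}
    delays: {node: gate delay at that node}
    """
    indeg = defaultdict(int)
    for u, succs in edges.items():
        for v in succs:
            indeg[v] += 1
    at = {source: delays.get(source, 0.0)}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in edges.get(u, []):
            # A node's arrival time depends on ALL of its predecessors,
            # which is why naively splitting this loop across threads
            # risks reading a neighbor's result before it is final.
            at[v] = max(at.get(v, 0.0), at[u] + delays.get(v, 0.0))
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    return at

# Tiny example: src feeds both a and b; a also feeds b.
edges = {"src": ["a", "b"], "a": ["b"], "b": []}
delays = {"src": 0.0, "a": 1.0, "b": 2.0}
print(arrival_times(edges, delays, "src"))  # b's time waits on both paths
```

Node b cannot be finalized until both src and a have been processed, which is exactly the kind of ordering constraint that makes parallelizing a timing engine hard.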
So how has Mentor tackled the problems of performing fine-grained analysis to eliminate the dependencies found in typical timing engines? How does it recombine the multiple processing threads without introducing errors? The secret lies in what Mentor terms task-oriented parallelism, which has several primary elements.
All of the steps in the Olympus-SoC flow chart are built for multi-corner/multi-mode (MCMM) optimization (see the figure). Additionally, all are multithreaded for multiple CPU cores.
The first major element of task-oriented parallelism is node-level data flow analysis, in which the software examines all nodes in the design and identifies the tasks associated with each of them. This breaks the tasks into heterogeneous chunks that have no interdependencies.
Once that is accomplished, parallelism is made easier because the synchronization issues are resolved. Further, all of the CPU cores on the host machine are utilized much more efficiently.
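The idea behind node-level data-flow analysis can be sketched as dependency leveling: group tasks so that everything within a level is independent and can run in parallel without synchronization. The task identifiers and dependency model here are illustrative assumptions, not Mentor's data structures.

```python
def independent_chunks(tasks, deps):
    """Group task ids into dependency levels. Every task in a level
    depends only on tasks in earlier levels, so the tasks within one
    level can be dispatched to worker cores with no locking.
    tasks: list of task ids
    deps:  {task id: set of task ids it depends on}
    """
    remaining = set(tasks)
    done = set()
    levels = []
    while remaining:
        level = {t for t in remaining if deps.get(t, set()) <= done}
        if not level:
            raise ValueError("cyclic dependencies")
        levels.append(sorted(level))
        done |= level
        remaining -= level
    return levels

# t1 and t2 are independent; t3 needs both; t4 needs t3.
print(independent_chunks(
    ["t1", "t2", "t3", "t4"],
    {"t3": {"t1", "t2"}, "t4": {"t3"}},
))  # → [['t1', 't2'], ['t3'], ['t4']]
```

Each inner list is a chunk with no interdependencies, so a scheduler can hand its members to separate cores and only barrier between levels.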
A second major element is adaptive task decomposition, during which the tool dynamically determines the best strategy for each step in optimization. This step is undertaken whether it’s an incremental update or not, and it can be ultra-fine grained.
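One plausible way to picture adaptive task decomposition is a granularity heuristic that varies by step: a hypothetical sketch, not Mentor's actual strategy, with thresholds and chunk math invented for illustration.

```python
def plan_decomposition(affected_nodes, total_nodes, workers):
    """Illustrative heuristic: pick a task granularity per step.
    An incremental update touching few nodes gets ultra-fine-grained
    tasks so every worker stays busy; a full pass uses coarser chunks
    to cut scheduling overhead. The 5% threshold is an assumption."""
    incremental = affected_nodes < 0.05 * total_nodes
    if incremental:
        chunk_size = max(1, affected_nodes // (workers * 8))
    else:
        chunk_size = max(1, affected_nodes // workers)
    return {"incremental": incremental, "chunk_size": chunk_size}

print(plan_decomposition(100, 1_000_000, 8))    # tiny incremental update
print(plan_decomposition(1_000_000, 1_000_000, 8))  # full optimization pass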
So what does all of this mean for designers in terms of runtime improvements? For analysis, Mentor’s beta users have seen speedups of as much as seven times on an eight-core CPU. On the optimization side, design closure times are up to four times faster.
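A quick back-of-envelope check puts the quoted figure in context. Applying Amdahl's law (our framing; the article supplies only the 7x-on-8-cores number) shows how thoroughly the analysis must be parallelized to reach that speedup.

```python
def parallel_fraction(speedup, cores):
    """Invert Amdahl's law, S = 1 / ((1 - p) + p / N),
    to find the parallel fraction p implied by a measured speedup."""
    return (1 - 1 / speedup) / (1 - 1 / cores)

p = parallel_fraction(7, 8)
print(f"{p:.1%}")  # prints "98.0%" -- nearly all the work must run in parallel
```

A 7x speedup on eight cores implies roughly 98% of the workload is parallelized, which underlines why eliminating the timing engine's serial dependencies was the hard part.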
To take advantage of the task-oriented parallelism capabilities that have been added to Olympus-SoC, users must already have the basic package and purchase a $180,000 add-on. But the time saved is likely worth the price.