Designers Seek Silicon Architecture That Works Smarter, Not Harder

July 8, 2002

Today's most challenging system design issues place conventional IC architectures in a dubious position. Since the microprocessor was invented, Moore's Law has been the bellwether for the semiconductor industry. In its popular form, it states that transistor density will double roughly every 18 months, a pace that has also brought higher clock speeds. This dictum holds true to the present and is predicted to continue.

However, with more transistors, system engineers are creating additional instructions, with the penalty of an ever-increasing inefficiency. The design engineering community is ready for a silicon architecture that works smarter, not harder, removing design obstacles and enabling higher performance, lower power consumption, lower risk, increased functionality, and faster time-to-market.

Because earlier technology prevented engineers from building large numbers of silicon gates on one chip, they designed systems around a computational center--the arithmetic logic unit (ALU) or multiply-accumulate (MAC) unit--that required only a few gates. Software instructions then let these systems perform many different tasks.

Today's architects have orders of magnitude more gates available than earlier designers. But this has led to increasing instruction set complexity without improving ALUs, MACs, or actual computational efficiency. Hence, overall architectural efficiency drops as more gates are devoted to overhead. As we reach 100 million transistors on a chip, fewer than 1% of those transistors will accomplish useful tasks.

New thinking about how to implement silicon is emerging. It follows Campbell's Corollary to Moore's Law: Efficiently using transistors dramatically increases performance and greatly reduces the number of transistors needed.

Campbell's Corollary is the cornerstone of the Adaptive Computing Machine (ACM), a new class of IC that offers high performance yet lower complexity, cost, power consumption, and risk compared to conventional, rigid IC technologies. The ACM is unique because it brings into existence the exact hardware implementation necessary for the software, for however long is required.

Once that specific hardware engine has completed its assignment, another hardware engine is brought into existence to handle the next task. This reduces instruction overhead and uses the transistors more efficiently. Because the ACM's architecture conforms to the task at hand, it increases performance by 10 to 100 times over conventional ICs with only 10% of the power consumption.

Computation power efficiency (CPE) is the number of gates actively working to solve a given problem divided by the total number of clock cycles it takes. The CPE metric allows the ACM's efficiency to be compared with conventional microprocessor, DSP, and ASIC approaches. The CPE of a typical DSP or microprocessor is about 8% to 15%, with the remaining gates acting as overhead that burns power. The CPE for an ASIC is higher, averaging about 20% to 30%, but at the expense of flexibility. Any change not anticipated during the design cycle results in a time-consuming and costly re-spin of the ASIC.

On the other hand, the CPE for an ACM is around 65% to 70% because algorithms are no longer changed to fit a predefined hardware architecture. Instead, the optimum hardware for an algorithm comes into existence, accomplishes its task, then goes away. Because CPE is independent of silicon processing technology, the ACM keeps this efficiency advantage from one process generation to the next.
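To put the quoted CPE ranges side by side, the short Python sketch below derives the share of gates left as overhead for each approach. It assumes that CPE can be read as the percentage of gates doing useful work, so that overhead is simply 100 minus CPE; the article implies this reading but does not state it as a formula, and the function name `overhead_range` is only illustrative.

```python
# CPE ranges quoted in the article, as (low, high) percentages.
CPE_RANGES = {
    "DSP or microprocessor": (8, 15),
    "ASIC": (20, 30),
    "ACM": (65, 70),
}


def overhead_range(cpe_low, cpe_high):
    """Return the (low, high) overhead percentage.

    Assumption: overhead = 100 - CPE, i.e. every gate not counted
    as useful by the CPE figure is treated as overhead.
    """
    return (100 - cpe_high, 100 - cpe_low)


if __name__ == "__main__":
    for name, (low, high) in CPE_RANGES.items():
        oh_low, oh_high = overhead_range(low, high)
        print(f"{name}: CPE {low}-{high}%, overhead {oh_low}-{oh_high}%")
```

Under that assumption, a DSP or microprocessor leaves roughly 85% to 92% of its gates as overhead, against roughly 30% to 35% for an ACM.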

For example, the Qualcomm Code Excited Linear Predictive (QCELP) speech compression algorithm has eight major "inner code loops" that account for the majority of the power consumption. The QCELP algorithm running on a DSP core uses around 84 mW in 0.25-µm CMOS and approximately 4 mm² for the embedded DSP and memories.

Moving the eight inner code loops into ASIC cores cuts the total to only 19 mW: 3 mW for the ASIC cores and 16 mW for the DSP. Here, the ASIC cores run the eight inner code loops, and the DSP core runs the remaining QCELP code. But this design takes 20 mm² of silicon.

Replacing the ASIC cores with an ACM preserves that power saving while using only 5 mm² of silicon area. The eight power-consuming QCELP routines are ported into the ACM engine, which also expends just 3 mW. The system engineer thus moves 68 mW of work out of the DSP and replaces it with 3 mW, while using only 5 mm² of silicon instead of 20.
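The arithmetic behind the QCELP figures can be checked with a few lines of Python. This is only a sketch built from the numbers quoted above. It assumes that the DSP still draws 16 mW for the remaining QCELP code when an ACM, rather than ASIC cores, runs the inner loops. The constant and function names are illustrative.

```python
# Figures quoted in the article (0.25-µm CMOS).
DSP_ONLY_MW = 84       # entire QCELP algorithm running on the DSP core
DSP_REMAINDER_MW = 16  # DSP running everything except the eight inner loops
ACCEL_MW = 3           # eight inner loops on ASIC cores or on an ACM

ASIC_DESIGN_AREA_MM2 = 20
ACM_DESIGN_AREA_MM2 = 5


def inner_loop_power_on_dsp():
    """Power the eight inner loops consume when they run on the DSP."""
    return DSP_ONLY_MW - DSP_REMAINDER_MW


def offloaded_total():
    """Total power once the inner loops move to ASIC cores or an ACM."""
    return DSP_REMAINDER_MW + ACCEL_MW


def percent_saved(before, after):
    """Reduction from `before` to `after`, as a percentage of `before`."""
    return 100 * (before - after) / before


if __name__ == "__main__":
    print("Inner loops on DSP:", inner_loop_power_on_dsp(), "mW")
    print("Total after offload:", offloaded_total(), "mW")
    saved = percent_saved(DSP_ONLY_MW, offloaded_total())
    print(f"Power saved versus DSP only: {saved:.1f}%")
    ratio = ASIC_DESIGN_AREA_MM2 / ACM_DESIGN_AREA_MM2
    print(f"ASIC-to-ACM silicon area ratio: {ratio:.0f}x")
```

The 68 mW moved out of the DSP is the difference between the 84 mW DSP-only figure and the 16 mW the DSP still needs. Adding the 3 mW engine gives the 19 mW total quoted for the offloaded design.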

Adaptive computing technology enables designers to work smarter, not harder. Designers can rapidly respond to changing requirements during product development, while achieving faster time-to-market. If a standard is altered, a competitor releases a new feature, or a bug fix is needed, the inherent adaptability of the ACM enables immediate updates. Additional gains are achieved with less complex design tools, smaller design teams, and reduced overall risk.