Designers of mobile phone handsets are under pressure to deliver high-performance multimedia solutions that give the user access to new and emerging capabilities. These include video conferencing, movie recording and playback, high-end gaming, high-resolution digital camera functionality and mobile television.
The processing required to handle these functions is beyond the resources provided by the baseband processor responsible for the basic phone functionality. As a result, handset designers are turning to a design model in which a 'standard' baseband processor is paired with separate devices that address the various multimedia processing requirements. However, because of limited space, power consumption constraints, and cost pressures, the ultimate design goal is for all multimedia processing to be handled by a single, compact, low-power SoC device. There is also pressure to reduce time-to-market, so the time available to develop this integrated 'multimedia engine' is minimal.
Semiconductor manufacturers are creating stand-alone silicon platforms that deliver multimedia processing functionality for digital camera, audio, video, and high-resolution graphics applications. Toshiba's MobileTurbo series of codec LSIs, for example, provides the MPEG-4 encoding and decoding functionality needed to deliver video-game-grade 3D graphics to mobile phones.
This family provides scalable solutions to meet different multimedia criteria in terms of raw processing power, support for different screen resolutions, and other features such as still-image and audio support. Recent additions to this family of LSIs, for example, support fast rendering of graphics at speeds of 125 million pixels per second. Features include handling of advanced shading, texture mapping, special effects, and other visualisation functions.
As the example LSI in Figure 1 shows, these multimedia devices are based on a multiple-processor architecture. Each processor has a dedicated multimedia function such as MPEG-4 encoding and decoding, 3D graphics processing, JPEG encoding and decoding, or MP3/AAC decoding. The device in the diagram has five onboard processors, each of which executes firmware downloaded from local memory. This allows various features to be added based on an application's specific requirements.
An LCD controller provides an interface to a QVGA display. External memory is replaced by embedded DRAM, which can transfer data to the graphic controller at a rate of 2Gbps, facilitating game-console quality video output with low power consumption. When operating at a clock speed of 125MHz, power consumption is as low as 170mW.
Next-generation SoCs need even higher levels of processing power, especially when it comes to implementing solutions that support mobile television. The snag is that increased processing power typically means higher power consumption. This presents a challenge for the designer, because the handset's battery must last long enough on a single charge to provide a reasonable viewing time. New techniques are needed to raise performance levels while driving down device power consumption during operation and standby.
For this reason, Toshiba has developed a new power control system that reduces IC power consumption by dynamically controlling the supply voltage and operating frequency of SoC devices. Referred to as the partial frequency/voltage regulator, the technology works by optimising the operating frequency and the power supply voltage at the module level rather than at the IC level.
In effect, it allows the frequency and voltage supply for each individual circuit block within the SoC device to be controlled independently. This means that each of these individual blocks can be turned on and off, as needed, by the application. Tests conducted by Toshiba show that, using the partial frequency/voltage regulator technology, power savings of up to 40% are possible.
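The benefit of per-block control follows from the standard first-order model of CMOS dynamic power, in which power scales with switched capacitance, the square of the supply voltage, and the clock frequency. The Python sketch below applies that model to two blocks. The block names, capacitances, voltages and frequencies are illustrative assumptions, not Toshiba figures, and the result is not meant to reproduce the 40% saving quoted above.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """One independently regulated circuit block (all values hypothetical)."""
    name: str
    capacitance_nf: float  # effective switched capacitance, in nanofarads
    voltage_v: float       # supply voltage, in volts
    freq_mhz: float        # clock frequency, in megahertz


def dynamic_power_mw(block: Block) -> float:
    """First-order dynamic power P = C * V^2 * f, in milliwatts.

    nF * V^2 * MHz = 1e-9 * 1e6 W = 1e-3 W, so the product is already in mW.
    """
    return block.capacitance_nf * block.voltage_v ** 2 * block.freq_mhz


def total_power_mw(blocks) -> float:
    """Sum the dynamic power of every block on the chip."""
    return sum(dynamic_power_mw(b) for b in blocks)


# Chip-level control: every block runs at the worst-case voltage and frequency.
chip_level = [Block("video", 0.4, 1.2, 180.0), Block("audio", 0.1, 1.2, 180.0)]

# Block-level control: the audio block alone is slowed and its supply lowered.
block_level = [Block("video", 0.4, 1.2, 180.0), Block("audio", 0.1, 1.0, 90.0)]

saving = 1.0 - total_power_mw(block_level) / total_power_mw(chip_level)
print(f"chip-level:  {total_power_mw(chip_level):.1f} mW")
print(f"block-level: {total_power_mw(block_level):.1f} mW")
print(f"saving:      {saving:.0%}")
```

Because voltage enters the model as a square, lowering the supply of a lightly loaded block saves more than lowering its clock alone.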
Partial frequency/voltage regulation addresses only one part of the problem, though. As power conservation becomes more important, the impact of standby leakage current becomes more significant. It is not uncommon for 20% or more of a chip's power budget to be consumed by leakage alone, with transistors in each new CMOS process generation being leakier than those in previous generations. This leakage share must be reduced dramatically if designers are to deliver viable mobile multimedia solutions.
Multi-threshold CMOS (MTCMOS) technology is one way to deliver the necessary reductions in standby leakage current. MTCMOS reduces leakage current during idle modes by placing a high-threshold sleep transistor in series with the low-threshold circuit transistors, as shown in Figure 2a. In active mode, the high-Vth transistor is turned on; in sleep mode it is turned off, leaving only a small sub-threshold leakage current.
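The leakage benefit of the high-threshold sleep transistor can be estimated with the standard first-order model of sub-threshold conduction, in which the off-state current falls by one decade for every S millivolts of additional threshold voltage. The symbol S (the sub-threshold swing) and the example values that follow are assumptions chosen for illustration, not figures from Toshiba.

```latex
I_{\mathrm{sub}} \propto 10^{-V_{th}/S},
\qquad
\frac{I_{\mathrm{sub}}(V_{th,\mathrm{high}})}{I_{\mathrm{sub}}(V_{th,\mathrm{low}})}
  = 10^{-\left(V_{th,\mathrm{high}} - V_{th,\mathrm{low}}\right)/S}
```

With an assumed swing of S = 100 mV/decade and a threshold difference of 200 mV, the ratio is 10^-2, roughly a hundredfold reduction in sub-threshold leakage through the transistor that is switched off.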
The latest evolution of this technology, selective MTCMOS, allows MT cells to be implemented selectively during design. Specifically, MT cells are used only in the paths identified as critical, while high-Vth cells are used in non-critical paths (Figure 2b). This selective MTCMOS technology eliminates the virtual ground line of conventional MTCMOS, which means that circuit speed is not affected by the discharge pattern of other gates. In addition, mode switching between active and standby can be performed in a single clock cycle, and power consumption during mode switching is lower than with conventional MTCMOS.
Toshiba has developed a prototype LSI device capable of providing the processing functionality and low power consumption needed to make mobile television a reality. Known as the T5V, it is fabricated using the company's 90nm 6M CMOS technology. This single-chip H.264/MPEG-4 audiovisual LSI targets mobile applications, including terrestrial digital broadcasting systems such as ISDB-T and DVB-H, and combines the company's Media embedded Processor (MeP) architecture with the partial frequency/voltage regulation and selective MTCMOS power-saving techniques outlined above. (The MeP architecture is based on a hierarchical bus structure that allows the construction of a heterogeneous multiprocessor system in which multiple MeP modules and shared memory are connected to a global memory bus.)
The ability to decode data rapidly, based on H.264 and MPEG-4 standards, is the key to delivering television via mobile handsets. Compared to previous video standards, H.264 requires high programmability, making it difficult to implement in dedicated hardware accelerators. In the new LSI, close cooperation between the processor core and these accelerators has reduced operating time and power consumption without losing the programmability necessary for H.264.
Toshiba's LSI can decode a CIF (352x288) H.264 baseline profile stream at level 2 and encode a VGA (640x480) MPEG-4 SP@L4a video stream at 30fps, while simultaneously encoding and decoding audio and speech streams and multiplexing and demultiplexing them, all at 180MHz. The chip contains four major modules: a video front end, a video back end, an audio/speech module, and a multiplexer/demultiplexer. Each of the modules consists of an optimally configured 32-bit MeP core and hardware accelerators.
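To put these figures in context, the Python sketch below estimates how many clock cycles are available per macroblock. It relies on the 16x16-pixel macroblock used by both H.264 and MPEG-4, and on two assumptions that the figures above do not confirm: that the 180MHz clock applies to the video modules, and that the CIF stream also runs at 30 frames per second.

```python
import math

MB_SIZE = 16  # H.264 and MPEG-4 both code video in 16x16-pixel macroblocks


def macroblocks_per_frame(width: int, height: int) -> int:
    """Number of 16x16 macroblocks needed to cover one frame."""
    return math.ceil(width / MB_SIZE) * math.ceil(height / MB_SIZE)


def cycles_per_macroblock(clock_hz: float, width: int, height: int, fps: float) -> float:
    """Average clock cycles available per macroblock at a given frame rate."""
    return clock_hz / (macroblocks_per_frame(width, height) * fps)


CLOCK_HZ = 180e6  # assumed to be the clock of the video modules

for label, (w, h) in {"CIF decode": (352, 288), "VGA encode": (640, 480)}.items():
    mbs = macroblocks_per_frame(w, h)
    budget = cycles_per_macroblock(CLOCK_HZ, w, h, 30)  # 30fps assumed for both
    print(f"{label}: {mbs} macroblocks/frame, about {budget:.0f} cycles each")
```

The result is only an average budget per module. Because the MeP cores and hardware accelerators work concurrently, it should not be read as an instruction count.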
As for voltage and frequency, the new LSI slows down the audio module independently of the other circuit blocks on the chip. With this software-controlled voltage and frequency scaling system for the audio module, operating resources can be closely matched to the specific requirements of the live applications. An on-chip voltage regulator and a voltage and frequency selector provide the control in response to requests from the code running on the IC. As a result, the power consumption of the audio element that decodes MPEG-4 AAC drops by 40%.
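A software-controlled scheme of this kind needs a policy for choosing the operating point. The following is a minimal Python sketch of one such policy, which picks the slowest setting that still covers the workload. The table of frequency/voltage pairs and the workload figures are hypothetical and are not taken from Toshiba's design.

```python
# Hypothetical operating points: (frequency in MHz, supply voltage in V).
OPERATING_POINTS = [(45, 0.9), (90, 1.0), (180, 1.2)]


def select_operating_point(required_mcycles_per_s, points=OPERATING_POINTS):
    """Return the slowest (frequency, voltage) pair that meets the demand.

    required_mcycles_per_s is the workload in millions of cycles per second,
    so it can be compared directly with a clock frequency in MHz.
    """
    for freq_mhz, voltage_v in sorted(points):
        if freq_mhz >= required_mcycles_per_s:
            return freq_mhz, voltage_v
    raise ValueError("workload exceeds the fastest operating point")


# Example: firmware requests a setting for a light and a heavy audio workload.
for demand in (30, 60, 150):
    freq, volt = select_operating_point(demand)
    print(f"demand {demand} Mcycles/s -> {freq} MHz at {volt} V")
```

Choosing the lowest adequate frequency matters because it also permits the lowest supply voltage, which is where most of the dynamic power saving comes from.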
SOC TURNAROUND TIMES
Companies are developing techniques to speed up SoC turnaround to accommodate shortened time-to-market cycles. The most recent development in this area is a platform known as Universal Array.
All cell-based ICs must undergo a rigorous verification and testing process prior to production. Conventionally, the diffusion wafer (DW) that integrates the basic IC components is fabricated only once the design process reaches tape out (the point where EDA tools can be applied to production of engineering samples of ICs). The wafer is then personalised to produce the personalised wafer (PW), which completes the manufacturing process.
The new Universal Array, which is initially targeted at 130nm and 90nm CMOS process technologies, shortens this process by allowing the DW to be fabricated at the same time as the implementation and timing verification processes. Figure 3 shows how Universal Array can reduce turnaround time (TAT) when compared with conventional SoC developments.
The key to delivering functionality such as mobile television and advanced gaming for next-generation handsets lies in the development of SoCs that offload multimedia processing from the main processor. Technologies such as MeP allow these SoCs to deliver the requisite performance levels, while partial voltage/frequency regulation and selective MTCMOS techniques minimise power consumption during operation and standby. Finally, new processes such as Universal Array allow SoCs to be developed within the tight timescales demanded by handset design.