Pushing performance up to 5 GHz to meet the requirements of the 802.11a wireless local-area network (LAN) specifications, designers from Atheros Communications, Sunnyvale, Calif., and Stanford University, Stanford, Calif., crafted a single-chip CMOS transceiver. Paper 5.4 describes the RF portion of the chip set, while Paper 7.2 presents the combo baseband and media-access controller chip.
In the receive chain, the 1-GHz IF provides a 2-GHz frequency separation between the incoming RF signal and its image. This allows narrow-band on-chip tuning elements to achieve -23-dBc suppression of the 3-GHz image and eliminates the need for an explicit off-chip IF image-reject filter.
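The frequency plan behind that claim is simple superheterodyne arithmetic. The sketch below assumes low-side LO injection (the article states only the 1-GHz IF and the 3-GHz image; the LO assignment is inferred):

```python
# Image-frequency arithmetic for a superheterodyne receiver.
# Assumes low-side LO injection (LO = RF - IF); the article gives
# only the 1-GHz IF and the resulting 3-GHz image.

rf = 5.0e9   # desired 802.11a carrier, Hz
if_ = 1.0e9  # intermediate frequency, Hz

lo = rf - if_            # low-side local oscillator
image = lo - if_         # image sits on the far side of the LO
separation = rf - image  # spacing between signal and image

print(lo / 1e9)          # 4.0 (GHz)
print(image / 1e9)       # 3.0 (GHz)
print(separation / 1e9)  # 2.0 (GHz), i.e. twice the IF
```

The 2-GHz signal-to-image spacing is what lets relatively broad on-chip tuned loads do the image filtering that would otherwise need an off-chip filter.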
In the transmit chain, the most challenging design aspect was to get sufficient gain and linear power output for the orthogonal frequency division multiplexing signal, which contains 52 subcarriers. The answer is a three-stage, class-A fully differential power amplifier and on-chip inductors that form parallel resonances with the gate capacitances of the output transistors.
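The resonant-load trick can be sized with the standard LC resonance formula. In this sketch the gate capacitance is a placeholder value, not a figure from the paper:

```python
import math

# Inductance needed to parallel-resonate an assumed gate capacitance
# in the 802.11a band. The 0.5-pF capacitance is an illustrative
# assumption; the paper does not give device values.

f0 = 5.25e9       # target resonance, Hz (mid U-NII band)
c_gate = 0.5e-12  # assumed output-transistor gate capacitance, F

# f0 = 1 / (2*pi*sqrt(L*C))  =>  L = 1 / ((2*pi*f0)^2 * C)
l = 1 / ((2 * math.pi * f0) ** 2 * c_gate)

print(l * 1e9)    # ~1.8 (nH), a value realizable as an on-chip spiral
```

Inductances in the low-nanohenry range are practical as on-chip spirals, which is what makes tuning out the gate capacitance on-chip feasible at 5 GHz.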
The baseband controller described in Paper 7.2 delivers up to 54 Mbits/s in 20-MHz channels, as required by the 802.11a standard. Also included is a proprietary mode that permits data rates of 108 Mbits/s in 40-MHz channels.
Although 54 Mbits/s pushes the limits of commercial CMOS, researchers at NEC Corp., Otsu, Japan, are looking further into the future of wireless connectivity. In Paper 17.7, they show a set of monolithic microwave ICs (MMICs) that use a 60-GHz carrier and transmit and receive data at 1.125 Gbits/s (Gigabit Ethernet). The MMICs are based on 0.15-µm AlGaAs/InGaAs heterojunction FETs.
RF front-end circuits are another hot topic in Session 24 (Papers 24.1 and 24.2). The Katholieke Universiteit Leuven, Heverlee, Belgium, and Valence Semiconductor Inc., Irvine, Calif., both detail designs of GPS receiver chips. Valence's chip occupies just 8 mm2, about half the area of the university's design, yet it performs quadrature filtering that rejects out-of-band signals by more than 60 dB. This lets designers use a single differential path after the filter, saving power and chip area.
As communication frequencies go up, so do the speeds of CPUs and data interfaces. Papers detailing advances in high-performance computing, concentrated in Sessions 20 and 25, show improvements in architecture that yield gigahertz and faster clock speeds and high levels of integration.
For example, designers from Intel Corp., Shrewsbury, Mass., show how they improved the Alpha architecture to triple the performance over the previous-generation CPU in Paper 20.1. They implemented the instruction set in a 0.125-µm partially depleted SOI process and increased cache size to 3 Mbytes, almost double that of previous Alpha CPUs. To keep many instructions in flight at any point in time, the CPU employs an eight-way superscalar, four-way multithreaded architecture.
In the same session, Intel Corp., Santa Clara, Calif., and Hewlett-Packard Co., Fort Collins, Colo., unwrap the second-generation Itanium architecture (Paper 20.6). Operating at 1 GHz, the processor (code-named McKinley) includes six one-cycle integer units, six two-cycle multimedia units, a 20-ported 128-word by 65-bit register file, a pair of 16-kbyte L1 caches, and 256 kbytes of quad-ported L2 cache. Dual 82-bit floating-point multiplier-accumulators and a 14-port 128-word by 82-bit register file are included too. A 3-Mbyte L3 cache reduces off-chip memory accesses.
In Paper 20.3, designers at Sun Microsystems, Palo Alto, Calif., show off their latest enhancements to the UltraSparc III architecture, which now tops 1 GHz. Improvements include larger integrated caches (a 64-kbyte L1 data cache, a 32-kbyte L1 instruction cache, and a 1-Mbyte L2 cache) plus a 16-byte-wide double-data-rate SDRAM interface. A new 200-MHz cache-coherent system bus interface enables symmetrical multiprocessing.
One of the more novel papers, 20.2, from the Korea Advanced Institute of Science and Technology, Taejon, details a single-chip programmable platform based on a multithreaded RISC processor and configurable logic clusters based on FPGA-like structures. The custom 32-bit processor handles up to 15 threads, each supported by 16 general-purpose registers, a stack pointer, a link register, and a program counter.
High-performance CPUs require faster essential building blocks, such as ALUs, register files, and other subsections. Paper 25.1, one of five from Intel in this session, describes a 5-GHz dynamic ALU and instruction scheduler. Other Intel papers show off a 5-GHz, 32-bit instruction execution core, a high-bandwidth 256-kbyte L2 cache for the Itanium, and a six-issue integer datapath and register file, also for the Itanium.
Of the two remaining papers in Session 20, the one by Broadcom Corp., Santa Clara, Calif., highlights a dual-issue floating-point coprocessor with fast 3D functions. The other, prepared by Fujitsu Laboratories of Sunnyvale, Calif., describes a 34-word by 64-bit self-timed register file with 10 read and six write ports and an access time of just 1.4 ns.
All of these high-performance CPUs depend on fast memory chips and building blocks, such as cache arrays and register files. Just two sessions are devoted to memory developments: Session 9 examines DRAM and ferroelectric storage, while Session 6 focuses on nonvolatile storage.
Session 9 discusses several future-looking memory structures, including a one-transistor gain cell fabricated in SOI by Toshiba Corp., Tokyo. The cell is smaller, less complex, and more scalable than existing DRAM cells, yet it requires no new materials or device structures. Dubbed the floating-body cell, it consists of a single MOS transistor whose body is left floating; one candidate implementation forms the transistors in partially depleted SOI layers.
Of the three ferroelectric memory papers, Paper 9.5 by Sony Corp., Tokyo, describes a novel quasi-matrix memory structure that limits disturb issues and offers better packing density. Taking the crown for the densest ferroelectric memory, Samsung Electronics Co. Ltd., Kyunggi-Do, Korea, details a 32-Mbit nonvolatile RAM that delivers SRAM-like speed.
In the nonvolatile memory session, a novel 512-Mbit flash memory based on a two-bit/cell storage element is unveiled by a trio of companies in Paper 6.1. These include Saifun Semiconductors Ltd. and Ingentix Ltd., both in Netanya, Israel, and Infineon Technologies, Düsseldorf, Germany. The NROM cell employs localized charge trapping and consists of an n-channel MOSFET that uses an oxide-nitride-oxide (ONO) stack for the insulator under the gate electrode.
By replacing the standard oxide-only layer with the ONO stack, the cell can be programmed using channel hot-electron injection. Within a cell, memory data is stored in two separate narrow (less than 200 Å) charge distributions in the nitride above the junction edges. Thus, each cell contains two physically separated bits, allowing double the density without the complex multilevel bit storage of previous flash approaches.
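The density payoff of two physically separate storage sites can be illustrated with a toy model. This is purely an addressing sketch, not the device physics, and the class and method names are invented for illustration:

```python
# Toy model of a two-bit/cell array: each cell stores two independent
# bits (one per junction edge), so N cells hold 2*N bits without any
# multilevel sensing. Names are illustrative, not from the paper.

class NROMCell:
    def __init__(self):
        self.bits = [0, 0]  # index 0: source-side trap, 1: drain-side trap

    def program(self, side, value):
        # In the real device, the side is selected by swapping the
        # source/drain bias; here it is simply an index.
        self.bits[side] = value

    def read(self, side):
        return self.bits[side]

cells = [NROMCell() for _ in range(4)]
cells[0].program(0, 1)   # write '1' to one edge of cell 0
cells[0].program(1, 0)   # the other edge holds an independent '0'

print(cells[0].read(0), cells[0].read(1))  # 1 0
print(len(cells) * 2)                      # 8 bits from 4 physical cells
```

Because each stored bit is a conventional binary charge packet rather than one of four analog levels, sensing stays simple even though density doubles.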
Papers 6.3 and 6.4 pit 1-Gbit NAND flash memories from Samsung and Toshiba against each other. Samsung shows how it applies shallow trench isolation and 0.12-µm design rules to a 1-Gbit NAND flash memory with a 1-bit/cell architecture. Toshiba, in conjunction with SanDisk Corp., Sunnyvale, Calif., describes a 2-bit/cell 1-Gbit NAND flash with very high programming throughput: 10 Mbytes/s versus 7 Mbytes/s for the Samsung chip. To achieve that speed, the Toshiba-SanDisk chip incorporates novel cache-program, cache-read, and on-chip page-copy capabilities.
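A back-of-envelope calculation puts those throughput figures in perspective. The full-chip-fill scenario below is illustrative; only the capacities and rates come from the article:

```python
# Time to program an entire 1-Gbit NAND array at the quoted sustained
# throughputs. The whole-chip-fill scenario is an illustration, not a
# benchmark from either paper.

capacity_bytes = 1 * 1024**3 / 8   # 1 Gbit = 128 Mbytes
toshiba_rate = 10e6                # bytes/s (Toshiba-SanDisk, 2 bit/cell)
samsung_rate = 7e6                 # bytes/s (Samsung, 1 bit/cell)

print(capacity_bytes / toshiba_rate)  # ~13.4 s to fill the chip
print(capacity_bytes / samsung_rate)  # ~19.2 s to fill the chip
```

Roughly a 40% gap in fill time, which is why cache-program pipelining matters for a 2-bit/cell part whose raw cell programming is inherently slower.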