As each new generation of computer games arrives, and as CAD software and other business applications demand faster responses, graphics chip manufacturers are bolstering chip performance to deliver more realistic images and faster drawing speeds. Higher throughput will let designers more quickly create applications that immerse users in more lifelike images and characters. Both ATI Technologies and NVIDIA have just unveiled their next-generation graphics processors, which bring designers a step closer to generating realistic, real-time graphics.
The GeForce2 GTS from NVIDIA offloads graphics-intensive computations from the host CPU in the PC architecture. It does so with second-generation transform and lighting engines that designers embedded on the chip, along with a full high-definition video processor (Fig. 1). These engines enable extremely high-polygon-count scenes to be rendered at speeds greater than 25 million polygons/s.
In addition, the chip includes a shading rasterizer that brings natural material properties like smoke, clouds, water, cloth, and plastic to life. Per-pixel shading in the chip's pixel-processing engine makes this possible. Known as the gigatexel shader, this engine can perform seven simultaneous single-pass pixel operations each cycle. That allows it to deliver 1.6 Gtexels/s, or about three times the rate of the company's previous-generation GeForce processor. (1 Gtexel is 1 billion filtered, textured pixels.) To take advantage of the per-pixel shading, the software drivers must use Microsoft's DirectX version 7 application programming interface (API).
Also integrated on the chip is the logic required to produce high-definition video images for playing back DVD and HDTV signals on the computer monitor. High-performance hardware antialiasing logic helps remove jagged edges. At the same time, support for AGP 4X/2X host interfaces, AGP texturing, and the AGP fast-write capability allows the chip to transfer image data very efficiently. Also, a 32-bit Z/Stencil buffer helps eliminate "polygon popping," an image instability typically found in the high-polygon-count scenes used in games and high-resolution 3D applications.
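To illustrate why depth-buffer precision bears on polygon popping, the following Python sketch quantizes the depth of two nearly coincident surfaces at two bit widths. The clip planes, the surface distances, the standard perspective depth mapping, and the choice of 16 and 24 depth bits are all illustrative assumptions; none of them comes from NVIDIA's description of the chip.

```python
def window_depth(z, near, far):
    """Map an eye-space distance z to a [0, 1] depth value
    using the standard perspective depth mapping."""
    return (far / (far - near)) * (1.0 - near / z)


def quantize(depth, bits):
    """Store a [0, 1] depth value as an unsigned integer of the given width."""
    return int(depth * ((1 << bits) - 1))


# Hypothetical scene: clip planes and two surfaces 0.05 units apart.
NEAR, FAR = 1.0, 1000.0
Z_A, Z_B = 500.0, 500.05


def distinguishable(bits):
    """True if the two surfaces land on different depth-buffer values."""
    da = quantize(window_depth(Z_A, NEAR, FAR), bits)
    db = quantize(window_depth(Z_B, NEAR, FAR), bits)
    return da != db


for bits in (16, 24):
    print(bits, "depth bits separate the surfaces:", distinguishable(bits))
```

In this hypothetical scene the two surfaces collapse onto the same 16-bit depth value, so which one wins can flip from frame to frame, while 24 bits keep them apart.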
Specialized On-Chip Blocks
The next-generation Rage graphics processor from ATI Technologies, the Radeon 256, also delivers high throughput. It does so with specialized on-chip blocks that the company refers to as the Charisma Engine and the Pixel Tapestry architecture (Fig. 2). Aimed at rendering smooth 3D animated characters, the Charisma Engine performs fast and flexible hardware transformations, clipping operations, and lighting. It will be able to deliver a peak performance of 30 million triangles/s, as well as support up to eight local or infinite light sources at usable rates.
Also included as part of the engine, a full-featured character-animation accelerator promises to breathe new life into 3D characters. The accelerator provides an advanced vertex-skinning capability that allows characters to move, bend, and flex more naturally than in previous systems. A hardware-accelerated keyframe interpolator reduces the overhead for such operations as character morphing and expression changes.
In all, the Charisma Engine reduces system overhead by performing lighting, transformations, vertex skinning, clipping, keyframe interpolation, perspective computation, and viewport transformation. Many of these tasks were previously handled by the host CPU in systems using the company's previous-generation Rage 128 graphics processor.
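As a rough software picture of two of these tasks, the Python sketch below shows linear keyframe interpolation and weighted vertex skinning. It is a minimal illustration of the general techniques, not ATI's hardware algorithm; the function names, the 3x4 matrix layout, and the two-bone example are assumptions made for the example.

```python
def lerp_keyframes(frame_a, frame_b, t):
    """Linearly interpolate two keyframes (lists of (x, y, z) vertices).
    t = 0 gives frame_a, t = 1 gives frame_b."""
    return [
        tuple(a + t * (b - a) for a, b in zip(va, vb))
        for va, vb in zip(frame_a, frame_b)
    ]


def transform(matrix, v):
    """Apply a 3x4 affine matrix (rotation columns plus translation) to a point."""
    x, y, z = v
    return tuple(r[0] * x + r[1] * y + r[2] * z + r[3] for r in matrix)


def skin_vertex(v, bones, weights):
    """Blend the positions produced by each bone matrix.
    The weights are assumed to sum to 1."""
    blended = [0.0, 0.0, 0.0]
    for matrix, weight in zip(bones, weights):
        p = transform(matrix, v)
        for i in range(3):
            blended[i] += weight * p[i]
    return tuple(blended)


# Hypothetical bones: one at rest, one shifted 2 units along x.
IDENTITY = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]]
SHIFT_X = [[1, 0, 0, 2], [0, 1, 0, 0], [0, 0, 1, 0]]

print(skin_vertex((1.0, 1.0, 1.0), [IDENTITY, SHIFT_X], [0.5, 0.5]))
print(lerp_keyframes([(0.0, 0.0, 0.0)], [(2.0, 4.0, 6.0)], 0.5))
```

A vertex near a joint that is weighted equally between the two bones ends up halfway between their results, which is what lets a skinned mesh bend smoothly instead of creasing. Running this per vertex on the host CPU is the kind of overhead that dedicated hardware removes.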
Complementing the Charisma Engine is the Pixel Tapestry architecture, which is optimized for high-resolution 32-bit color performance. It allows the chip to deliver images at better than 60 frames/s with resolutions of 1024 by 768 pixels at a depth of 32 bits/pixel.
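For a sense of scale, the short Python calculation below works out the frame size and the minimum color-write traffic at that resolution, depth, and frame rate. It assumes each pixel is written exactly once per frame; real traffic is higher once overdraw, Z-buffer accesses, and texture fetches are counted. The function names are illustrative.

```python
def frame_bytes(width, height, bits_per_pixel):
    """Size of one frame of color data, in bytes."""
    return width * height * bits_per_pixel // 8


def color_write_rate(width, height, bits_per_pixel, fps):
    """Bytes per second needed to write every pixel once per frame."""
    return frame_bytes(width, height, bits_per_pixel) * fps


size = frame_bytes(1024, 768, 32)
rate = color_write_rate(1024, 768, 32, 60)
print(f"one frame: {size} bytes ({size / 2**20:.0f} MiB)")
print(f"60 frames/s: {rate / 1e6:.1f} Mbytes/s of color writes")
```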
The rendering pipelines deliver a peak throughput of 1.5 Gtexels/s with the HyperZ buffer enabled. This high data rate makes possible 3D environments with lush detail and photorealistic textures. The combination of the Charisma Engine and the Pixel Tapestry architecture thus permits images that accurately model the reflective properties of materials. It also lets 3D characters show fine detail when viewed up close and cast realistic shadows from every light source.
To achieve this performance, the architecture employs three texture units in each of the two rendering pipelines on the chip. At full speed, the rendering pipelines can blend and filter up to three textures per pixel. Previous chips could typically handle only one or two textures, so they couldn't deliver as realistic an image as the Radeon in a single pass through the rendering pipeline. Special effects that needed more textures required multiple passes through the pipeline, which slowed graphics performance.
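The relationship between texture units and rendering passes can be sketched in a few lines of Python. The model below assumes one texel per texture unit per clock and that a pass can apply at most as many textures as a pipeline has texture units. The 250-MHz clock is purely a hypothetical value for the arithmetic; the chip's core clock is not stated here.

```python
import math


def passes_needed(textures_per_pixel, texture_units_per_pipeline):
    """Passes through the pipeline needed to apply all textures to a pixel."""
    return math.ceil(textures_per_pixel / texture_units_per_pipeline)


def peak_texel_rate(pipelines, texture_units_per_pipeline, clock_hz):
    """Idealized peak texel rate: one texel per unit per clock."""
    return pipelines * texture_units_per_pipeline * clock_hz


# A three-texture effect on chips with 1, 2, or 3 texture units per pipeline.
for units in (1, 2, 3):
    print(units, "unit(s):", passes_needed(3, units), "pass(es)")

# Two pipelines with three units each, at a hypothetical 250-MHz clock.
print(peak_texel_rate(2, 3, 250_000_000) / 1e9, "Gtexels/s")
```

Under this model a three-texture effect takes three passes on a single-texture pipeline and two on a dual-texture pipeline, but only one when three units are available.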
Like the NVIDIA GeForce2 GTS, the ATI chip includes an all-format HDTV decoding capability. It supports all ATSC resolutions, including the 1080i mode. Plus, the circuit includes YPbPr support for direct drive of HDTV displays. Adaptive de-interlacing is included on the chip too, allowing the controller to produce crisp, sharp video images without blurring or artifacts. Further, the graphics processor has an 8-bit video/graphics alpha-blending capability. The I/O signals are fully compatible with the company's Rage Theater companion processor.
Flat-Panel Interfaces Included
Both the Radeon processor and the GeForce2 GTS also include a digital-flat-panel interface to drive high-resolution color flat panels. NVIDIA designers integrated a TMDS transmitter on the chip, eliminating costly off-chip drivers.
To achieve the high bandwidth needed for the memory subsystem, ATI designers opted for an interface that uses 200-MHz double-data-rate (DDR) synchronous DRAMs. Initially, though, the chip can use existing slower DDR SDRAMs. Up to 128 Mbytes of off-chip memory can be addressed by the graphics processor. With a 128-bit-wide interface running at 200 MHz, the processor has a peak memory bandwidth of 6.4 Gbytes/s. Additionally, the company developed a novel acceleration technology called HyperZ. This accelerates memory data transfers through the use of lossless compression and on-chip caching. Overall, the HyperZ technology boosts the effective memory bandwidth to more than 20% above the basic DDR SDRAM data rate.
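The peak-bandwidth arithmetic for a DDR interface is simple enough to check in a few lines of Python. The sketch assumes two transfers per clock and treats the HyperZ gain as a flat 20% multiplier, which is a simplification of whatever the compression and caching achieve in practice. The 166-MHz case is a hypothetical example of a slower DDR part.

```python
def ddr_peak_bandwidth(clock_hz, bus_bits, transfers_per_clock=2):
    """Theoretical peak bandwidth of a DDR interface, in bytes per second."""
    return clock_hz * transfers_per_clock * bus_bits // 8


def effective_bandwidth(peak_bytes_per_s, gain_percent):
    """Peak bandwidth scaled by a flat percentage gain (simplified model)."""
    return peak_bytes_per_s * (100 + gain_percent) // 100


fast = ddr_peak_bandwidth(200_000_000, 128)   # 200-MHz DDR, 128-bit bus
slow = ddr_peak_bandwidth(166_000_000, 128)   # hypothetical slower part
print(f"200 MHz: {fast / 1e9:.1f} Gbytes/s peak")
print(f"166 MHz: {slow / 1e9:.1f} Gbytes/s peak")
print(f"200 MHz with a 20% gain: {effective_bandwidth(fast, 20) / 1e9:.2f} Gbytes/s")
```

For a 200-MHz clock on a 128-bit bus, this gives a theoretical peak of 6.4 Gbytes/s; sustained rates in a real system are lower.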
The NVIDIA GeForce2 GTS architecture also supports up to 128 Mbytes of DDR SDRAM for use as the frame-buffer memory. Memory bandwidths of over 5 Gbytes/s are made possible with the DDR 128-bit-wide interface.
Initially, the company expects to see three board-level memory configurations: 32- and 64-Mbyte AGP implementations and a 32-Mbyte PCI version. Although the board-level manufacturers will set the price of boards based on the NVIDIA chip, the company estimates that a 32-Mbyte board might cost about $299. The chip consumes about 10 W, so AGP cards built with it will fall well within the AGP power guidelines for graphics cards.
ATI claims that power consumption for its Radeon processor will be between 5 and 7 W. Boards based on its chip should fall well within PCI and AGP guidelines too. Prices at the board level should be comparable to those based on the NVIDIA chip.
Price & Availability
Both ATI and NVIDIA estimate that the base-level graphics cards, which are implemented around their respective controller chips and have 32 Mbytes of DDR SDRAM, will list for between $250 and $300 each. Samples of the ATI Radeon processor will be available this summer, while samples of the GeForce2 GTS are available right now.
ATI Technologies Inc., 33 Commerce Valley Dr. East, Thornhill, Ontario, Canada L3T 7N6; (905) 882-2600; www.ati.com.
NVIDIA Corp., 3535 Monroe St., Santa Clara, CA 95051; (408) 615-2500; www.nvidia.com.