High-definition audio is becoming pervasive thanks to increases in DSP power, the large data capacity of the Blu-ray disc format, and the ever-increasing speed of Internet connections. Many consumer and professional products now offer high-definition audio capability, and market pressures will compel most consumer electronics OEMs to incorporate high-definition audio in upcoming products.
For consumers, the advantages of high-definition audio are twofold: the greater fidelity of higher bit depths and sampling rates, and the greater realism possible through surround sound (see “What Is High-Definition Audio?”). For professionals, high-definition audio permits greater leeway in setting recording levels, more creative potential, and maximum flexibility in meeting the needs of various distribution formats.
Using a higher sampling rate of 88.2 or 96 kHz instead of CD’s 44.1 kHz extends high-frequency response. A higher sampling rate also allows the cutoff of the anti-aliasing filter in an analog-to-digital converter (ADC) to be raised to a frequency far above the range of human hearing. Some audio engineers contend that the phase distortion caused by the 20-kHz anti-aliasing filter used in 44.1-kHz digital audio is audible at lower frequencies.
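To illustrate why a higher sampling rate relaxes the anti-aliasing filter, the short Python sketch below computes the gap between the top of the audio band and the Nyquist frequency (half the sampling rate). The function name and the 20-kHz band edge used as a default are assumptions made for illustration, the latter taken from the filter frequency mentioned above.

```python
def transition_band_khz(sample_rate_khz: float, passband_edge_khz: float = 20.0) -> float:
    """Return the room (in kHz) between the audio band edge and Nyquist (fs / 2).

    The anti-aliasing filter must go from "pass" to "stop" inside this gap.
    """
    nyquist_khz = sample_rate_khz / 2.0
    return nyquist_khz - passband_edge_khz


if __name__ == "__main__":
    for rate in (44.1, 88.2, 96.0):
        gap = transition_band_khz(rate)
        print(f"{rate:5.1f} kHz sampling: {gap:5.2f} kHz available for the filter roll-off")
```

At 44.1 kHz the filter has only about 2 kHz in which to roll off, while at 96 kHz it has 28 kHz, so a much gentler filter can be used.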
Word depths of 20 or 24 bits extend dynamic range past the maximum 96 dB of CD. Although a 96-dB dynamic range is theoretically adequate for consumer applications, it is seldom achieved in practice.
When a recording engineer has only 16 bits of resolution, the input level must be set conservatively to avoid overloading the recording device, often resulting in an effective dynamic range of perhaps 12 bits (72 dB). Alternatively, the engineer can use dynamic range compression to avoid overloads, but this reduces fidelity. A 24-bit system allows dynamic range of at least 20 bits (120 dB) with no risk of overloading the ADC and without resorting to compression.
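The decibel figures quoted for each word depth follow from the usual rule of roughly 6 dB per bit for linear PCM. The sketch below works that arithmetic; the function name is hypothetical, and the formula gives the theoretical ratio between full scale and one quantization step, not the measured performance of any converter.

```python
import math


def dynamic_range_db(bits: int) -> float:
    """Theoretical dynamic range of linear PCM: 20 * log10(2 ** bits)."""
    return 20.0 * math.log10(2 ** bits)


if __name__ == "__main__":
    for bits in (12, 16, 20, 24):
        print(f"{bits:2d} bits: {dynamic_range_db(bits):6.1f} dB")
```

Rounded to the nearest decibel, 12, 16, and 20 bits come out at 72, 96, and 120 dB, matching the figures in the text.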
The use of more than two channels of sound expands creative possibilities for recording professionals and increases the realism of audio for consumers. With a two-channel format, a listener must sit equidistant from the speakers to get a centered sonic image. A 5.1 or 7.1 format can deliver a consistent center image for a roomful of listeners, no matter where they’re sitting.
Applications For High-Definition Audio
In the consumer market, high-definition audio’s most prominent application is in Blu-ray disc players, which can at least pass DTS-HD Master Audio and Dolby TrueHD lossless audio and uncompressed multichannel PCM through their HDMI outputs. Many of these players incorporate complete decoding of these formats for analog output. Most home-theater-in-a-box (HTiB) systems that include an integrated Blu-ray player/receiver also decode these formats.
High-definition audio may be found in devices that play audio streamed or downloaded from the Internet, and a few Internet download sites now offer high-definition audio.
An audio/video receiver or surround sound processor often is used to decode high-definition audio. Many of these devices offer internal decoding of DTS-HD Master Audio and Dolby TrueHD. Most of these products offer full 7.1-channel output, and some offer technologies that expand a 5.1 or 7.1 signal to as many as 11.1 channels.
In the professional world, digital audio products that record or process 24/96 audio are now the norm. These may include digital mixing consoles, sound effects processors, and digital equalization and crossover processors for PA systems. Multichannel capability is also common.
High-definition audio signals are typically compressed for storage and transmission because they are so data-intensive. An uncompressed 24/96 eight-channel signal consumes roughly 13 times the data required for a 16/44.1 two-channel signal.
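The data-rate comparison can be checked directly, since the bit rate of uncompressed PCM is simply word depth times sampling rate times channel count. The sketch below is illustrative only and the function name is hypothetical. Worked this way, the ratio comes out at roughly 13.

```python
def pcm_bit_rate(bits: int, sample_rate_hz: int, channels: int) -> int:
    """Bit rate of uncompressed PCM in bits per second."""
    return bits * sample_rate_hz * channels


if __name__ == "__main__":
    cd_rate = pcm_bit_rate(16, 44_100, 2)   # CD-quality stereo
    hd_rate = pcm_bit_rate(24, 96_000, 8)   # 24/96 in eight (7.1) channels
    print(f"16/44.1 stereo : {cd_rate / 1e6:.3f} Mbits/s")
    print(f"24/96 x 8 ch   : {hd_rate / 1e6:.3f} Mbits/s")
    print(f"ratio          : {hd_rate / cd_rate:.2f}")
```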
Two technologies, DTS-HD Master Audio and Dolby TrueHD, are currently used to compress multichannel high-definition audio for distribution on Blu-ray discs. Both technologies are lossless codecs, meaning they deliver bit-for-bit reproduction of an original master recording. DTS-HD Master Audio is capable of 24/192 resolution in two-channel mode and 24/96 resolution in up to eight (7.1) channels. Dolby TrueHD is capable of as many as 14 full-range channels in up to 24/192 resolution.
Some codecs used to distribute audio over the Internet also offer high-resolution capability. The FLAC codec supports word depths up to 32 bits and sampling rates up to about 655 kHz (32/655). While it is commonly used to distribute two-channel content, it can also be used for 5.1. The Windows Media Audio Lossless and Apple Lossless codecs also have the technical capability to support 5.1 and 24/96, although most applications and devices using these codecs support only 16/44.1 two-channel audio.
DSPs For High-Definition Audio
High-definition audio is processor-intensive, so DSPs need both high performance and specific features to supply the necessary horsepower. For example, the Sharc 2147x and 2148x series from Analog Devices include several features that free the core processor from having to perform simple tasks that can be better handled by separate, dedicated components (see the figure) within the DSP.
In the past, the task of decoding high-resolution audio formats used up most or all of the processing power of a single DSP. Any post-processing such as room equalization, volume management, or creating extra channels of sound would have to be performed in additional DSPs. With the Sharc 2148x and 2147x series processors, a single chip can handle both the high-definition audio decoding and practically all of the post-processing options available today.
For example, room equalization (EQ) technologies such as Audyssey’s MultEQ use many long finite impulse response (FIR) filters, which consume a great deal of processing power. The simplicity of these filters makes it possible to offload this task to a separate piece of silicon within the DSP.
The Sharc processors include built-in accelerators that can perform most of the processing required for room EQ, speaker crossovers, and tonal adjustments, so the core processor can concentrate on more complex tasks such as high-definition audio decoding. In terms of multiply-accumulates (MACs) per second, the accelerators roughly equal the speed of the core processor, doubling the overall performance of the system.
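To show why long FIR filters are both simple and expensive, the sketch below implements a plain direct-form FIR filter in Python and estimates its MAC load. It is a generic textbook form, not the algorithm used by any particular room EQ product or by the Sharc accelerators. The function names are hypothetical, and the 512-tap, 48-kHz, eight-channel figures are assumptions chosen only for illustration.

```python
def fir_filter(samples, taps):
    """Direct-form FIR: every output sample costs one multiply-accumulate per tap."""
    history = [0.0] * len(taps)          # most recent input first
    output = []
    for x in samples:
        history = [x] + history[:-1]     # shift the new sample into the delay line
        acc = 0.0
        for h, s in zip(taps, history):
            acc += h * s                 # the MAC operation
        output.append(acc)
    return output


def macs_per_second(num_taps: int, sample_rate_hz: int, channels: int) -> int:
    """MAC rate needed to run one FIR filter of num_taps on each channel."""
    return num_taps * sample_rate_hz * channels


if __name__ == "__main__":
    print(fir_filter([1.0, 0.0, 0.0, 0.0], [0.5, 0.25, 0.25]))
    # Assumed workload: 512 taps, 48 kHz, eight channels.
    print(f"{macs_per_second(512, 48_000, 8) / 1e6:.1f} million MACs per second")
```

The inner loop is nothing more than a repeated multiply and add, which is why this kind of work is a natural candidate for a dedicated accelerator.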
Processing also can be offloaded from the main DSP core in sample rate conversion. The onboard sample rate converter found in the Sharc 2148x and 2147x processors can be used to independently convert low sample rates such as 44.1 kHz up to higher rates like 96 kHz, or vice versa. It can be used for jitter reduction as well, which creates a more precise digital audio signal by removing the clock from the incoming signal and replacing it with an internally generated high-precision clock.
The sample rate converter comprises four independent two-channel sections, which can be combined to deliver as many as eight channels, all precisely timed with zero interchannel phase error. A built-in Sony/Philips Digital Interface Format (S/PDIF) interface makes it easy for external devices to use these sample rate converters.
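Converting between the 44.1-kHz and 96-kHz families is awkward because the two rates are not simple multiples of each other. The sketch below reduces the ratio to lowest terms and resamples by linear interpolation. Linear interpolation is used here only as an easily readable stand-in. It is not how a hardware sample rate converter works, and practical converters rely on far better interpolation filters. The function names are hypothetical.

```python
from fractions import Fraction


def resample_ratio(src_hz: int, dst_hz: int) -> Fraction:
    """Output-to-input rate ratio reduced to lowest terms."""
    return Fraction(dst_hz, src_hz)


def resample_linear(samples, src_hz: int, dst_hz: int):
    """Naive resampler: linearly interpolate between neighbouring input samples."""
    ratio = resample_ratio(src_hz, dst_hz)
    n_out = int((len(samples) - 1) * ratio) + 1
    output = []
    for n in range(n_out):
        pos = n / ratio                      # exact position in the input stream
        i = int(pos)                         # input sample at or before pos
        frac = float(pos - i)                # distance past that sample (0 to 1)
        nxt = samples[min(i + 1, len(samples) - 1)]
        output.append(samples[i] * (1.0 - frac) + nxt * frac)
    return output


if __name__ == "__main__":
    print(resample_ratio(44_100, 96_000))    # ratio for 44.1 kHz to 96 kHz
    print(resample_linear([0.0, 1.0, 2.0], 1, 2))
```

The 44.1-to-96 conversion reduces to a ratio of 320/147, which is why it cannot be done by simply repeating or dropping samples.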
For high-definition audio, a DSP should also have as much onboard memory as possible. This reduces cost because it allows many memory-intensive functions, such as room EQ and reverb, to be performed without the need for external RAM. Extra memory also cuts programming time. Coding is simpler because the programmer doesn’t have to be so conscious of memory limitations.
Direct memory access (DMA) is another feature that can further lessen the load on the core processor by managing the DSP’s internal memory. Outside devices can access the internal memory straight through the DMA, without having to go through the DSP core. The DMA allows the core processor to receive data in blocks rather than in single samples, dramatically reducing the number of interrupts and increasing speed in the process.
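The benefit of block transfers is easy to quantify: if the core is interrupted once per block instead of once per sample, the interrupt rate falls by the block size. The sketch below works that arithmetic. The function name is hypothetical, and the 32-sample block is an assumption chosen for illustration, not a figure from any datasheet.

```python
def interrupts_per_second(sample_rate_hz: int, block_size: int) -> float:
    """Interrupt rate when the core is notified once per block of samples."""
    return sample_rate_hz / block_size


if __name__ == "__main__":
    rate = 96_000
    per_sample = interrupts_per_second(rate, 1)    # no DMA: one interrupt per sample
    per_block = interrupts_per_second(rate, 32)    # assumed DMA block of 32 samples
    print(f"per-sample: {per_sample:.0f} interrupts/s")
    print(f"per-block : {per_block:.0f} interrupts/s")
```

The trade-off is latency, because the core cannot start on a block until the last sample in it has arrived.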
Finally, a DSP that natively supports 32-bit floating-point arithmetic will simplify algorithm development, enabling engineers to focus on the audio aspects of the design rather than being distracted by arcane numerical issues that must be considered when using fixed-point integer arithmetic.
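The standard 32-bit floating-point format sets aside 23 bits for the mantissa, eight for the exponent, and one for the sign bit. Because normalized values carry an implicit leading mantissa bit, the effective precision is 24 bits plus sign, which is sufficient for storing 24-bit high-precision audio samples. Arithmetic operations, however, call for a DSP with 40-bit precision so that rounding errors do not build up over long chains of calculations.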
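The storage claim can be checked with a small experiment that packs values into the 32-bit IEEE 754 format and reads them back. The sketch below uses Python's standard struct module, and the helper names are hypothetical. It shows that the extremes of the signed 24-bit range survive unchanged, while an integer just beyond 24 bits of magnitude does not.

```python
import struct


def to_float32(value: float) -> float:
    """Round a Python float (64-bit) to the nearest 32-bit float and back."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def survives_float32(sample: int) -> bool:
    """True if the integer sample is stored exactly in 32-bit floating point."""
    return to_float32(float(sample)) == float(sample)


if __name__ == "__main__":
    lowest, highest = -(2 ** 23), 2 ** 23 - 1       # signed 24-bit sample range
    print(survives_float32(lowest), survives_float32(highest))
    print(survives_float32(2 ** 24 + 1))            # needs more than 24 bits of precision
```

Storing a sample exactly is a different matter from computing with it, since each multiply and add in a long calculation can introduce its own rounding error.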
Paul Beckmann is the founder and CEO of DSP Concepts. He holds BS, MS, and PhD degrees in electrical engineering from the Massachusetts Institute of Technology.