WCDMA Baseband Design Faces Challenges

Third-generation (3G) wireless access will soon make its way into products like portable game consoles. This evolution will enable users to play interactively with friends anywhere and anytime. Compared to previous-generation wireless systems, however, 3G raises multifold design obstacles. With dynamically changing parameters and closed-loop operation between the transmitter and receiver, 3G modems demand diligence in system design.

To help meet third-generation wireless standards and optimize performance, designers of 3G wireless-handset digital-baseband receivers must plan carefully. They must prepare to deal with the project's complexity and optimization requirements from the beginning. Meeting this goal requires a seamless design and verification flow. This flow must start with a high-level, floating-point representation for algorithm optimization. A refining of the implementation for hardware, software, and their interfaces will follow. By adhering to this methodology and taking advantage of available standards-based models, hardware and software developers can accelerate the design process.

This article describes some of the functions and requirements for a modem that complies with the frequency-division-duplex (FDD) UMTS wideband CDMA (WCDMA) wireless-system standard. This information stems from the work of a design team within Synopsys Professional Services. This group created a UMTS-FDD digital-baseband receiver and transmitter for use in 3G-handset applications. The modem was developed from high-level product requirements. It was implemented as part of a complete 3G-handset prototype that was successfully tested with commercial testers. Speech calls and video transmissions to a UMTS-FDD base station have been conducted successfully.

The modem contains a significant amount of control flow functionality. This characteristic facilitates dynamic Layer 1 (L1) configuration changes, as well as closed-loop operation between base transceiver stations and user equipment. These functions have high computational requirements. In addition, they must support user bit rates as high as 2 Mbps. A significant part of the hardware is thus dedicated to baseband-dataflow signal-processing tasks with a sophisticated interface to the L1 software.

Much of the design information presented here concerns the operation of the random-access channel. Among other things, this channel is used by a handset to initiate connections with a base station. Because this channel involves many of the modem's hardware units and software functions, it provides a good view of 3G design issues. Understanding the workings of this random-access channel requires an overview of the modem's main capabilities.

A frequency-division-duplex modem contains both a transmitter and receiver (FIG. 1). The latter component is by far the more complex of the two parts. It therefore occupies the bulk of this article. Yet it also is useful to have an overview of the transmitter.

Essentially, the transmitter takes as its input from the channel encoder the data part of the physical channels to be transmitted. It multiplexes the physical-channel data bits with the physical-channel control bits. It then spreads and scrambles the resulting data stream. The result is a stream of chips that feeds the transmit filter. In this process, the transmitter must control the associated power amplifier for uplink open-loop or closed-loop power control. If necessary, it also must adjust the transmit timing. Note that part of the finite-state machine that handles the system's random-access function also is implemented in the transmitter hardware.
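The spreading and scrambling steps can be illustrated with a minimal sketch. It assumes a short toy channelization code and a unit-magnitude stand-in for the scrambling sequence. The function names and all values are hypothetical and are not taken from the modem described here or from the standard's code definitions.

```python
def spread_and_scramble(symbols, channel_code, scrambling_code):
    """Spread each symbol over len(channel_code) chips, then scramble chip by chip."""
    chips = [sym * c for sym in symbols for c in channel_code]
    if len(scrambling_code) < len(chips):
        raise ValueError("scrambling code is shorter than the chip stream")
    return [chip * s for chip, s in zip(chips, scrambling_code)]


def descramble_and_despread(chips, channel_code, scrambling_code):
    """Undo the scrambling, then correlate with the channelization code."""
    sf = len(channel_code)  # spreading factor
    descrambled = [chip * s.conjugate() for chip, s in zip(chips, scrambling_code)]
    return [
        sum(d * c for d, c in zip(descrambled[i:i + sf], channel_code)) / sf
        for i in range(0, len(descrambled), sf)
    ]


# Toy values, chosen only for illustration (assumptions, not standard sequences).
symbols = [1 + 0j, -1 + 0j, 1j]
channel_code = [1, -1, 1, -1]           # one row of a 4x4 Hadamard matrix
scrambling_code = [1, 1j, -1, -1j] * 3  # unit-magnitude stand-in sequence
chips = spread_and_scramble(symbols, channel_code, scrambling_code)
recovered = descramble_and_despread(chips, channel_code, scrambling_code)
```

Because every stand-in scrambling chip has unit magnitude, multiplying by its complex conjugate removes the scrambling exactly, and the correlation over one spreading period returns the original symbol.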

As for the receiver, sampled analog-to-digital-converter (ADC) data enters the digital front end. There, matched filtering takes place. The matched-filter output feeds the cell searcher, multipath searcher, and RAKE receiver.

The cell searcher performs initial cell acquisition and monitoring, including the determination of the appropriate scrambling code and frame timing. Based on the coarse timing provided by the cell searcher, the multipath searcher estimates the power delay profile. The outputs of both the cell and multipath searchers feed back to the L1 software.

L1 and higher-layer software run on a microprocessor core, which is referred to here as the CPU. The main tasks of the L1 software include evaluating and monitoring the power delay profile as provided by the multipath searcher. The L1 software also assigns RAKE fingers to received echoes. As indicated by higher-layer software, the L1 software dynamically configures RAKE and combiner physical-channel processing. This task includes variable-rate and compressed-mode transmission. Higher-layer software also prompts the L1 software to schedule and initiate measurement tasks.
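How software might evaluate a power delay profile and assign fingers can be sketched as follows. This is an illustrative assumption, not the algorithm used in the modem described here: it averages several instantaneous profiles, takes a middle value of the sorted profile as a crude noise-floor estimate, and keeps the strongest delays above a threshold. The function names, `threshold_factor`, and `max_fingers` are hypothetical.

```python
def average_profiles(instantaneous_profiles):
    """Average several instantaneous power delay profiles, delay by delay."""
    count = len(instantaneous_profiles)
    length = len(instantaneous_profiles[0])
    return [sum(p[d] for p in instantaneous_profiles) / count for d in range(length)]


def assign_fingers(profile, threshold_factor, max_fingers):
    """Return the delays of the strongest echoes that exceed the threshold."""
    # Assumption: a middle value of the sorted profile approximates the noise floor.
    noise_floor = sorted(profile)[len(profile) // 2]
    threshold = threshold_factor * noise_floor
    candidates = [(power, delay) for delay, power in enumerate(profile)
                  if power > threshold]
    candidates.sort(reverse=True)  # strongest echoes first
    return sorted(delay for _, delay in candidates[:max_fingers])


# Two toy instantaneous profiles with echoes at delays 1 and 4.
profiles = [
    [1, 9, 1, 1, 5, 1, 1, 1],
    [1, 11, 1, 1, 3, 1, 1, 1],
]
averaged = average_profiles(profiles)
finger_delays = assign_fingers(averaged, threshold_factor=3.0, max_fingers=4)
```

Averaging trades tracking speed for a cleaner profile, and the threshold trades missed weak echoes against fingers wasted on noise. Both would need tuning in simulation.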

The RAKE receiver comprises the following: RAKE fingers, global blocks, an automatic frequency control (AFC), a combiner, and a unit that generates power-control signal-to-interference ratio (SIR) estimates (FIG. 2). The RAKE-receiver fingers perform physical-channel demodulation. This process includes functions such as time tracking, frequency-offset estimation, channel estimation, and diversity decoding. The combiner performs time alignment (elastic buffering) and maximum ratio combining of the finger output symbol streams. The combiner output goes to the traffic channel decoder.
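Maximum ratio combining can be shown in a few lines. The sketch below assumes that the finger outputs are already time aligned and that each finger has a single complex channel estimate. The function name and the toy channel values are hypothetical.

```python
def maximum_ratio_combine(finger_symbols, channel_estimates):
    """Weight each finger by the conjugate of its channel estimate and sum."""
    num_symbols = len(finger_symbols[0])
    combined = []
    for k in range(num_symbols):
        acc = 0j
        for symbols, h in zip(finger_symbols, channel_estimates):
            acc += h.conjugate() * symbols[k]
        combined.append(acc)
    return combined


# Toy example: the same symbols arrive over two echoes with different gains.
transmitted = [1, -1, 1]
channel = [0.8 + 0.6j, 0.5j]  # assumed per-finger channel coefficients
fingers = [[h * s for s in transmitted] for h in channel]
combined = maximum_ratio_combine(fingers, channel)
decisions = [1 if c.real >= 0 else -1 for c in combined]
```

With perfect channel estimates and no noise, each combined symbol equals the transmitted symbol scaled by the sum of the squared channel magnitudes, so stronger echoes contribute more to the decision.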

RANDOM ACCESS The FDD modem's random-access channel (RACH) is an uplink transport channel. It is used for initiating a transmission on a dedicated channel, as well as for short packet transmission. After successful initial cell acquisition, the handset or user equipment (UE) reads a number of parameters from the broadcast channel of the acquired cell. If the UE wants to initiate a transmission on a dedicated channel, it first has to make itself known to the base station using the physical random-access procedure.

The UE cannot accurately predict the transmission power that is needed in order for its RACH transmission to be heard by the base station. As a result, the UE transmits so-called RACH preambles starting at low power. It increases the power level until it receives an acknowledgement from the base station. In the case of a positive acknowledgement, the UE transmits the RACH message part at the same power used for the last preamble transmission.
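The ramping behavior described above can be sketched as a small loop. The function name, the parameters, and the returned labels are hypothetical, and the acknowledgement values are simply read from a list. The standard's procedure includes further details that are not modeled here.

```python
def run_preamble_ramp(initial_power_dbm, step_db, max_preambles, ai_values):
    """Send preambles at rising power until an acknowledgement arrives.

    Returns (outcome, power of the last preamble in dBm, preambles sent).
    """
    if max_preambles < 1:
        raise ValueError("at least one preamble is required")
    power_dbm = initial_power_dbm
    ai_iter = iter(ai_values)
    for attempt in range(1, max_preambles + 1):
        # Assumption: a missing stimulus entry counts as "no acknowledgement".
        ai = next(ai_iter, 0)
        if ai == 1:
            # Positive acknowledgement: send the message part at this power.
            return "send_message", power_dbm, attempt
        if ai == -1:
            # Negative acknowledgement: stop ramping.
            return "rejected", power_dbm, attempt
        power_dbm += step_db  # no acknowledgement: try again with more power
    return "no_ack", power_dbm - step_db, max_preambles


outcome = run_preamble_ramp(-20.0, 2.0, 8, [0, 0, 0, 1])
```

In this example, three preambles go unanswered, so the fourth is sent 6 dB above the starting level and the message part follows at that same power.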

The base station sends an acknowledgement in the form of acquisition indicators (AIs). AIs are mapped to the acquisition indicator channel (AICH) transmitted on the downlink. Each AI may be 0, 1, or −1. A 1 corresponds to a positive acknowledgement, a −1 to a negative acknowledgement, and a 0 to no acknowledgement.

The UE may only start the RACH transmission at a number of time offsets, which are called access slots. Similarly, the AI corresponding to a preamble transmission in a certain access slot is transmitted at a specific position within a certain AICH frame. The RACH/AICH timing relationship is critical. As a result, it must be thoroughly verified to ensure that all of the necessary processing can be performed within the given timing windows.

A RACH finite-state machine implemented in L1 software controls the overall UE behavior during the RACH procedure (FIG. 3). This state machine configures all involved units of the UE modem. The UE transmitter sends the RACH preambles and the RACH message part on the uplink. The base-station receiver has to detect the RACH preambles and generate the acquisition indicators. The base-station transmitter maps the AIs onto the AICH. To verify the UE's behavior, one can model the preamble detection and AI generation by reading AI values from a stimulus file.

The base-station transmitter transmits the AICH on the downlink, along with the other physical channels. In the UE, the RAKE receiver demodulates the AICH in the same way as any other physical channel. The AI detector (inside the RAKE receiver) must then detect the AI that corresponds to the access slot. It also needs to find the signature used for the uplink preamble transmission. L1 software reads the result, which goes to the RACH finite-state machine. On the UE side, the RACH procedure involves the baseband modem receiver and transmitter, as well as the L1 software.

SYSTEM-DESIGN CONCERNS Because the UE cannot initiate a call without a successful RACH transmission, thorough testing of the UE random-access behavior is important. Functional verification must be performed for all involved hardware and software units at all levels of hierarchy. It also is vital to perform hardware/software co-verification of the interface between the RAKE receiver's AI detector and the L1 software, in conjunction with the interface between the L1 software and the UE transmitter. In addition, the performance of the UE's AICH/AI detection must be verified. Inadequate performance could keep the UE from initiating a call at the edge of the coverage area. Because AICH/AI detection is a two-step process, its performance optimization involves many tasks. They are listed here:

Step 1: Optimize the AICH demodulation performance. This process is identical to optimizing the RAKE receiver performance, which in turn implies optimizing several receiver units and their algorithms. These include:

The multipath searcher, which depends on the following for its performance:

The searching algorithm (e.g., single versus double dwell and serial versus parallel search)
The sampling rate (number of samples per chip), which determines the resolution of the power-delay profile
The post-filtering of the instantaneous delay profile, which determines the signal-to-noise ratio (SNR)
The detection threshold for discarding noise-only echoes
The finger-assignment algorithm, especially in the case of closely spaced echoes

The channel estimation, which requires the optimization of units, such as the channel-estimation filters, as a function of the SNR and mobile velocity
The combining algorithm

Step 2: Optimize the AI detector performance. Detecting the AI is fundamentally different from the detection of any other physical channel or L1 indicator. All of the latter are +1/−1-valued signals that only require a sign decision. On the other hand, the AI is a three-valued (−1, 0, +1) signal that requires a corresponding three-valued decision device. Detecting the AI therefore requires knowledge of the received signal's amplitude, which can vary rapidly due to fast fading.
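A three-valued decision device can be sketched as a pair of thresholds that scale with an amplitude estimate. The source of that estimate, the function name, and the `threshold_ratio` value are assumptions made for illustration only.

```python
def detect_ai(soft_value, amplitude_estimate, threshold_ratio=0.5):
    """Map a combined AI soft value to -1, 0, or +1.

    amplitude_estimate is the expected magnitude of a transmitted +1/-1 AI
    at the combiner output (assumed to be available from channel estimation).
    """
    threshold = threshold_ratio * amplitude_estimate
    if soft_value > threshold:
        return 1
    if soft_value < -threshold:
        return -1
    return 0


# The same soft value leads to different decisions as the channel fades.
in_deep_fade = detect_ai(0.3, amplitude_estimate=0.4)   # threshold is 0.2
in_good_channel = detect_ai(0.3, amplitude_estimate=1.0)  # threshold is 0.5
```

A sign decision would have returned +1 in both cases. The scaled thresholds are what let the detector report "no acknowledgement" when the received value is small relative to the current channel amplitude.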

FIXED-POINT DESIGN The RACH alone demands the optimization of many interdependent algorithms. Note that the applied algorithms significantly affect the resulting modem performance, as well as the design's complexity. Consequently, algorithm choice and optimization are an important first step in the system design of 3G modems. If one uses floating-point models for this purpose, the algorithms can be quickly tested in simulation. This choice also eliminates the need to worry about unnecessary details. One can also use commercially available base-station and channel models that have been proven across many designs. This option would help to ensure standards compliance.

The next step toward a hardware implementation is the conversion from a floating- to a fixed-point representation. For this conversion to take place, the values involved must be "quantized." Quantization limits the precision of the values. It therefore limits the achievable performance of the algorithm as well. On the other hand, properly limiting the precision can significantly reduce the complexity of both the design process and the modem.

For a given algorithm, the floating-point representation embodies the best achievable performance. The floating-point model can thus be used as the performance reference for the fixed-point model. This approach provides a clear picture of how much precision can be lost in the fixed-point design. The parameters can then be quantized accordingly.
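The idea of measuring a fixed-point model against its floating-point reference can be sketched as follows. The word lengths, the test signal, and the function names are hypothetical. The signal-to-quantization-noise ratio is used here only as a simple stand-in for a real performance metric such as an error rate.

```python
import math


def quantize(value, frac_bits, total_bits):
    """Round to a signed fixed-point grid and saturate at the word limits."""
    scale = 1 << frac_bits
    max_code = (1 << (total_bits - 1)) - 1
    min_code = -(1 << (total_bits - 1))
    code = round(value * scale)  # Python rounds exact ties to the even integer
    code = max(min_code, min(max_code, code))
    return code / scale


def sqnr_db(reference, quantized):
    """Signal-to-quantization-noise ratio of a quantized sequence, in dB."""
    signal = sum(r * r for r in reference)
    noise = sum((r - q) ** 2 for r, q in zip(reference, quantized))
    if noise == 0:
        return math.inf
    return 10 * math.log10(signal / noise)


# Floating-point reference signal (an assumed test stimulus).
reference = [0.9 * math.sin(0.1 * n) for n in range(200)]
coarse = [quantize(x, frac_bits=4, total_bits=6) for x in reference]
fine = [quantize(x, frac_bits=8, total_bits=10) for x in reference]
loss_coarse_db = sqnr_db(reference, coarse)
loss_fine_db = sqnr_db(reference, fine)
```

Sweeping the word length in this way shows where additional bits stop buying measurable performance, which is the point at which the precision can reasonably be fixed.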

This capability is especially important if the designer has no absolute performance limits. For cellular mobile communications systems, absolute performance limits exist in the form of conformance test specifications. These specifications stipulate certain tests and their corresponding performance limits. But these standards generally specify only the overall performance limits. Consider a specification for the block error rate (BLER) at the output of the channel decoder. The BLER performance depends on the entire physical layer (analog front end, digital front end, modem, channel decoder, etc.). The standard does not provide modem or codec specifications, only overall performance tests. Thus, no absolute performance references or limits exist for the major sub-blocks that can be used in the design process.

This situation is not problematic if one starts with floating-point models for the sub-blocks. These models can be simulated together to see whether they work as required. A tolerable implementation loss with respect to the floating-point model can then be specified as the design criterion for the fixed-point model.

VERIFICATION Continuing the design flow, the receiver hardware and software models both have to be refined using parallel development strategies. For the hardware models, begin with floating-point models and then quantize to fixed-point models. Finally, create bit- and cycle-true fixed-point models. For the software, it's best to start with untimed functional models and convert them to timed transaction-level models. In each refinement step, verify the refined models against the models from the previous step. This will help ensure that the final design meets the given performance criteria.

The level of detail required to model the hardware/software interface increases accordingly. The simpler models used for algorithm verification can employ the dataflow-simulation technique. The bit- and cycle-true hardware and the timed software models can be implemented in a modeling language such as SystemC. With an appropriate simulation tool, one can combine all of these models in one top-level simulation model. This integration will greatly simplify the verification process, thereby speeding up the design flow.

The extreme complexity of 3G wireless systems demands that the design flow begin at the system level with algorithm optimization. The most efficient way to optimize algorithms is to simulate them in the form of floating-point models along with available base-station and channel models. Using standards-based models for the latter helps to guarantee standards compliance from the beginning of the design flow. At the end of this design phase, the optimized floating-point models can serve as performance gauges for all subsequent implementations of the design.