Broadband wireless is now a reality. Towers are converting to 3G throughout Europe and Asia. 3G-enabled phones are flying off the shelf and service providers are making a substantial commitment to the new format. At the same time, consumers are happily embracing all of the new features and functionality that broadband wireless delivers.
Everyone wants to know what will drive adoption; give service providers and equipment vendors new revenue streams; and increase usage in both subscriptions and minutes used. Obviously, it will be something that leverages the broadband capabilities of the 3G network architecture—namely, mobile real-time multimedia communication.
At lower speeds, SMS and MMS work fine. Imagine if users were comfortable with non-real-time multimedia. 3G would be unnecessary, as 2G and 2.5G provide sufficient bandwidth—as long as the streams can be cached and reconfigured at the end point.
Obviously, the real applications for 3G are the real-time ones. They include video telephony (videoconferencing), video streaming, remote wireless surveillance, multimedia real-time gaming, video on demand, and more (FIG. 1). These "killer apps" will drive usage and increase service-provider revenue. Correspondingly, they also will raise equipment sales.
Originally, 3G was conceived as an "all-IP" solution by both the Third Generation Partnership Project (3GPP) and Third Generation Partnership Project 2 (3GPP2). In reality, however, 3G multimedia is not enabled by IP-based signaling such as the Session Initiation Protocol (SIP). The problem is that IP communications are sensitive to high bit error rates (BERs), and such high rates are found throughout the public cellular network.
Using IP as the underlying transport results in poor quality for conversational real-time communication. As an alternative, 3G real-time multimedia is now delivered over a circuit-switched protocol. Called 3G-324M, this protocol provides adequate quality for any sort of latency-sensitive applications.
3G-324M enables conversational real-time multimedia over third-generation (3G) technology. The 3G real-time multimedia services based on 3G-324M started in Japan. They have now expanded to the U.K., Italy, Australia, and Spain. They're currently finding footholds in more and more countries. Correspondingly, the number of subscribers continues to grow month after month in each of these markets. In addition, more and more 3G-324M-enabled mobile terminals (3G PDAs, smart phones, and feature phones) are becoming available.
Instead of operating over IP, as most communication protocols do today, 3G-324M operates over a Time Division Multiplexing (TDM) circuit-switched (CS) channel (FIG. 2). That channel is opened by the baseband protocol between communicating peers. TDM has the benefit of a fixed, low delay, and it avoids the per-hop routing that an IP communication path requires. For services that need a low, fixed delay in a high-bit-error-rate environment, such as conversational voice and video, it has been found to operate well in public cellular networks.
3G-324M may be a throwback to the circuit-switched world and not "next-generation IP." Unlike IP, however, it does work for conversational video calls in a cellular network. It also allows service providers and equipment developers to enable the broadband killer applications that were previously mentioned. Those applications are delivered as a hybrid of communication technologies based on IP and 3G-324M CS.
To put it simply: IP isn't ready to support real-time multimedia over wireless. Take video telephony, for example. It requires medium to high bandwidth, low delay (two-way), medium to high quality, and a continuous connection. To provide video that's acceptable to the mobile user, the wireless network must provide a certain quality of service (QoS). Frame-delay variation, bit errors, and frame loss can have severe effects on the video quality.
Even in the fastest CDMA2000 EV-DO network, 3G video telephony for conversational multimedia communications over IP was found to be unsuitable. By "packaging" many bits into an IP packet that is carried over a public mobile network, this approach incurred a large rate of packet loss. Because IP needs to be processed for addressing in each hop on the network, each packet loss and the request to retransmit caused additional delays. In a real operational scenario of public cellular networks, the overall delay that's caused is unacceptable for conversational multimedia services.
Often, an IP packet contains too many bit errors to be recovered. The entire packet then needs to be retransmitted. As a result, the multimedia experience becomes unacceptable. Yet when a circuit-switched time-division-multiplexing session was opened between two communicating peers, it was found to operate well in the same bit-error-rate conditions. Of course, this session boasted suitable error detection and correction for the bits (H.223 Annexes A and B) and concealment for the codecs (MPEG-4 SP-L0 and GSM-AMR).
3G-324M's role as the video-telephony enabler to all 3G technologies is now becoming clear. Consider a transmission link with a BER of 10^-5. It might be acceptable for non-real-time data transmission with some form of error correction. In a video stream, however, this error rate would cause a serious degradation in the quality of the received video. Frame-delay, frame-loss, and rate-control issues also have a significant impact on the quality of the video that's received. Simulation is needed to assess the picture quality under different propagation channels along with error-correction and/or concealment schemes.
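A quick calculation illustrates why a BER of this order hurts large, retransmitted units far more than small ones. The sketch below assumes independent bit errors (real cellular channels tend to be bursty). The unit sizes and the 64-kbit/s bearer rate are illustrative assumptions, not figures taken from this article.

```python
def prob_unit_corrupted(ber: float, bits: int) -> float:
    """Probability that at least one of `bits` bits is in error,
    assuming independent errors at bit error rate `ber`."""
    return 1.0 - (1.0 - ber) ** bits


def expected_bit_errors_per_second(ber: float, bitrate_bps: float) -> float:
    """Average number of corrupted bits per second at the given bit rate."""
    return ber * bitrate_bps


BER = 1e-5  # the example BER discussed in the text

# Hypothetical unit sizes: a small multiplexed unit versus a large IP packet.
for label, size_bytes in [("40-byte unit", 40), ("1500-byte packet", 1500)]:
    p = prob_unit_corrupted(BER, size_bytes * 8)
    print(f"{label}: {p:.2%} chance of containing at least one bit error")

# Assumed bearer rate of 64 kbit/s, for illustration only.
print("bit errors per second:", expected_bit_errors_per_second(BER, 64_000))
```

Under these assumptions, roughly one large packet in nine arrives damaged, while a small unit is hit only a fraction of a percent of the time. That is why localized detection and concealment can be preferable to retransmitting whole packets.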
To understand the role of 3G-324M today, it's important to know the technology's background. For conversational multimedia services in a 3G network, 3G-324M operates on an established circuit-switched channel between source and destination parties. Multipoint communication between more than two 3G-324M terminals is possible. It requires both a gateway to an IP network and an H.323 multipoint control unit (MCU).
3GPP, which supports UMTS technology, originally defined 3G-324M as part of Release '99 in December 1999. In August of 2002, it approved 3G-324M usage for CDMA2000. In 2003, TD-SCDMA became a 3GPP standard. It adopted 3G-324M and began to operate in China as the formal 3G standard.
3G-324M is an addressless protocol. It doesn't include the call setup with the baseband. The call setup for the protocol is defined in the following specifications:
- 3GPP TS 24.008: Mobile radio interface Layer 3 specification
- 3GPP TS 27.001: General on Terminal Adaptation Functions (TAF) for Mobile Stations (MSs)
- 3GPP TS 29.007: General requirements on interworking between the public-land mobile network (PLMN) and the integrated-services digital network (ISDN) or public-switched telephone network (PSTN)
- 3GPP TS 23.108: Mobile radio interface Layer 3 specification core network protocols; Stage 2 (structured procedures)
3GPP defines UMTS/W-CDMA solution architectures. The organization's codec working group (TSG-SA/WG4) was responsible for the specification of the visual phone. It defined 3G-324M. As a baseline of the terminal specification, ITU-T H.324M was adopted.
ITU H.324 was defined for visual PSTN phone terminals. Initially, it was developed for PSTN with the V.34 modem protocol. Its mobile extension is defined as H.324M (originally called H.324 with mandatory support of Annex C). H.324M was realized by improving the error resiliency of the multiplexing protocol, as defined in Annexes A, B, and C of H.223.
The major sub-protocols and procedures of 3G-324M are:
- Error-resilience services
- H.223 multiplexing/de-multiplexing protocol
- ITU-T H.245 call control
- Optional codecs to be used: MPEG-4 Simple Profile, H.263 for video, and adaptive multi-rate (AMR) for audio
For mobile conversational multimedia communication, error resilience is essential for error detection and concealment on the fly. H.223 provides Annexes A, B, C, and D for such services. Annexes A and B define the handling of light to moderate BER levels. These annexes were made mandatory by 3GPP. They're commonly used by vendors today.
In addition, MPEG-4 video provides tools for error resilience. It thereby minimizes the video-quality degradation that is caused by errors. These solutions don't reduce errors like forward error correction (FEC) or automatic repeat request (ARQ). But they can reduce the damage on decoded video quality.
MPEG-4 Visual (ISO/IEC 14496-2) is a generic video codec. One of its target areas is mobile communications. Error resiliency and high efficiency make this codec particularly well suited for 3G-324M.
MPEG-4 Visual is organized into profiles. Within a profile, various levels are defined. The profiles define subsets of toolsets. The levels are related to computational complexity. Among these profiles, Simple Visual Profile provides low complexity and error resiliency (through data partitioning, reversible variable-length coding (RVLC), a resynchronization marker, and header extension code). MPEG-4 allows various input formats including general formats like quarter common intermediate format (QCIF) and common intermediate format (CIF). It also is baseline compatible with H.263. Details on the MPEG-4 error-resilience services follow:
Resynchronization marker: The resynchronization marker confines the error propagation that is inherent in variable-length code (VLC) to a single frame. In MPEG-4, the resynchronization marker is inserted at the top of a new group of blocks (GOB) with the header information (macro-block, or MB, number and quantization parameters) and optional header extension code (HEC). Decoding can then be done independently. It's a good idea to place the resynchronization marker before important objects like people. This approach will improve error resilience with a minimum increase of overhead.
Byte alignment: Bit stuffing for the byte alignment provides additional error-detection capability through its violation check.
Data partitioning: A new synchronization code, which is named motion marker, separates the motion-vector (MV) and discrete-cosine-transform (DCT) fields. In this way, it prevents inter-field error propagation. Effective error concealment can therefore be performed. When errors are detected solely in the DCT field, the MB will be reconstructed using correct MV. Compared to the simple MB replacement of the previous frame, this approach results in better natural motion.
Reversible variable-length code (RVLC): The RVLC is designed to enable forward and backward decoding without significantly impacting coding efficiency. Ideally, this feature localizes error propagation into a single macro block.
Adaptive intra refresh (AIR): In contrast to the conventional cyclic version, AIR employs motion-weighted intra refresh. It results in better perceptual quality along with the quick recovery of corrupted objects.
Error detection and concealment: Errors can be detected by exceptions or violations in the decoding process. Concealment will then be applied. This functionality is included for mobile applications. The code point of H.324 can support MPEG-4 audio, thereby making it usable for an H.324 mobile-phone terminal.
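The idea behind reversible variable-length codes can be sketched with a toy codebook. The four palindromic codewords below are an illustrative assumption and are not the RVLC tables defined by MPEG-4. Because no codeword is a prefix or a suffix of another, the same bit string can be decoded from either end.

```python
# Toy symmetric codebook (hypothetical, not MPEG-4's tables).
CODEBOOK = {"A": "0", "B": "11", "C": "101", "D": "1001"}
DECODE = {code: sym for sym, code in CODEBOOK.items()}
MAX_LEN = max(len(code) for code in DECODE)


def encode(symbols):
    """Concatenate the codewords for a sequence of symbols."""
    return "".join(CODEBOOK[s] for s in symbols)


def decode_forward(bits, max_symbols=None):
    """Decode left to right. Returns (symbols, ok); ok is False when the
    decoder meets a bit pattern that matches no codeword, or when the
    input ends in the middle of a codeword."""
    symbols, buf = [], ""
    for bit in bits:
        if max_symbols is not None and len(symbols) >= max_symbols:
            return symbols, True
        buf += bit
        if buf in DECODE:
            symbols.append(DECODE[buf])
            buf = ""
        elif len(buf) >= MAX_LEN:
            return symbols, False  # invalid pattern: stop decoding here
    return symbols, buf == ""


def decode_backward(bits, max_symbols=None):
    """Decode right to left. This works with the same table because every
    codeword in the toy codebook reads the same in both directions."""
    symbols, ok = decode_forward(bits[::-1], max_symbols)
    return symbols[::-1], ok


clean = encode("ABDCA")
damaged = clean[:6] + "0" + clean[7:]   # flip one bit inside the "D" codeword
print(decode_forward(damaged))           # symbols recovered before the error
print(decode_backward(damaged, 2))       # symbols salvaged from the tail
```

In this toy, the caller decides how many trailing symbols to trust. A real decoder would rely on the resynchronization markers and the known macro-block count to bound the damaged region.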
A 3G-324M protocol is initialized after a circuit-switched channel is opened between two communicating parties. The H.223 multiplexing protocol is the first to be established between those parties. After initiating this protocol, the multiplexing process must be synchronized between the communicating parties. It's also important to establish the call control (H.245) as the first logical channel to be opened (channel 0).
The basic function of the multiplexing protocol is to interleave multiple media streams into a single stream. Such media streams could include video, speech, user data, and control signals (H.245). That single stream can then be sent over a transmission channel. 3G-324M uses the ITU-T H.223 mobile extensions of Level 2 as its multiplex protocol.
H.223 has a flexible mapping scheme that's suitable for a variety of media and a variable frame length. In its mobile extension, it adds stronger synchronization and control against channel errors without losing its flexibility. Four multiplexing levels exist, from Level 0 to Level 3. They are categorized according to their degree of error resiliency. Levels 0 through 2 are described here.
Multiplexing Level 0 is identical to the base H.223 specification. It provides multiplexing and QoS functions that are appropriate for each media type. Two layers, which are known as the adaptation and multiplexer (MUX) layers, realize these features (FIG. 3). Three types of adaptation layers are defined according to their media type (video, speech, or data):
- Adaptation Layer (AL) 1: User data and control signals (H.245). This AL assumes that the upper layer provides error control.
- Adaptation Layer 2: Speech. Error detection and a sequence-numbering mechanism are provided.
- Adaptation Layer 3: Video. This layer offers error detection, sequence numbering, and ARQ.
The multiplexer layer assembles multiple media packets into a single bit stream according to the selected multiplex pattern. That pattern is chosen out of up to 16 multiplex patterns. The MUX pattern can be defined arbitrarily through the session negotiation procedure.
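The pattern-driven interleaving described above can be sketched as follows. This is a simplified model: each table entry is a flat list of (logical channel, byte count) pairs, whereas the negotiated entries in a real session can be more elaborate. The table contents, channel numbers, and function names are hypothetical.

```python
from collections import deque
from typing import Deque, Dict, List, Tuple

# One table entry: an ordered list of (logical channel number, byte count).
Pattern = List[Tuple[int, int]]

# Hypothetical table contents; real entries are negotiated per session.
MUX_TABLE: Dict[int, Pattern] = {
    0: [(0, 4)],           # control (H.245) bytes only
    1: [(1, 2), (2, 6)],   # 2 speech bytes followed by 6 video bytes
}


def build_payload(mc: int, queues: Dict[int, Deque[int]]) -> bytes:
    """Assemble one payload by draining the per-channel queues in the
    order given by multiplex-table entry `mc`."""
    if not 0 <= mc <= 15:
        raise ValueError("multiplex code must select one of 16 entries")
    pattern = MUX_TABLE[mc]
    # Check first, so that a short queue never leaves a half-built payload.
    for channel, count in pattern:
        if len(queues[channel]) < count:
            raise ValueError(f"channel {channel} has fewer than {count} bytes queued")
    payload = bytearray()
    for channel, count in pattern:
        for _ in range(count):
            payload.append(queues[channel].popleft())
    return bytes(payload)


def split_payload(mc: int, payload: bytes) -> Dict[int, bytes]:
    """Receiver side: hand each slice of the payload back to its channel."""
    out: Dict[int, bytes] = {}
    offset = 0
    for channel, count in MUX_TABLE[mc]:
        out[channel] = out.get(channel, b"") + payload[offset:offset + count]
        offset += count
    return out


queues = {1: deque(b"AAAA"), 2: deque(b"VVVVVVVV")}
payload = build_payload(1, queues)
print(payload, split_payload(1, payload))
```

Because both sides hold the same table, only the short multiplex code needs to travel with each payload for the receiver to undo the interleaving.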
Header information is attached in order to control such a flexible multiplexing mechanism. It consists of a 4-b multiplex code (MC), 1-b packet marker (PM), and 3-b parity (HEC). As a delimiter of multiplexer-protocol data units (MUX-PDUs), 8-b HDLC synchronization flags ('01111110') are inserted. To prevent the flag emulation inside the payload, stuffing is defined ('0' bit insertion after every five succeeding '1s').
In Multiplexing Level 1, a 16-b PN sequence is used instead of an 8-b HDLC synchronization flag. It thereby improves the MUX-PDU synchronization over error-prone channels. Stuffing is prohibited to enable an octet-oriented flag search. This modification remarkably improves the flag-detection performance over error-prone channels. Because stuffing is no longer applied, however, there is a slight probability that payload data will emulate the flag. This multiplexing level is described in H.223 Annex A. Its goal is to provide detection and concealment services over channels with light error rates.
In Multiplexing Level 2, MUX-PDU payload length information and FEC for the header are added over the Level 1 modification. As a result, it promises much better synchronization and error resilience. An optional header field, which includes MC/PM/HEC for the previous frame, can be applied to improve error resilience against burst errors through time-diversity effects. This multiplexing level is described in H.223 Annex B. Its goal is to provide detection and concealment services over channels with moderate error rates. All Freedom of Mobile Access (FOMA) devices, which are approved by NTT-DoCoMo, support Multiplexing Level 2. This multiplexing level is the de facto standard today.
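An octet-oriented search for a 16-bit flag might look like the sketch below. The flag value used here is an assumed placeholder for illustration, and the actual PN sequence should be taken from H.223 Annex A. The `max_bit_errors` tolerance is likewise a hypothetical knob, added to show why a longer flag is easier to find on a noisy channel.

```python
from typing import List

FLAG16 = 0xE14D  # assumed placeholder value, for illustration only


def find_flags(data: bytes, flag: int = FLAG16, max_bit_errors: int = 0) -> List[int]:
    """Return the byte offsets at which a 16-bit word matches `flag`,
    allowing up to `max_bit_errors` differing bits. The search steps one
    octet at a time, which is possible because no bit stuffing is used."""
    hits = []
    for i in range(len(data) - 1):
        word = (data[i] << 8) | data[i + 1]
        differing = bin(word ^ flag).count("1")  # Hamming distance to the flag
        if differing <= max_bit_errors:
            hits.append(i)
    return hits


clean = bytes([0xE1, 0x4D, 0x01, 0x02, 0x03, 0xE1, 0x4D])
noisy = bytes([0xE1, 0x4D, 0x01, 0x02, 0x03, 0xE0, 0x4D])  # one flag bit flipped

print(find_flags(clean))                      # both flags found
print(find_flags(noisy))                      # exact match misses the damaged flag
print(find_flags(noisy, max_bit_errors=1))    # tolerant match recovers it
```

The tolerance trades missed flags for false detections: the looser the match, the more likely ordinary payload octets are to be mistaken for a flag.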
H.245 TERMINAL CONTROL
3G-324M uses ITU-T Recommendation H.245 as the multimedia-communications-control method. H.245 is used in a variety of applications, such as B-ISDN (within H.320), local-area networks (within H.323), and mobile communications. It has a wide variety of communications-control functions. Assuming it's used in an error-free environment, H.245 enables reliable control using in-channel request-response messaging.
Currently, ITU H.245 version 10 is ratified by the SG16 of ITU. A few vendors, such as RADVISION, support this version in their 3G-324M protocol toolkit. The minimal version to be supported is version 3. Support for the higher versions enables a richer set of call-control services. The rising video-codec protocol, H.264, requires advanced H.245 support with generic capabilities exchange.
Because 3G-324M rides on a channel that was opened between two communicating parties, it doesn't need any addressing like H.323. The gateway (e.g., between 3G-324M, H.320, H.323, and SIP) is expected to provide the interoperability between different networks. This gateway can be realized rather easily.
H.245 operation in 3G-324M requires support for the Numbered Simple Retransmission Protocol (NSRP) and Control Channel Segmentation and Reassembly Layer (CCSRL) sublayers. NSRP is defined in H.324/Annex A. Essentially, mobile terminals shall support the NSRP and the SRP mode. If both terminals start the session in Level 0, the SRP mode shall be used. If Multiplexing Level 2 is used, both terminals shall start with NSRP mode.
The CCSRL sublayer is used for carrying the large H.245 packets that are required for operation. H.245 provides the following functions: master-slave determination, capability exchange, logical channel management, multiplex table management, mode-change request, and miscellaneous commands and indications. The explanation for each function follows:
The master-slave determination figures out which terminal is the master at the beginning of the session. Due to the fact that H.245 is a symmetric control protocol, it's necessary to determine the master terminal. That terminal has the right to decide the conditions in case of conflict.
The capability exchange exchanges the capabilities that both terminals support. These capabilities include the mode of multiplexing; type of audio/video codecs; data-sharing mode and its related parameters; and/or other optional features.
Logical channel signaling opens/closes the logical channels for media transmission. This procedure also includes parameter exchange for the use of this logical channel.
Multiplex-table initialization/modification adds/deletes the multiplex-table entries.
The mode request requests the mode of operation from the receiver side to the transmitter side. In H.245, the choice of codecs and parameters is decided at the transmitter side. This choice takes into account the decoder's capability. If the receiver side has a preference within its capability, this procedure is used.
The round-trip-delay measurement enables an accurate quality-characteristic measurement.
The loop-back test is useful for device test during development or in the field. It helps to assure proper operation.
Miscellaneous call-control commands and indications request modes of communication, flow control, and conference operations. Or they indicate conditions of the terminal, such as jitter and skew, to the other side.
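The master-slave determination described earlier can be sketched as a comparison that each terminal runs on its own and its peer's values. The field names, the 24-bit width of the random number, and the direction of the modular comparison are assumptions made for this illustration. The authoritative rules are those in ITU-T H.245.

```python
import secrets

MODULUS = 1 << 24   # assumed width of the random status number
HALF = 1 << 23


def new_status_number() -> int:
    """Draw a random number that breaks ties between equal terminal types."""
    return secrets.randbelow(MODULUS)


def determine_role(local_type: int, local_number: int,
                   remote_type: int, remote_number: int) -> str:
    """Return 'master', 'slave', or 'indeterminate' for the local terminal."""
    if local_type > remote_type:
        return "master"
    if local_type < remote_type:
        return "slave"
    # Equal terminal types: compare the random numbers modulo 2**24.
    diff = (remote_number - local_number) % MODULUS
    if diff == 0 or diff == HALF:
        return "indeterminate"   # tie: both sides draw new numbers and retry
    return "master" if diff < HALF else "slave"


# Each side evaluates the same rule with the roles of local and remote swapped.
print(determine_role(50, 10, 50, 20), determine_role(50, 20, 50, 10))
```

The modular comparison is symmetric: whenever one side concludes that it is the master, the other side concludes that it is the slave, so the two terminals never disagree about who resolves conflicts.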
To provide these functions, H.245 defines the messages to be used and the procedures for handling those messages. H.245 defines each message parameter in Abstract Syntax Notation 1 (ASN.1), which provides readability and extensibility. To encode these ASN.1 messages into binary, it uses the Packed Encoding Rules (PER), which yield very bandwidth-efficient message transmission. After the multiplexing-level synchronization between the communicating parties is completed, the first logical channel opened (channel 0) is H.245 call control. It runs over CCSRL and NSRP, which ensure that the H.245 channel will be highly reliable and able to carry large packets during operation.
In order to operate 3G-324M services, handheld devices, base stations, gateways, and servers must all support this protocol. The handheld-device category includes 3G PDAs, smart phones, and feature phones. Those devices are the clients of the service. They can initiate and receive video calls and use video on demand (VOD). They also can perform data entry while a call session is active, such as sending dual-tone multi-frequency (DTMF) signals for an e-commerce transaction.
The base station is the network device of the operator. It authenticates the served client's handheld devices. The base station also initiates a bridge between the handheld device and a backbone packet-switched and circuit-switched network (with E1/T1 ports interfacing the backbone). The base station should comply with the following 3G-324M-related call-setup specifications:
- 3GPP TS 24.008: Mobile radio interface Layer 3 specification
- 3GPP TS 27.001: General on Terminal Adaptation Functions (TAF) for mobile stations (MSs)
- 3GPP TS 29.007: General requirements on interworking between the public-land mobile network and the integrated-services digital or public-switched telephone networks
- 3GPP TS 23.108: Mobile radio interface Layer 3 specification core network protocols; Stage 2 (structured procedures)
The gateway also plays a vital role in 3G-324M's success. It bridges the 3G-324M circuit-switched and IP network (H.323 or SIP) signaling. The gateway translates the call setup mentioned above into Q.931 (H.323) or SDP (SIP) messages. The call control of the H.245 over CS is transformed into H.245 (H.323) over IP or to SDP (SIP). The codecs are transformed as well. If there are no common codecs in the capability-exchange phase (H.245 or SDP capability-exchange procedure), transcoding is performed.
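The gateway's choice between passing media through and transcoding can be sketched as a search for a common codec. The codec names, the preference ordering, and the function names are illustrative assumptions. A real gateway would work from the full H.245 or SDP capability descriptions rather than from plain strings.

```python
from typing import Iterable, Optional, Sequence


def choose_codec(tx_preference: Sequence[str],
                 remote_rx_caps: Iterable[str]) -> Optional[str]:
    """Pick the transmitter's most preferred codec that the receiver
    has declared it can decode, or None if the sets do not overlap."""
    receivable = set(remote_rx_caps)
    for codec in tx_preference:
        if codec in receivable:
            return codec
    return None


def gateway_plan(cs_side_tx: Sequence[str], ip_side_rx: Iterable[str]) -> str:
    """Decide whether media can pass through or must be transcoded."""
    codec = choose_codec(cs_side_tx, ip_side_rx)
    return f"pass-through:{codec}" if codec else "transcode"


# Hypothetical capability sets for a 3G-324M handset and two IP endpoints.
handset_video = ["MPEG-4", "H.263"]
print(gateway_plan(handset_video, ["H.263", "H.261"]))  # common codec exists
print(gateway_plan(handset_video, ["H.261"]))           # no overlap
```

Passing media through is preferable when it is possible, because transcoding adds processing delay on a path that is already delay-sensitive.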
Lastly, the servers enable add-on services like auditing, video mail, and video on demand. They should have 3G-324M if they're interfaced directly with the base station's CS ports. The servers also may support DTMF to enable user controls like record, playback, or menu operation (OSD).
3G video telephony is paving its way toward the mass market month after month. Every day, mobile-device quality is enhanced, coverage improves, and there are more countries and 3G subscribers to call. The problems of carrying video-telephony communication over IP in a public 3G network won't be solved in the near future. Yet the 3G-324M solution does offer hope (FIG. 4). It is the only working 3G protocol with a rapidly growing number of subscribers. Most importantly, it boasts acceptance by all 3G technology camps including W-CDMA, TD-SCDMA, and CDMA2000. Across the globe, equipment developers and service providers are embracing this protocol for their 3G solutions.