LTE Comm Processor Implements Multi-Dispatch VLIW Architecture

Tensilica's latest processor for the LTE communications market implements a multi-dispatch VLIW architecture.

Feb. 7, 2011

ConnX SSP16 Soft Stream Processor

Tensilica has delivered a range of processor architectures, including the Xtensa LX (see Power Play For The SoC Developers), DSPs, and the ConnX comm processor (see Dataplane Processing Unit More Flexible Than DSP). Its latest ConnX processor suite combines four platforms to address the LTE market: the ConnX BBE64-128, the ConnX Turbo16, the ConnX SSP16, and the ConnX BSP3. Each targets one aspect of an LTE base station to provide optimum configurability, power usage, and performance while minimizing chip real estate.

The high-end, 28-nm ConnX BBE64-128 delivers 100 GigaMACs (multiply-accumulates) of performance and is designed as the central workhorse of an LTE (Long-Term Evolution) Advanced system. It employs a multi-slot VLIW architecture that is more akin to a CISC architecture with multiple pipelined execution units. As in a normal VLIW architecture, the fixed-size VLIW instructions are split into sub-instructions, or slots, each for a particular type of execution unit. The difference is that there can be more than one execution unit per type, and an instruction slot does not map to a particular execution unit. Instead, each slot is dispatched to an idle execution unit of the right type, in a fashion similar to a CISC system.

The execution units are also pipelined and interlocked, so the programmer and compiler do not have to contend with race conditions. A typical VLIW system normally executes each instruction in a single cycle or has the compiler handle multiple-cycle executions. Tensilica's approach is easier to work with and does not have problems with interrupt handling, because an instruction will always complete and a later instruction cannot disrupt one that is still in flight.

The compiler can optimize performance by making sure that the right number of instruction slots are filled with code that will execute efficiently. This is similar to RISC architectures, where instruction ordering can improve performance because a poorly placed instruction might cause the system to idle while it waits for an earlier instruction to complete.
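The effect of instruction ordering on an interlocked pipeline can be illustrated with a toy model. This is only a sketch under stated assumptions, not a model of any Tensilica core: the single-issue pipeline, the `run` function, the register names, and the latencies are all made up for illustration.

```python
def run(program):
    """Return the cycle in which the last result becomes available.

    program: list of (dest, sources, latency) tuples, issued in order,
    at most one per cycle. An instruction waits (interlocks) until
    every one of its source registers has been written.
    All latencies are illustrative assumptions.
    """
    ready_at = {}   # register name -> cycle in which its value is available
    cycle = 0       # earliest cycle in which the next instruction may issue
    finish = 0
    for dest, sources, latency in program:
        # The hardware interlock: stall until all sources are ready.
        start = max([cycle] + [ready_at.get(src, 0) for src in sources])
        ready_at[dest] = start + latency
        finish = max(finish, ready_at[dest])
        cycle = start + 1
    return finish


# Two 3-cycle loads, each followed by a 1-cycle operation on its result.
naive = [("a", [], 3), ("b", ["a"], 1), ("c", [], 3), ("d", ["c"], 1)]
# Same work, but both loads are issued before either dependent operation.
reordered = [("a", [], 3), ("c", [], 3), ("b", ["a"], 1), ("d", ["c"], 1)]

print(run(naive), run(reordered))
```

Both orderings produce the same results because the interlocks guarantee correctness. The reordered version simply spends fewer cycles stalled, which is the kind of scheduling a compiler can do.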

Essentially, the decoder takes the instructions in each slot and tries to assign them to an execution unit. A VLIW instruction will not complete until all the slots have been given over to an execution unit. The ConnX BBE64-128 has two 64-bit MAC units, four SIMD ALUs, and four regular ALUs.
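The dispatch behavior described above can be sketched in a few lines. The unit counts come from the article, but everything else is an assumption made for illustration: the unit-type names, the `dispatch` and `cycles_to_issue` functions, and the simplification that a slot occupies a unit only in the cycle in which it is issued.

```python
from collections import Counter

# Unit counts from the article; the type names are hypothetical labels.
UNITS = {"mac": 2, "simd": 4, "alu": 4}


def dispatch(bundle, busy):
    """Try to hand every slot of one VLIW instruction to an idle unit.

    bundle: list of unit-type names, one per slot.
    busy:   Counter of units of each type already occupied this cycle.
    Returns (issued, stalled) lists of slot indexes.
    """
    free = {kind: UNITS[kind] - busy.get(kind, 0) for kind in UNITS}
    issued, stalled = [], []
    for index, kind in enumerate(bundle):
        if free[kind] > 0:
            free[kind] -= 1      # any idle unit of the right type will do
            issued.append(index)
        else:
            stalled.append(index)
    return issued, stalled


def cycles_to_issue(bundle, busy_per_cycle):
    """Count the cycles until every slot has been given to a unit."""
    pending = list(bundle)
    for cycles, busy in enumerate(busy_per_cycle, start=1):
        _, stalled = dispatch(pending, busy)
        pending = [pending[i] for i in stalled]
        if not pending:
            return cycles
    raise RuntimeError("instruction never fully issued")


# One MAC unit is busy in the first cycle, so the second MAC slot waits.
print(cycles_to_issue(["mac", "mac", "simd"], [Counter(mac=1), Counter()]))
```

The point of the sketch is that a slot names a type of unit rather than a specific unit, so the instruction only stalls when every unit of a needed type is occupied.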

This same approach is taken with the other VLIW-based architectures, like the ConnX SSP16 Soft Stream Processor (Fig. 1). It has a two-slot VLIW instruction that drives a SIMD unit and two ALUs.

The ConnX BBE64-128 also incorporates a range of new features. Its “soft bit” vector data types support operations including arbitrary field insertion and extraction for complex transmit operations. It has parallel register files for 10/20-bit and 40-bit data types. There are single-cycle 16-way complex radix-4 and radix-8 FFT (fast Fourier transform) and DFT (discrete Fourier transform) instructions. The instruction set supports interleaving for all bit, byte, half-word, and word vector types for flexibility and efficiency in the HARQ (hybrid automatic repeat request), forward error correction, and convolutional coding found in LTE applications. The AXI interface allows for easy shared-memory connection design when incorporating other cores.
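For readers unfamiliar with field insertion and extraction, the scalar form of the operation looks like the sketch below. It is a generic illustration, not the ConnX instruction set: the function names and the example word are assumptions, and the processor would apply the equivalent operation across every lane of a vector at once.

```python
def extract_field(word, offset, width):
    """Return the `width`-bit field of `word` that starts at bit `offset`."""
    return (word >> offset) & ((1 << width) - 1)


def insert_field(word, offset, width, value):
    """Return `word` with the `width`-bit field at `offset` set to `value`."""
    mask = ((1 << width) - 1) << offset
    # Clear the old field, then drop in the new value (truncated to fit).
    return (word & ~mask) | ((value << offset) & mask)


word = 0b10110110
field = extract_field(word, 2, 4)        # bits 2..5 of the word
patched = insert_field(word, 2, 4, 0)    # same word with that field zeroed
print(bin(field), bin(patched))

# A software "vector" version just repeats the operation per lane.
lanes = [0x12, 0x34, 0x56]
print([extract_field(lane, 4, 4) for lane in lanes])
```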

The ConnX SSP16 Soft Stream Processor targets channel encoding as well as modulation and demodulation chores. It includes a 16-way SIMD baseband core optimized for the processing of soft bits. It can accelerate wireless communication PHY routines such as Viterbi, HARQ, and de-rate matching.
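As a rough idea of what soft-bit processing involves, the sketch below combines the soft bits of a HARQ retransmission with the stored ones by saturating addition (the approach usually called Chase combining) and then makes hard decisions. The 8-bit soft-bit width, the sign convention, and the function names are assumptions chosen for illustration; a 16-way SIMD core would process 16 such lanes per operation.

```python
def saturate(value, bits=8):
    """Clamp `value` to the signed range of a `bits`-wide soft bit."""
    high = (1 << (bits - 1)) - 1
    low = -high - 1
    return max(low, min(high, value))


def harq_combine(stored, received, bits=8):
    """Add a retransmission's soft bits to the stored ones, lane by lane."""
    return [saturate(s + r, bits) for s, r in zip(stored, received)]


def hard_decision(soft_bits):
    """Map soft bits to 0/1. Assumed convention: a negative value means 1."""
    return [1 if soft < 0 else 0 for soft in soft_bits]


first = [100, -20, 5, -128]     # soft bits from the first transmission
second = [60, -30, -9, -1]      # soft bits from the retransmission
combined = harq_combine(first, second)
print(combined, hard_decision(combined))
```

The magnitude of a soft bit expresses confidence, so adding retransmissions strengthens decisions that agree and lets a later, stronger observation overturn a weak one, as happens in the third lane here.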

The ConnX BSP3 Bit Stream Processor is about one-quarter the size of the SSP16 and usually handles channel decode chores. It is designed for processing and control of bit streams and can accelerate wireless communication PHY routines such as bit mapping, bit interleaving and turbo encoding.
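Bit interleaving is one of the simpler routines of this kind to show in software. The sketch below is a generic row-in, column-out block interleaver, not the exact LTE sub-block interleaver, and the function names and column count are assumptions for illustration.

```python
def interleave(bits, columns):
    """Write bits row by row into a matrix, then read column by column."""
    if len(bits) % columns:
        raise ValueError("length must be a multiple of the column count")
    return [bits[i]
            for column in range(columns)
            for i in range(column, len(bits), columns)]


def deinterleave(bits, columns):
    """Undo interleave(); reading back is a transpose with the row count."""
    if len(bits) % columns:
        raise ValueError("length must be a multiple of the column count")
    rows = len(bits) // columns
    return interleave(bits, rows)


data = [0, 1, 2, 3, 4, 5]            # positions stand in for bits
shuffled = interleave(data, 3)
print(shuffled, deinterleave(shuffled, 3))
```

Spreading neighboring bits apart this way means a burst of channel errors lands on bits that are far apart after deinterleaving, which is easier for the error-correcting code to repair.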

The multi-standard ConnX Turbo16 Turbo Decoder is about the same size as the SSP16 but is tailored as a programmable turbo decoder for LTE and HSPA+. It can achieve decoded bit rates of 150 Mbits/s.

No single core addresses LTE well, but this combination efficiently covers base station design.

Tensilica

About the Author

William G. Wong

Senior Content Director - Electronic Design and Microwaves & RF

I am Editor of Electronic Design focusing on embedded, software, and systems. As Senior Content Director, I also manage Microwaves & RF and I work with a great team of editors to provide engineers, programmers, developers and technical managers with interesting and useful articles and videos on a regular basis. Check out our free newsletters to see the latest content.

You can send press releases for new products for possible coverage on the website. I am also interested in receiving contributed articles for publishing on our website. Use our template and send to me along with a signed release form.

Check out my blog, AltEmbedded on Electronic Design, as well as my latest articles on this site that are listed below.

I earned a Bachelor of Electrical Engineering at the Georgia Institute of Technology and a Master's in Computer Science from Rutgers University. I still do a bit of programming using everything from C and C++ to Rust and Ada/SPARK. I also do a bit of PHP programming for Drupal websites and have posted a few Drupal modules.

I still get my hands on software and electronic hardware. Some of this can be found in our Kit Close-Up video series. You can also see me in many of our TechXchange Talk videos. I am interested in a range of projects from robotics to artificial intelligence.