Vector Processor Makes Short Work Of Complex DSP Algorithms

Sept. 15, 2003

With four vector pipelines, Telairity Semiconductor's TVP400 combines the signal-processing features of a high-performance DSP and the control capabilities of a 32-bit microcontroller (see the figure). It will be available later this year for use in ASIC applications as a 4- by 4-mm "hard" core. Initially, it will be offered using 0.13-µm design rules to ensure operation at 600 MHz.

The highly parallel architecture allows the TVP400 to keep up to 23 operations in flight at the same time. When running at 600 MHz, the core can execute a 256-point complex fast Fourier transform in just 2.1 µs, or a 64-tap finite impulse-response filter in just 29 ns/result (continuous).
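
The quoted timings translate into cycle counts with simple arithmetic. The sketch below is a back-of-the-envelope check rather than vendor data: the helper name `cycles` is hypothetical, and the only inputs are the 600-MHz clock and the two timings quoted above.

```python
CLOCK_HZ = 600e6  # clock rate quoted in the article


def cycles(duration_s: float, clock_hz: float = CLOCK_HZ) -> float:
    """Convert a duration in seconds to clock cycles at the given clock rate."""
    return duration_s * clock_hz


fft_cycles = cycles(2.1e-6)        # 256-point complex FFT
fir_cycles = cycles(29e-9)         # one output of a 64-tap FIR filter
taps_per_cycle = 64 / fir_cycles   # filter taps retired per clock, on average

print(f"FFT: about {fft_cycles:.0f} cycles")
print(f"FIR: about {fir_cycles:.1f} cycles per result, "
      f"{taps_per_cycle:.2f} taps per cycle")
```

The FIR figure works out to a little under four taps per clock, which appears consistent with four pipelines each completing roughly one multiply-accumulate per cycle.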

The core was crafted via Telairity's internally developed building-block design scheme. The approach includes a well-defined methodology, tools, and a complete library of pre-engineered, fully characterized hard IP building blocks that are reusable, generic, and portable.

To ensure that the processing blocks won't stall due to a lack of data, designers included 128 kbytes of SRAM (128 memory banks, each 512 words by 16 bits). It supplies data to the vector pipeline through a crossbar switch that permits eight memory reads and four writes simultaneously. To support the scalar processor, designers also included 16-kbyte instruction and data caches. Both eight-way set-associative caches have prefetch and lock capabilities.
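
The SRAM figures can be checked directly, and a small model shows why many banks help a crossbar serve several accesses at once. This is only a sketch: the article does not say how addresses map to banks, so the low-order interleaving in `bank_of` is an assumption, and both function names are hypothetical.

```python
BANKS = 128            # from the article
WORDS_PER_BANK = 512   # from the article
BITS_PER_WORD = 16     # from the article

total_bytes = BANKS * WORDS_PER_BANK * BITS_PER_WORD // 8
total_kbytes = total_bytes // 1024   # matches the quoted 128 kbytes


def bank_of(word_addr: int) -> int:
    """ASSUMPTION: low-order interleaving, so consecutive words sit in consecutive banks."""
    return word_addr % BANKS


def conflict_free(word_addrs) -> bool:
    """True if every address in one cycle's batch falls in a different bank."""
    banks = [bank_of(a) for a in word_addrs]
    return len(set(banks)) == len(banks)


unit_stride_reads = list(range(8))                      # eight consecutive words
bank_stride_reads = [i * BANKS for i in range(8)]       # stride equal to the bank count

print(total_kbytes, "kbytes")
print("unit stride conflict-free:", conflict_free(unit_stride_reads))
print("stride of 128 conflict-free:", conflict_free(bank_stride_reads))
```

Under this assumed mapping, eight unit-stride reads land in eight different banks, while a stride equal to the bank count sends every access to the same bank.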

Rather than use one of the commercial scalar 32-bit cores from ARM, ARC, MIPS, or another source, Telairity crafted its own scalar engine, since it could be optimized to control the four vector pipelines as well as perform system control operations. The quad-pipeline vector engine can operate on vectors of up to 32 elements, or on four 32-element vectors in a single-instruction/multiple-data mode so that one instruction delivers 128 results. The vector engine also operates efficiently on short vectors and can perform gather-scatter operations, which help arrange data in memory to achieve the highest computational efficiency.
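
The single-instruction/multiple-data mode and gather-scatter access can be illustrated in plain Python. This is a behavioral sketch only, not the TVP400 instruction set: it assumes each vector holds 32 elements (the register length given in the article), and `simd_add`, `gather`, and `scatter` are hypothetical names.

```python
PIPELINES = 4        # vector pipelines, from the article
VECTOR_LENGTH = 32   # elements per vector register, from the article


def simd_add(a_vectors, b_vectors):
    """Apply one element-wise add across every pipeline's vector pair."""
    return [[x + y for x, y in zip(a, b)] for a, b in zip(a_vectors, b_vectors)]


def gather(memory, indices):
    """Collect scattered memory words into one dense vector."""
    return [memory[i] for i in indices]


def scatter(memory, indices, values):
    """Write a dense vector back to scattered memory locations (in place)."""
    for i, v in zip(indices, values):
        memory[i] = v


# One vector per pipeline; a single "instruction" produces 4 x 32 results.
a = [[p * 100 + i for i in range(VECTOR_LENGTH)] for p in range(PIPELINES)]
b = [[1] * VECTOR_LENGTH for _ in range(PIPELINES)]
sums = simd_add(a, b)
result_count = sum(len(v) for v in sums)

# Gather three non-contiguous words, then scatter them to the start of memory.
memory = list(range(0, 160, 10))
picked = gather(memory, [12, 3, 7])
scatter(memory, [0, 1, 2], picked)

print(result_count, "results from one SIMD add")
```

Gathering non-contiguous words into a dense vector is what lets the pipelines run at full length on data that is not laid out contiguously.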

Each of the four vector pipelines can simultaneously perform four load operations and two store operations. The resources in each pipeline include 16 vector registers (each contains 32 16-bit elements), two adders with 24-bit accumulators, and one multiplier-accumulator with 40-bit accumulation. An integrated 1-Mbyte ROM can be used to hold the application program, so the core can operate as a dedicated application processor when embedded in a larger chip design.
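
The value of a 40-bit accumulator for 16-bit data can be seen with a short headroom calculation. The sketch assumes signed two's-complement operands and accumulator, which the article does not state; `mac` is a hypothetical helper that raises an error on overflow, whereas real hardware might saturate or wrap.

```python
OPERAND_BITS = 16       # element width, from the article
ACCUMULATOR_BITS = 40   # multiplier-accumulator width, from the article

# ASSUMPTION: signed two's-complement arithmetic throughout.
largest_product = (-(2 ** (OPERAND_BITS - 1))) ** 2    # (-32768)^2 = 2^30
accumulator_max = 2 ** (ACCUMULATOR_BITS - 1) - 1
accumulator_min = -(2 ** (ACCUMULATOR_BITS - 1))

# Worst-case products that can be summed before the accumulator overflows.
safe_accumulations = accumulator_max // largest_product


def mac(acc: int, x: int, y: int) -> int:
    """Multiply-accumulate with an explicit overflow check."""
    acc += x * y
    if not accumulator_min <= acc <= accumulator_max:
        raise OverflowError("accumulator range exceeded")
    return acc


# A 64-tap filter with worst-case inputs stays well inside the range.
acc = 0
for _ in range(64):
    acc = mac(acc, -32768, -32768)

# Vector register storage per pipeline: 16 registers x 32 elements x 16 bits.
register_bytes_per_pipeline = 16 * 32 * OPERAND_BITS // 8

print(safe_accumulations, "worst-case products fit")
```

Under these assumptions the accumulator has room for several hundred worst-case products, so the 64-tap filter cited earlier would not need intermediate scaling.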

Telairity Semiconductor Inc., www.telairity.com
