VLIW DSP Engine Delivers 1 GFLOPS

July 7, 2003
The complex computational block on Atmel’s mAgic DSP core consists of four integer/floating-point multipliers, an adder, a subtractor, and two add-subtract integer/floating-point units. The subsystem also includes two shift/logic units, a min/max operator, and two seed generators for efficient division and inverse square-root computations (see the figure).

To support the computations, the core also includes an on-chip 8-kword by 128-bit program memory, a large multiported register file, an 8-kword by 80-bit data memory, multiple address generators, and an interface for a host CPU. The hardwired complex multiplier-accumulators perform 100 million complex multiply-accumulates/s.

The block’s architecture is optimized to natively handle complex arithmetic (single-cycle complex multiply or multiply and add), single-cycle butterfly computations for fast Fourier transforms (FFTs), and vector computations. Peak performance is achieved during the FFT butterfly computations, when 10 floating-point operations are executed every clock cycle.
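The figure of 10 floating-point operations per cycle can be checked by counting the real operations in one radix-2 butterfly. The sketch below is only an illustration in Python, not Atmel code: the function name `butterfly` and the constant `ASSUMED_CLOCK_HZ` are hypothetical, and the 100-MHz clock is an inference from the 100 million single-cycle complex multiply-accumulates/s quoted above rather than a figure stated in the article.

```python
# Assumption: one complex MAC per cycle at 100 million MACs/s implies 100 MHz.
ASSUMED_CLOCK_HZ = 100_000_000


def butterfly(a, b, w):
    """Radix-2 butterfly: returns (a + w*b, a - w*b, real_op_count)."""
    ar, ai = a.real, a.imag
    br, bi = b.real, b.imag
    wr, wi = w.real, w.imag

    # Complex multiply t = w * b: four real multiplies, one subtract, one add.
    p0 = wr * br
    p1 = wi * bi
    p2 = wr * bi
    p3 = wi * br
    tr = p0 - p1
    ti = p2 + p3
    flops = 6

    # Two add-subtract pairs, each producing a sum and a difference.
    xr, yr = ar + tr, ar - tr
    xi, yi = ai + ti, ai - ti
    flops += 4

    return complex(xr, xi), complex(yr, yi), flops


if __name__ == "__main__":
    x, y, flops = butterfly(1 + 2j, 3 - 1j, 1j)
    print(x, y, flops)
    print("peak FLOPS under the assumed clock:", flops * ASSUMED_CLOCK_HZ)
```

The count maps naturally onto the hardware described earlier: four multipliers, one adder, and one subtractor for the complex product, plus two add-subtract units that each deliver two results.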

Two independent address-generation units on the mAgic core allow the engine to generate two address pairs: one to access the left and right memory for reading the complex values (real and imaginary parts) and one to access the left and right memory for writing. The units support indexed addressing, linear addressing with stride, circular addressing, and bit-reversed addressing.
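For readers less familiar with these addressing modes, the following Python sketch shows the address sequences that stride, circular, and bit-reversed generators typically produce. It models the general techniques only; the function names and parameters are hypothetical and do not describe the mAgic register interface.

```python
def linear(base, stride, count):
    """Linear addressing with stride: base, base+stride, base+2*stride, ..."""
    return [base + i * stride for i in range(count)]


def circular(base, length, start, stride, count):
    """Circular addressing: the offset wraps around inside a buffer of `length` words."""
    return [base + (start + i * stride) % length for i in range(count)]


def bit_reversed(base, bits):
    """Bit-reversed addressing over 2**bits entries, as used to reorder FFT data."""
    addresses = []
    for i in range(1 << bits):
        reversed_index = 0
        for k in range(bits):
            if i & (1 << k):
                reversed_index |= 1 << (bits - 1 - k)
        addresses.append(base + reversed_index)
    return addresses


if __name__ == "__main__":
    print(linear(0, 2, 4))
    print(circular(100, 4, 2, 1, 6))
    print(bit_reversed(0, 3))
```

Bit-reversed order is what lets an in-place radix-2 FFT read or write its samples in natural order without a separate reordering pass.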

The mAgic core operates in two modes: the program load mode (system mode) and the execute (run) mode. In the program load mode, the core acts like a memory-mapped peripheral and is loaded through the host processor interface. Once loaded, the core is switched into the run mode and operates as a co-processor to the host.

When used with an ARM processor, the core behaves as a standard AMBA slave device, allowing access to different resources depending on the operating mode. In the system mode, the ARM core can access any internal mAgic register or resource and read or write data. As a result, the ARM core can initiate operations or perform debugging. In the run mode, the mAgic core runs under control of its own very-long-instruction-word (VLIW) program, and the ARM host has access only to a 1-kword by 40-bit dual-ported shared memory and to the mAgic command register.
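The mode-dependent visibility described above can be pictured with a small behavioral model. This Python sketch is purely illustrative: the class `MagicHostView`, its resource names, and the `switch_to_run` method are hypothetical, and the article does not say how the mode switch is actually triggered.

```python
SYSTEM, RUN = "system", "run"


class HostAccessError(Exception):
    """Raised when the host touches a resource hidden in the current mode."""


class MagicHostView:
    # In run mode the host sees only these two resources (per the article).
    RUN_MODE_VISIBLE = {"shared_memory", "command_register"}

    def __init__(self):
        self.mode = SYSTEM
        # Hypothetical resource names; each is modeled as a sparse address map.
        self.resources = {
            "program_memory": {},
            "data_memory": {},
            "register_file": {},
            "shared_memory": {},
            "command_register": {},
        }

    def _check(self, resource):
        if resource not in self.resources:
            raise KeyError(resource)
        if self.mode == RUN and resource not in self.RUN_MODE_VISIBLE:
            raise HostAccessError(f"{resource} is not host-visible in run mode")

    def host_write(self, resource, address, value):
        self._check(resource)
        self.resources[resource][address] = value

    def host_read(self, resource, address):
        self._check(resource)
        return self.resources[resource].get(address, 0)

    def switch_to_run(self):
        # Assumption: how the switch is triggered is not given in the article.
        self.mode = RUN


if __name__ == "__main__":
    core = MagicHostView()
    core.host_write("program_memory", 0, 0xABCD)   # allowed in system mode
    core.switch_to_run()
    core.host_write("shared_memory", 0, 42)        # still allowed in run mode
    try:
        core.host_read("program_memory", 0)
    except HostAccessError as err:
        print("blocked:", err)
```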

About the Author

Dave Bursky | Technologist

Dave Bursky is the founder of New Ideas in Communications, a publication website featuring the blog column Chipnastics – the Art and Science of Chip Design. He is also president of PRN Engineering, a technical writing and market consulting company. Prior to these organizations, he spent about a dozen years as a contributing editor to Chip Design magazine. Concurrent with Chip Design, he was also the technical editorial manager at Maxim Integrated Products. Prior to Maxim, Dave spent over 35 years working as an engineer for the U.S. Army Electronics Command and as an editor with Electronic Design magazine.
