Electronic Design

VLIW DSP Engine Delivers 1 GFLOPS

The complex computational block on Atmel’s mAgic DSP core consists of four integer/floating-point multipliers, an adder, a subtractor, and two add-subtract integer/floating-point units. The subsystem also includes two shift/logic units, a min/max operator, and two seed generators for efficient division and inverse square-root computations (see the figure).

To support the computations, the core also includes an on-chip 8-kword by 128-bit program memory, a large multiported register file, an 8-kword by 80-bit data memory, multiple address generators, and an interface for a host CPU. The hardwired complex multiplier-accumulators perform 100 million complex multiply-accumulates/s.

The block’s architecture is optimized to natively handle complex arithmetic (single-cycle complex multiply or multiply and add), single-cycle butterfly computations for fast Fourier transforms (FFTs), and vector computations. Peak performance is achieved during the FFT butterfly computations, when 10 floating-point operations are executed every clock cycle.
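The figure of 10 floating-point operations per cycle can be checked by writing a radix-2 butterfly in real arithmetic. The sketch below is illustrative only: the function name is hypothetical, and the mapping of operations onto the four multipliers, the adder, the subtractor, and the two add-subtract units is an inference from the unit list, not something the article spells out.

```python
def butterfly(a: complex, b: complex, w: complex):
    """Radix-2 butterfly written out as real operations (illustrative)."""
    # Complex multiply t = w * b: 4 multiplies, 1 subtract, 1 add.
    t_re = w.real * b.real - w.imag * b.imag
    t_im = w.real * b.imag + w.imag * b.real
    # Sum and difference outputs: 2 adds and 2 subtracts.
    top = complex(a.real + t_re, a.imag + t_im)
    bottom = complex(a.real - t_re, a.imag - t_im)
    return top, bottom


# 4 multiplies + 2 add/subtract for the product + 4 add/subtract for the outputs.
FLOPS_PER_BUTTERFLY = 4 + 2 + 4

top, bottom = butterfly(1 + 2j, 3 - 1j, -1j)
```

At one butterfly per cycle, 10 operations per cycle reaches the 1 GFLOPS in the headline at a 100-MHz clock. The clock rate is not stated in the text; 100 MHz is inferred here from the quoted 100 million complex multiply-accumulates/s, assuming one per cycle.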

Two independent address-generation units on the mAgic core allow the engine to generate two address pairs: one to access the left and right memories for reading complex values (real and imaginary parts) and one to access the left and right memories for writing. The units support indexed addressing, linear addressing with stride, circular addressing, and bit-reversed addressing.
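As a rough software picture of three of these modes, the sketch below generates the address sequences that strided, circular, and bit-reversed addressing would produce. The function names and parameters are hypothetical, and nothing here reflects the actual mAgic register layout or instruction encoding.

```python
def linear(base, stride, count):
    """Linear addressing with stride: base, base + stride, base + 2*stride, ..."""
    return [base + i * stride for i in range(count)]


def circular(base, start, stride, length, count):
    """Circular addressing: the offset wraps around inside a buffer of `length` words."""
    return [base + (start + i * stride) % length for i in range(count)]


def bit_reversed(base, bits):
    """Bit-reversed addressing over 2**bits words, as used to reorder FFT data."""
    def reverse(index):
        result = 0
        for _ in range(bits):
            result = (result << 1) | (index & 1)
            index >>= 1
        return result

    return [base + reverse(i) for i in range(1 << bits)]


strided = linear(0x100, 2, 4)        # every second word starting at 0x100
wrapped = circular(0, 6, 1, 8, 4)    # wraps from offset 7 back to offset 0
fft_order = bit_reversed(0, 3)       # reordering for an 8-point FFT
```

Bit-reversed addressing matters for FFTs because a radix-2 transform reads or writes its data in bit-reversed index order; generating those addresses in hardware avoids a separate reordering pass.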

The mAgic core operates in two modes: the program load mode (system mode) and the execute (run) mode. In the program load mode, the core acts like a memory-mapped peripheral and is loaded through the host processor interface. Once loaded, the core is switched into the run mode and operates as a co-processor to the host.

When used with an ARM processor, the core behaves as a standard AMBA slave device, allowing access to different resources depending on the operating mode. In the system mode, the ARM core can access any internal mAgic register or resource and read or write data. As a result, the ARM core can initiate operations or perform debugging. In the run mode, the mAgic core runs under control of its own very-long-instruction-word (VLIW) program. The ARM host only has access to a 1-kword by 40-bit dual-ported shared memory and to the mAgic command register.
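The access rules for the two modes can be summarized as a small model. This is only a sketch of the behavior described above: the class, method, and resource names are hypothetical, and the article does not say how the mode switch is actually triggered.

```python
from enum import Enum


class Mode(Enum):
    SYSTEM = "system"  # program load mode
    RUN = "run"        # execute mode


# Resources the host can still reach in run mode, per the description above.
# The string names are placeholders, not real register or memory identifiers.
RUN_MODE_VISIBLE = {"shared_memory", "command_register"}


class MagicCoreModel:
    """Toy model of host visibility into the core in each mode."""

    def __init__(self):
        self.mode = Mode.SYSTEM  # assumed starting mode, so a program can be loaded

    def switch_to_run(self):
        self.mode = Mode.RUN

    def host_can_access(self, resource):
        if self.mode is Mode.SYSTEM:
            return True  # any internal register or resource is reachable
        return resource in RUN_MODE_VISIBLE


core = MagicCoreModel()
can_load_program = core.host_can_access("program_memory")
core.switch_to_run()
can_touch_program = core.host_can_access("program_memory")
```

In this model the host loads and inspects everything in system mode, then exchanges data with the running VLIW program only through the shared memory and the command register.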

TAGS: Digital ICs