The easiest way to get more performance from a DSP engine is to crank up the speed. But between process and architectural limitations, clock speeds can only go up so far. As the circuits run faster, they consume more power, significantly limiting where or how they can be applied. By doing more in parallel at a lower clock rate, very-long-instruction-word (VLIW) architectures can beat the speed and power limitations.
Able to deliver gigaFLOPS-class performance without gigahertz clock speeds, Atmel's mAgic DSP core performs floating-point computations in the complex domain in a single cycle. Based on a VLIW architecture that runs up to 10 arithmetic operations in parallel, the core can perform 1.5 billion operations/s in total, including 1 GFLOPS of 40-bit floating-point arithmetic, with only a 100-MHz input clock (see the figure).
The core does all that while consuming less than 500 mW. That low power consumption lets designers integrate the core alongside a host processor such as an ARM, MIPS, PowerPC, or ARC core to implement a system-on-a-chip solution.
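The throughput and power figures quoted above follow from simple arithmetic, which the short Python sketch below reproduces. It is an illustration only: the constant and variable names are made up for this example, the 500-mW figure is treated as an upper bound, and the article does not break down what the operations beyond the 10 arithmetic ones are.

```python
# Back-of-the-envelope check of the quoted mAgic figures.
# All names here are illustrative; only the numbers come from the article.

CLOCK_HZ = 100e6           # 100-MHz input clock
ARITH_OPS_PER_CYCLE = 10   # up to 10 arithmetic operations in parallel
TOTAL_OPS_PER_SEC = 1.5e9  # quoted total operation rate
MAX_POWER_W = 0.5          # "less than 500 mW", used as an upper bound

# Peak floating-point rate: arithmetic operations per cycle times clock rate.
peak_flops = ARITH_OPS_PER_CYCLE * CLOCK_HZ

# Operations per cycle implied by the 1.5 billion operations/s figure.
implied_ops_per_cycle = TOTAL_OPS_PER_SEC / CLOCK_HZ

# Efficiency bounds that follow from the power ceiling.
min_gflops_per_watt = peak_flops / 1e9 / MAX_POWER_W
max_nj_per_flop = MAX_POWER_W / peak_flops * 1e9

print(f"peak rate:        {peak_flops / 1e9:.1f} GFLOPS")
print(f"implied ops/cycle: {implied_ops_per_cycle:.0f}")
print(f"efficiency:       at least {min_gflops_per_watt:.1f} GFLOPS/W")
print(f"energy:           at most {max_nj_per_flop:.2f} nJ per operation")
```

Ten arithmetic operations per cycle at 100 MHz gives the quoted 1 GFLOPS, and the 1.5 billion operations/s figure implies 15 operations of all kinds per cycle. At peak rate, the power ceiling works out to at least 2 GFLOPS/W, or about half a nanojoule per floating-point operation.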
The ability to deliver 1 GFLOPS at low power opens many opportunities in consumer, military, medical, industrial, and other areas. Once that compute throughput is available, possibilities include improved hands-free phones using audio beamforming, better hearing aids that run more-complex algorithms based on real-time differential-equation solving, and radar beamforming and anti-jamming, to name just a few.
When running at 100 MHz, the mAgic arithmetic core can perform a 1024-point floating-point fast Fourier transform (FFT) in just 5962 cycles (about 60 µs) or deliver 64 outputs from a 64-tap complex finite-impulse-response filter in just 4663 cycles (about 47 µs). The processor performs 40-bit floating-point operations based on the IEEE-754 standard.
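To put those cycle counts in context, the Python sketch below converts them to time at 100 MHz and estimates how many real floating-point operations each benchmark sustains per cycle. The operation counts are textbook assumptions, not Atmel's figures: a radix-2 FFT with (N/2)·log2(N) butterflies at 10 real operations each, and a complex FIR with 8 real operations per complex multiply-accumulate. The article does not say which FFT algorithm the core actually runs, and the function names are hypothetical.

```python
# Rough utilization estimate for the two quoted benchmarks.
# Operation counts are textbook assumptions, not vendor data.

CLOCK_HZ = 100e6  # clock rate used for the quoted benchmarks


def cycles_to_us(cycles: int, clock_hz: float = CLOCK_HZ) -> float:
    """Convert a cycle count to microseconds at the given clock rate."""
    return cycles / clock_hz * 1e6


def radix2_fft_flops(n: int) -> int:
    """Real flops for an n-point radix-2 FFT (n must be a power of two).

    Assumes (n/2)*log2(n) butterflies, each costing one complex multiply
    (4 mul + 2 add) plus two complex adds (4 add) = 10 real operations.
    """
    stages = n.bit_length() - 1  # log2(n) for a power of two
    return (n // 2) * stages * 10


def complex_fir_flops(outputs: int, taps: int) -> int:
    """Real flops for a complex FIR filter.

    Assumes each tap is a complex multiply-accumulate:
    4 mul + 2 add for the product, 2 add for the accumulation = 8.
    """
    return outputs * taps * 8


FFT_CYCLES = 5962  # 1024-point FFT, from the article
FIR_CYCLES = 4663  # 64 outputs of a 64-tap complex FIR, from the article

fft_flops_per_cycle = radix2_fft_flops(1024) / FFT_CYCLES
fir_flops_per_cycle = complex_fir_flops(64, 64) / FIR_CYCLES

print(f"FFT: {cycles_to_us(FFT_CYCLES):.2f} us, {fft_flops_per_cycle:.2f} flops/cycle")
print(f"FIR: {cycles_to_us(FIR_CYCLES):.2f} us, {fir_flops_per_cycle:.2f} flops/cycle")
```

Under these assumptions the FFT sustains roughly 8.6 real operations per cycle and the FIR filter roughly 7.0, both a sizable fraction of the core's peak of 10 arithmetic operations per cycle.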
To program the core, designers at Atmel crafted a modular application programming environment that includes a high-level macro-assembler, a C compiler, a simulator, visual debugging and simulation tools, and a real-time operating system.