Arm provides two families of GPUs, Utgard and Midgard. Utgard provides graphics display support, while Midgard is the more advanced family and adds user computational capability as well. Arm's latest Mali-T658 architecture (Fig. 1) is the new top end of the Midgard family. It targets superphone and midrange smartphone platforms (Fig. 2) as well as embedded multimedia mobile devices.
The Mali-T658 doubles the number of cores, versus the Mali-T604, to eight and doubles the number of arithmetic units per core to four (Fig. 3). It also employs hardware-based job scheduling (Fig. 4) that can turn cores on and off, thereby reducing power requirements. This kind of scheduling is normally handled in software.
The new 8-core family meshes well with Arm's big.LITTLE announcement (see Little Core Shares Big Core Architecture). The Mali-T658 can be combined with the low-power Cortex-A7 and the powerful Cortex-A15 using Arm's CoreLink Interconnect. There is cache coherency between the GPU and CPU. Likewise, the MMU page table setup is the same for both. The Mali-T658 is also compatible with the 64-bit ARMv8 architecture.
The arithmetic units can handle double-precision floating-point values. This is key to supporting compute APIs including OpenCL, Google RenderScript, and Microsoft DirectCompute. More applications are starting to take advantage of this type of computational capability.
The Mali-T658, along with some combination of Cortex cores, will compete with platforms like Nvidia's Tegra 3. The Tegra 3 has a ULP (ultra-low-power) GeForce GPU with up to 12 cores. Four Cortex-A9 cores are on the CPU side of the Tegra 3.
AMD's Fusion APU (Accelerated Processing Unit) does not target smartphones, but it does combine a CPU and a GPU. The GPU supports OpenCL in addition to providing display graphics support.
Now the challenge for developers is how to balance graphics acceleration with computation acceleration.