Arm's Mali GPU architecture is designed to complement the Arm Cortex processor line. The multicore Mali architecture has been available for some time (see Multicore Mobile GPU Handles Computation Chores), and the new Mali-T600 line (Fig. 1), which supports up to eight cores, provides significant performance improvements while reducing power requirements. Its 64-bit double-precision support is closely linked to the 64-bit Arm Cortex architecture (see ARM Joins The 64-bit Club). For more compute-intensive numeric work, the line includes the Mali-T678.
The Mali-T678 doubles the number of arithmetic pipelines in each core (Fig. 2) compared to the Mali-T624 and the new Mali-T628. The various pipelines address different types of computational chores, with some, like the texture pipeline, addressing graphics-specific jobs. Pipeline utilization depends on the applications involved, and cores can be used for a range of work, allowing the GPU to handle display and computational work simultaneously. All the platforms can provide computational services, but the Mali-T678 is simply better in this respect.
The Mali-T600 increases performance while implementing a more power-efficient architecture. Part of this comes from cache coherency between the CPU and GPU, which means the CPU does not have to flush its cache to share data with the GPU. The cache support extends to Arm's big.LITTLE CPU architecture (see Little Core Shares Big Core Architecture).
The Mali-T600 can handle a 20% clock frequency increase. It can also handle 4K2K resolutions, allowing Arm SoCs to be employed in future very-high-resolution displays.
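To put 4K2K in perspective, the arithmetic below estimates the pixel count and raw scan-out bandwidth of one such display. It is only a rough sketch: the 3840 x 2160 resolution, 4 bytes per pixel, and 60-Hz refresh rate are all assumptions chosen for illustration rather than Mali-T600 specifications, and the function names are hypothetical.

```python
# Back-of-the-envelope display arithmetic.
# Every constant here is an illustrative assumption, not a Mali-T600 specification.
FULL_HD = (1920, 1080)
UHD_4K2K = (3840, 2160)   # assumed 4K2K variant
BYTES_PER_PIXEL = 4       # assumed 32-bit RGBA framebuffer
REFRESH_HZ = 60           # assumed refresh rate


def pixel_count(resolution):
    """Number of pixels in a (width, height) resolution."""
    width, height = resolution
    return width * height


def framebuffer_bytes(resolution, bytes_per_pixel=BYTES_PER_PIXEL):
    """Size in bytes of one uncompressed frame."""
    return pixel_count(resolution) * bytes_per_pixel


def scanout_bytes_per_second(resolution, refresh_hz=REFRESH_HZ,
                             bytes_per_pixel=BYTES_PER_PIXEL):
    """Bytes read per second just to refresh the display."""
    return framebuffer_bytes(resolution, bytes_per_pixel) * refresh_hz


if __name__ == "__main__":
    ratio = pixel_count(UHD_4K2K) / pixel_count(FULL_HD)
    print(f"4K2K pixels: {pixel_count(UHD_4K2K)} ({ratio:.0f}x Full HD)")
    print(f"One frame:   {framebuffer_bytes(UHD_4K2K)} bytes")
    print(f"Scan-out:    {scanout_bytes_per_second(UHD_4K2K)} bytes/s")
```

Under these assumptions a single uncompressed frame is about 33 MB and the display alone reads roughly 2 GB/s, four times the figure for 1080p.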
One aspect of the Mali-T600 that helps in both performance and power management is support for Adaptive Scalable Texture Compression (ASTC). The Khronos Group, known for standards like OpenCL and OpenGL, is handling ASTC (see Khronos Releases ASTC Next-Generation Texture Compression Specification).
ASTC is royalty free and is designed to provide higher compression rates with lower overhead, which reduces the computation, and therefore the power, needed for a given job. It is integrated with the OpenGL ES and OpenGL 3D graphics APIs. ASTC is very efficient and can encode a wide variety of texture formats, at bit rates ranging from 8 bits per pixel to below 1 bit per pixel.
ASTC handles a range of formats including monochrome, luminance-alpha, RGB, and RGBA, as well as X+Y and XY+Z for surface normals. Each block of pixels in the image gets its own encoding method, which makes for a very efficient implementation in terms of density as well as quality and computation. This allows fractional-bit encoding and lets tradeoffs be made dynamically based on image content, which can vary widely.
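The fractional bit rates fall out of simple arithmetic. The sketch below assumes the layout described in the Khronos ASTC specification, in which every compressed block occupies 128 bits and the 2D block footprint ranges from 4 x 4 to 12 x 12 texels. The function names are hypothetical and exist only for this illustration.

```python
# ASTC bit-rate arithmetic, assuming fixed 128-bit blocks and the
# 2D block footprints listed in the Khronos ASTC specification.
ASTC_BLOCK_BITS = 128

FOOTPRINTS_2D = [
    (4, 4), (5, 4), (5, 5), (6, 5), (6, 6), (8, 5), (8, 6),
    (8, 8), (10, 5), (10, 6), (10, 8), (10, 10), (12, 10), (12, 12),
]


def bits_per_pixel(block_w, block_h):
    """Bit rate of a footprint: 128 bits spread over block_w * block_h texels."""
    return ASTC_BLOCK_BITS / (block_w * block_h)


def compressed_size_bytes(width, height, block_w, block_h):
    """Compressed size of a width x height texture; partial blocks round up."""
    blocks_x = (width + block_w - 1) // block_w
    blocks_y = (height + block_h - 1) // block_h
    return blocks_x * blocks_y * ASTC_BLOCK_BITS // 8


if __name__ == "__main__":
    for bw, bh in FOOTPRINTS_2D:
        size = compressed_size_bytes(1024, 1024, bw, bh)
        print(f"{bw:>2}x{bh:<2} {bits_per_pixel(bw, bh):5.2f} bpp  {size:>8} bytes")
```

A 4 x 4 footprint gives the 8-bit-per-pixel upper bound, while 12 x 12 gives about 0.89 bits per pixel. For a 1024 x 1024 texture, that is 1,048,576 bytes versus 118,336 bytes, compared with 4,194,304 bytes for uncompressed 32-bit RGBA.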
The Mali-T600 also has drivers that support the OpenCL Full Profile. OpenCL is used for GPU computation algorithms, although it is not restricted to GPUs. Arm is also supporting Renderscript on the Mali-T600.
Designers will definitely want to incorporate the Mali-T600 architecture into new designs because of its overall feature set, but its improved hardware support for OpenCL is especially significant. Other GPU vendors, including AMD and NVidia, have been pushing their architectures in this direction for a while. The Mali-T600 is not in the same league, but then again the SoC using it is unlikely to need the power and cooling system required for something like NVidia's new Kepler architecture (see GPU Architecture Improves Embedded Application Support). Those boards are found in 10-rack configurations consuming just a few watts of power. Actually, it is more like 400 kW.
The Mali-T600 may not be used to compute where to drill for oil or when the next ice age might occur, but it can improve everything from cleaning up a photo to making the AI sidekick in a video game much more intelligent.