Every engineer knows the importance of having the right tools for the job. ARM big.LITTLE processing gives designers those tools. By using a high-performance processor for compute-intensive tasks and a highly energy-efficient processor for less demanding jobs, design teams can extend battery life by up to 70% in applications with highly variable workloads, such as smartphones.
However, the processor architecture is only one part of the toolkit. Software teams also must be able to develop, optimise, and integrate code to get the best out of the big.LITTLE multicore architecture.
By using Virtualiser Development Kits (VDKs), design teams can start developing software up to 12 months before the hardware is available. VDKs allow software developers to simulate complex software stacks, such as Linux, Android, and multicore task migration software, using real-world user scenarios, so they can strike the right balance between energy efficiency and top-end performance.
High Performance And Extreme Energy Efficiency
Attempting to design a processor that addresses both very high performance and extreme energy efficiency can result in an architecture that doesn’t quite achieve either. To meet that challenge, the big.LITTLE processing concept combines a “big” high-performance processor (ARM Cortex-A15 MPCore processor) and a “little” energy-efficient processor (ARM Cortex-A7 MPCore processor) in an asymmetric, heterogeneous, multicore system.
The two processors share the same instruction set architecture (ISA). An interconnect that supports full cache coherency couples them. A shared controller directs interrupts to the active processor (Fig. 1). The architecture enables software developers to automate task migration between the two processor clusters and, when appropriate, allocate a single execution environment across both clusters (multiprocessing).
Developing Software For Multicore
The key challenge for software teams when creating any new system-on-chip (SoC) is to develop code before their target hardware exists. Even after the hardware becomes available, it can be difficult for developers to get the visibility into their code that they need to debug it effectively and productively.
To get the best out of the big.LITTLE architecture, developers must decide how to exploit variances in application workload before they can even get their hands on real silicon. To do that, they need an environment that lets them see into the device so they can clearly observe how allocating tasks between the processors affects both performance and power.
The Virtual Prototype Advantage
Virtual prototypes are fast, fully functional software models of entire systems. They run the same code that the design team will port to the hardware when it becomes available. Because virtual prototypes don’t depend on the physical hardware, design teams can make them available to the software team 12 months or more in advance of the silicon being ready, enabling a time-to-market advantage and an opportunity to win market share over competitors.
Virtualiser Development Kits
VDKs are software development kits (SDKs) with a virtual prototype as a simulation target. A VDK for a specific design includes the virtual prototype for that design, the right set of multicore debug and analysis tools, and sample software.
The Synopsys VDK Family for ARM Cortex processors includes a VDK for ARM big.LITTLE processing (Fig. 2). This VDK includes a complete virtual prototype representing a big.LITTLE processing Versatile Express board. This virtual prototype is built with Fast Models from ARM and DesignWare models from Synopsys. These models enable design teams to rapidly create virtual prototypes for most common mobile and consumer application platforms.
The VDK also includes multicore debugging and analysis tools, which provide full control and visibility. They synchronise debug across all processors and other components in the platform as well. Design teams can get up and running quickly with the VDK by modifying the sample software stacks for Linux, Android, and task migration that are available “out of the box.” The VDK is easy to configure and extend. Design teams can add their own peripherals and change the configuration of the big.LITTLE architecture.
Early Model Availability
Design teams can only deploy virtual prototyping environments if they have access to software models of the processors and the other components that the system comprises. Because ARM develops its processor models as part of its processor development, the Fast Models are available at the same time the processor launches.
ARM uses an identical validation suite for both Fast Models and RTL, which ensures the fidelity of the models to the hardware. Synopsys provides a comprehensive range of DesignWare interface IP models, including USB 3.0 and GMAC, to complement the application subsystem design.
To maximise the benefits of big.LITTLE processing, it is important to tune the task migration strategy toward the specific use cases of the device in which it is deployed and the profile of the individual user.
Typically, the Linux Dynamic Voltage and Frequency Scaling (DVFS) function controls task migration by treating the Cortex-A15 and Cortex-A7 processors as two different power states. Linux provides multiple governors that can control the transition between these states. The governors include performance, powersave, ondemand, and userspace.
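On a Linux target, the governor in use can be inspected from user space. The sketch below reads the standard cpufreq sysfs files for CPU 0. The function name read_governor_info is hypothetical, and the sketch assumes the platform exposes cpufreq under the usual sysfs path. How a particular big.LITTLE software stack maps governors onto cluster migration is not shown here.

```python
from pathlib import Path

# Standard Linux cpufreq sysfs location for CPU 0 (assumed to be present
# on the target; some platforms or virtual machines do not expose it).
CPUFREQ_DIR = Path("/sys/devices/system/cpu/cpu0/cpufreq")


def read_governor_info(cpufreq_dir=CPUFREQ_DIR):
    """Return (current_governor, available_governors), or None if unreadable."""
    try:
        current = (cpufreq_dir / "scaling_governor").read_text().strip()
        available = (cpufreq_dir / "scaling_available_governors").read_text().split()
    except OSError:
        # cpufreq is not exposed here (for example, on a non-Linux host).
        return None
    return current, available


if __name__ == "__main__":
    info = read_governor_info()
    if info is None:
        print("cpufreq information is not available on this system")
    else:
        current, available = info
        print(f"current governor: {current}")
        print(f"available governors: {', '.join(available)}")
```

Returning None rather than raising keeps the helper usable on development hosts where the sysfs files are absent.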
Having the kernel decide when to migrate, however, has both advantages and disadvantages. An advantage is that it works “out of the box” for any application. Based on a pre-defined CPU load threshold and workload sampling rate, Linux initiates task migration between the two clusters. The disadvantage is that migration may happen even when the user does not benefit. This is where software developers can add user-space governors to fine-tune the task migration.
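The threshold-and-sampling behaviour described above can be sketched as a small simulation. This is an illustrative model, not the kernel's implementation: the class name ThresholdGovernor, the threshold values (0.80 and 0.30), and the use of separate up and down thresholds are assumptions made for the example. Each call to sample() stands for one sampling period.

```python
from dataclasses import dataclass


@dataclass
class ThresholdGovernor:
    """Toy model of load-threshold migration between two clusters."""
    up_threshold: float = 0.80    # assumed: move to the big cluster above this load
    down_threshold: float = 0.30  # assumed: move to the little cluster below this load
    cluster: str = "A7"           # start on the energy-efficient cluster

    def sample(self, load: float) -> str:
        """Process one load sample (0.0 to 1.0) and return the active cluster."""
        if self.cluster == "A7" and load > self.up_threshold:
            self.cluster = "A15"
        elif self.cluster == "A15" and load < self.down_threshold:
            self.cluster = "A7"
        return self.cluster


def run(loads, governor=None):
    """Feed a sequence of load samples to a governor and record the clusters."""
    governor = governor or ThresholdGovernor()
    return [governor.sample(load) for load in loads]


if __name__ == "__main__":
    # One load sample per sampling period.
    print(run([0.10, 0.85, 0.60, 0.25, 0.90]))
    # prints ['A7', 'A15', 'A15', 'A7', 'A15']
```

In this model a single high sample is enough to trigger a migration, even if the burst is too short for the user to notice any gain. That is the drawback a user-space policy can address.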
The best strategy for triggering task migration between the clusters typically combines the kernel’s decisions with user-space governors in the Android power manager. For example, if the phone is idle, the screen is locked, and an RSS feed is updated, the power manager will make sure the processing is performed on the Cortex-A7, since the user is not waiting for the result. In the case of video playback, the power manager will ensure that the task doesn’t switch to the Cortex-A7, to avoid potential glitches in the audio or video.
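The two examples above can be expressed as a simple policy layered on top of a load threshold. The following is a minimal sketch under stated assumptions: DeviceState, choose_cluster, and the 0.80 threshold are hypothetical names and values, not part of Android or the VDK, and video playback is interpreted here as pinning work to the Cortex-A15.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeviceState:
    """Hypothetical snapshot of the inputs a power manager might consider."""
    screen_locked: bool
    video_playing: bool
    cpu_load: float  # 0.0 to 1.0


def choose_cluster(state: DeviceState, up_threshold: float = 0.80) -> str:
    """Pick a cluster from use-case rules first, then fall back to CPU load."""
    if state.video_playing:
        # Keep playback off the little cluster to avoid audio or video glitches.
        return "A15"
    if state.screen_locked:
        # Nobody is waiting for the result, so favour energy efficiency.
        return "A7"
    # No use-case rule applies: fall back to a load threshold (assumed value).
    return "A15" if state.cpu_load > up_threshold else "A7"


if __name__ == "__main__":
    rss_update = DeviceState(screen_locked=True, video_playing=False, cpu_load=0.95)
    playback = DeviceState(screen_locked=False, video_playing=True, cpu_load=0.20)
    print(choose_cluster(rss_update))  # prints A7
    print(choose_cluster(playback))    # prints A15
```

Checking the use-case rules before the load threshold means a background burst stays on the Cortex-A7 and a lightly loaded video stays on the Cortex-A15, which a load-only policy would get wrong in both cases.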
The “Active CPU status” window in the VDK’s hardware-software debug tool shows at any given moment whether the software tasks are running on the Cortex-A15 or the Cortex-A7 processor (Fig. 3).
In addition, the VDK works with third-party debuggers, such as the software debuggers from ARM and Lauterbach, to allow developers to view source code and zoom into bugs. The VDK supports the latest ARM Development Studio 5 (DS-5) Debugger, a software development tool suite that simplifies the development of Linux and Android native applications for ARM processor-based systems.
This use case illustrates how the VDK combines software models, a debugger, and the task migration software stack to give software developers an early start. They can analyse and optimise tasks and fine-tune their allocation between the big and little processor clusters to get the best possible performance and energy efficiency from the subsystem.
ARM big.LITTLE processing offers design teams unique capabilities to balance performance and energy efficiency for demanding applications with highly variable computational loads. The combination of ARM processors and Synopsys VDKs gives software developers the right control, visibility, and speed to bring up and debug software quickly and begin developing software up to 12 months before hardware availability.