Attaining x86 compatibility often means trading power for performance. Intel addressed that tradeoff with a new Pentium M line of processors and support chips that targets an array of embedded applications. Intel has been listening to embedded designers and keeping an eye on the competition. The result is the new Pentium M. It's based on the Pentium III P6 micro-architecture, but with significant changes. With those changes, it can deliver better performance while consuming less power than alternative processors, especially in embedded environments. For example, the Pentium M uses a shorter execution pipeline than the Pentium 4. This may seem counterintuitive, but consider the environment. Real-time embedded applications tend to have more interrupts and task switches than desktop applications. The pipeline must be flushed more often in the embedded environment, so shortening the pipeline minimizes this overhead. For many applications, the Pentium M will run faster than a comparable-speed Pentium 4.
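The pipeline argument can be made concrete with a rough cost model. The sketch below is only illustrative: it assumes each flush wastes about one cycle per pipeline stage, and the stage counts, flush rate, and base cycles-per-instruction figure are hypothetical placeholders rather than Intel numbers.

```python
def effective_cpi(base_cpi, flushes_per_instruction, pipeline_stages):
    """Average cycles per instruction once flush overhead is charged.

    Assumption: a flush (interrupt, task switch, mispredicted branch)
    wastes about one cycle per pipeline stage.
    """
    return base_cpi + flushes_per_instruction * pipeline_stages


def run_time_seconds(instructions, clock_hz, base_cpi,
                     flushes_per_instruction, pipeline_stages):
    """Estimated run time for a workload under the model above."""
    cpi = effective_cpi(base_cpi, flushes_per_instruction, pipeline_stages)
    return instructions * cpi / clock_hz


# Hypothetical comparison at the same clock: only the pipeline depth differs.
WORKLOAD = 1_000_000_000   # instructions (placeholder)
CLOCK_HZ = 1.6e9           # identical clock for both designs
FLUSH_RATE = 0.02          # flushes per instruction (placeholder)

short_pipe = run_time_seconds(WORKLOAD, CLOCK_HZ, 1.0, FLUSH_RATE, 12)
long_pipe = run_time_seconds(WORKLOAD, CLOCK_HZ, 1.0, FLUSH_RATE, 20)
```

Under these assumptions the shorter pipeline finishes sooner at equal clock speed, and the gap widens as the flush rate rises, which is the situation in interrupt-heavy embedded code.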
The Pentium M incorporates a large 1-Mbyte level 2 cache, twice the size of the Pentium III's. The cache size is key to keeping a significant portion of an embedded application on-chip, boosting performance while actually reducing power requirements. The large cache also helps in embedded applications that require large lookup tables, such as routers and switches.
Branch prediction is now done using three different techniques: local, bimodal, and global. Local branch prediction addresses the tight loops found in most applications and is common among processors that support branch prediction. Bimodal branch prediction tracks each branch to see whether it's taken more often than not, which is important for conditional code within loops. Finally, global branch prediction operates at the program level, tracking the pattern of branches being used to optimize execution.
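To illustrate the bimodal idea, here is a minimal sketch of the textbook scheme: a table of 2-bit saturating counters indexed by the low bits of the branch address. It is not Intel's implementation; the class name, table size, and trace format are assumptions made for the example.

```python
class BimodalPredictor:
    """Textbook bimodal predictor: one 2-bit saturating counter per slot."""

    def __init__(self, table_bits=10):
        self.mask = (1 << table_bits) - 1
        # Counter values 0-1 predict not taken, 2-3 predict taken.
        # Start every slot at 1 (weakly not taken).
        self.counters = [1] * (1 << table_bits)

    def predict(self, branch_address):
        """Return True if the branch is predicted taken."""
        return self.counters[branch_address & self.mask] >= 2

    def update(self, branch_address, taken):
        """Nudge the counter toward the actual outcome, saturating at 0 and 3."""
        slot = branch_address & self.mask
        if taken:
            self.counters[slot] = min(3, self.counters[slot] + 1)
        else:
            self.counters[slot] = max(0, self.counters[slot] - 1)


def count_mispredictions(predictor, trace):
    """Replay (address, taken) pairs and count wrong predictions."""
    misses = 0
    for address, taken in trace:
        if predictor.predict(address) != taken:
            misses += 1
        predictor.update(address, taken)
    return misses


# A loop branch taken nine times, then falling through on the tenth pass.
loop_trace = [(0x40, True)] * 9 + [(0x40, False)]
first_pass_misses = count_mispredictions(BimodalPredictor(), loop_trace)
```

Because the counter needs two wrong outcomes in a row to change its prediction, a single loop exit does not disturb the prediction for the next run of the same loop.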
The dedicated stack manager implements the advanced stack pointer (ASP) architecture. It's designed to eliminate the need for many of the load/store micro-ops typically used in stack-related operations.
Micro-op fusion is a great name for a straightforward performance enhancement. It works on pairs of micro-ops in an instruction stream, making it possible to execute two at once as long as they're not sequentially dependent. Examples include execution of address-store and data-store micro-ops, or a load and an add.
The enhanced SpeedStep implementation controls the chip's voltage and frequency. This technique is common in microcontrollers, particularly very low-power products, but less common in standalone processors like the Pentium. It's less sophisticated than the software-based LongRun technology used by Transmeta (www.transmeta.com).
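Why scale voltage along with frequency? A common first-order approximation for CMOS logic is that dynamic power grows with the square of the supply voltage times the clock frequency. The sketch below uses that approximation. The operating points and the selection policy are hypothetical and are not taken from Intel's SpeedStep documentation.

```python
def relative_dynamic_power(voltage_scale, frequency_scale):
    """First-order CMOS estimate: dynamic power scales with V^2 * f.

    Both arguments are ratios against the full-speed operating point.
    """
    return voltage_scale ** 2 * frequency_scale


def pick_operating_point(points, required_mhz):
    """Choose the slowest (MHz, volts) point that still meets the demand.

    Falls back to the fastest point if none is fast enough.
    """
    candidates = [p for p in points if p[0] >= required_mhz]
    return min(candidates) if candidates else max(points)


# Hypothetical (MHz, volts) table, for illustration only.
POINTS = [(600, 0.96), (1000, 1.16), (1600, 1.48)]

mhz, volts = pick_operating_point(POINTS, required_mhz=800)
full_mhz, full_volts = max(POINTS)
saving = 1.0 - relative_dynamic_power(volts / full_volts, mhz / full_mhz)
```

The point of the exercise is that dropping the clock alone saves power only linearly, while dropping the voltage with it compounds the saving.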
The Pentium M lacks some Pentium 4 features, most notably hyperthreading technology.
The Pentium M works with a number of chip sets designed for different environments. The E7501 is designed for embedded applications. It supports up to 4 Gbytes of DDR memory with error-correction-code support, and it can handle up to three 64-bit PCI/PCI-X controller hubs. This can be very handy in blade servers and communication systems. When combined with the 855 chip set and its integrated video support, the Pentium M is part of the Centrino mobile technology.
The 1.6-GHz version costs $625. The low-power 1.1-GHz version costs $257 and is designed to use half the power of the 1.6-GHz version. The processors are now widely available.