Floor planning is among the most crucial steps in the design of a complex system-on-a-chip (SoC), as it represents the tradeoffs between marketing objectives and the realities of silicon at the targeted process geometry. We will begin by describing the impact of Moore’s Law on successive generations of silicon design. Then, we will detail the opportunities that next-generation silicon provides to marketing in defining the functionality of a new design. An explanation of the growing need to implement security in next-generation SoC designs follows. Next, the discussion details the tradeoffs the design architect makes to accommodate large third-party intellectual property (IP) blocks, including memory. It concludes with a description of the give and take between adding peripherals to the SoC and the impact on the SoC’s I/O pads and silicon area.
Table of Contents
- The Harsh Reality Of Moore’s Law
- Marketing’s Contribution To The Floor Plan
- Taking Security Into Consideration
- Floor Planning The Large Blocks
- Accommodating SRAM
- Trading Off Silicon And I/O Pads
- Adding User Features To The Floor Plan
Creating an SoC destined for today’s portable consumer electronics products presents a challenge for both engineering and marketing in a semiconductor company. Both teams must make tradeoffs between market demands, competitive pressure, and the engineering constraints of getting a design into silicon in a narrow market window. Consider the competitive pressure confronting semiconductor manufacturers to create the chips that system manufacturers demanded after Apple introduced the iPad.
The die area and its floor plan drive marketing and engineering to make the tradeoffs needed to bring a design to market. Ultimately, the system architect determines the final die size and floor plan. This discussion examines the forces that determine what the final floor plan will be. Beginning with the realities of Moore’s Law, it next examines marketing’s contribution to the design requirements. It then explores the cost of silicon and I/O pads and the role each plays in determining the SoC’s functionality.
According to Moore’s Law, every 22 months sees a doubling of the number of transistors per square millimeter of silicon. These additional transistors can be used to reduce the cost of the existing functionality, integrate more functions on an SoC, or add new functionality not available until now.
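As a rough illustration of the doubling claim, the sketch below computes the density multiplier after a given number of months. The function name `density_multiplier` and the smooth exponential model are assumptions made for illustration only; real process nodes arrive in discrete steps, and the 22-month default simply mirrors the figure quoted above.

```python
def density_multiplier(months: float, doubling_period_months: float = 22.0) -> float:
    """Idealized growth factor in transistors per mm^2 after `months`.

    Assumes a smooth exponential with a fixed doubling period
    (hypothetical model, not a foundry roadmap).
    """
    if doubling_period_months <= 0:
        raise ValueError("doubling period must be positive")
    return 2.0 ** (months / doubling_period_months)


if __name__ == "__main__":
    # One, two, and three doubling periods.
    for months in (22, 44, 66):
        print(f"{months} months: {density_multiplier(months):.0f}x the transistors per mm^2")
```

Under this model, two process generations roughly quadruple the transistor budget available to the architect for the same die area.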
If the strategy is to reduce cost, the SoC’s unit volume has to compensate for the price decrease. To make a point, a 50% reduction in price has to be offset by a 100% increase in unit volume to maintain existing margin dollars. If not, each successive generation of chips will reduce revenue until it is no longer economically feasible to continue production.
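The price-versus-volume arithmetic can be sketched as follows. This is a minimal example, assuming the margin percentage stays constant so that margin dollars track revenue; the function name `required_volume_multiplier` is hypothetical.

```python
def required_volume_multiplier(price_cut: float) -> float:
    """Unit-volume multiplier needed to keep revenue flat after a price cut.

    `price_cut` is a fraction (0.5 means a 50% price reduction).
    Assumes margin percentage is unchanged, so flat revenue implies
    flat margin dollars (an illustrative simplification).
    """
    if not 0.0 <= price_cut < 1.0:
        raise ValueError("price_cut must be in [0, 1)")
    return 1.0 / (1.0 - price_cut)


if __name__ == "__main__":
    for cut in (0.25, 0.5, 0.75):
        multiplier = required_volume_multiplier(cut)
        increase_pct = (multiplier - 1.0) * 100.0
        print(f"{cut:.0%} price cut -> {multiplier:.2f}x volume (+{increase_pct:.0f}%)")
```

The relationship is nonlinear: a 50% cut needs twice the volume, while a 75% cut needs four times the volume, which is why cost reduction alone becomes harder to justify with each generation.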
A strategy of integrating external functionality onto an SoC becomes economically feasible when integration can offer the market the same functionality at the existing price or less for the capability being integrated. Take the example of integrating an image signal processor (ISP) from a digital camera chip into the SoC’s applications processor. This integration reduces the circuit-board footprint and bill-of-materials (BOM) cost. It transfers revenue from the ISP supplier to the application-processor SoC supplier.
The customer benefits from the cost and power savings achieved through integration as well as the advantages of improved quality inherent in reduced component count. The SoC supplier benefits from the margin dollars that would otherwise go to the external ISP chip supplier. Of course, each component supplier holds the same view in vying for functions to integrate.
Adding entirely new functionality to an SoC offers the greatest potential revenue, because the price is determined by what the market will bear rather than the competitive price pressure of a commoditized feature set. An example of unique functionality is the first smart-phone SoC to integrate a high-definition video codec to compress the video stream generated by the phone’s camera and to play back the various popular Internet video formats. Another example is the first SoC to integrate on-chip, multi-touch tactile input recognition capability. Suppliers of both of these SoCs could command a price that depended on what customers were willing to pay for the unique new features rather than competing based on tolerance for margin reduction.
As chip manufacturing produces new process generations every two years, marketing’s responsibility is to determine points on a roadmap where process technology and market needs intersect. At what process node does a new function become practical and cost little or nothing to add? Moreover, when can an external function, whether mainstream or optional at the time, be integrated into the mainstream next-generation SoC and sold at acceptable margins?
Take the example of a chip manufacturer producing two different mobile phone SoCs, each with a unique radio to serve two different service providers. Once process technology made the silicon area available, marketing made the decision to integrate the two radios onto one chip. Thus, it expanded the total addressable market while cutting the cost of serving these diverse markets and providing added revenue for the product line. The additional area needed for the second radio is offset by operational economies of scale, volume-driven improvement in quality, and the added end-user benefit of owning a multi-standard world phone.
Another example illustrates marketing opting to use a process technology advance to bring on chip the security or configuration data previously stored in off-chip EEPROM or flash. Integrating the EEPROM or flash itself is impractical, because it requires additional process steps. However, a one-time-programmable (OTP) anti-fuse memory implemented in a standard logic process can provide the function on chip.
The system manufacturer’s supply chain will see a decrease in BOM cost by eliminating the EEPROM (Fig. 1). However, the major advantage comes from differentiating the SoC from competitive offerings by hiding secure data on chip instead of off chip in tamper-prone EEPROM or flash.
For digital wallet applications, for instance, data such as account information and personal identification numbers must be tamper-proof. Access to online audio and video content requires secure keys to unlock digital rights management control. Cable and satellite set-top manufacturers were early adopters of tamper-resistant, anti-fuse, OTP nonvolatile memory (NVM) to store these keys.
An SoC that has to handle secure payloads may contain secure state machines, DMA and interconnect, and hardware cryptography accelerators such as those that support the Advanced Encryption Standard (AES).
Embedded anti-fuse OTP NVM is small and easy to integrate. It also can be manufactured using a standard CMOS process, so it is readily available even at the 28-nm process node. However, security is the most important benefit an embedded anti-fuse OTP brings to large SoCs targeting consumer electronics.
Unlike floating-gate and eFuse alternatives, anti-fuse OTP is very difficult to probe for content, whether using passive techniques or invasive scanning electron microscopy. For designs that previously used off-chip EEPROM, embedding anti-fuse OTP will save at least three I/O pads and add security, which is a significant user benefit.
Today’s large SoCs for mobile devices typically include a complex multicore CPU, an image signal processor, a graphics processor, and video processing for full-motion encode and decode, among other accelerators. Though computing resources are critical elements in any design, the chip architect has more flexibility in placing these elements because they need no direct connection to off-chip resources.
During floor planning, an SoC architect may use a placement tool such as Cadence’s Encounter Digital Implementation (EDI) System to enable automatic creation and implementation of multiple power domains to implement on-chip power management systems (Fig. 2). Block placement within each domain is connectivity aware, with well-connected blocks staying together. This reduces the net length and improves the efficiency of the final layout. The architect may refine the results further by providing the tool with additional constraints such as no overlaps, either within boundaries or with respect to net criticality.
The processing units are a critical element in the SoC floor plan because the area and power they consume directly affect the user experience: a well-balanced design yields a more responsive, more feature-rich, and more power-efficient device. When it comes to power, the architect has two components to consider: static and dynamic power consumption.
The foundry process determines static power consumption. All CMOS processes leak whether or not the circuit is operating, so long as the circuit is powered on. Lowering the supply voltage will reduce leakage, but lowering it too far affects functional behavior. Dividing the design into power islands and then turning off inactive islands will cut leakage in those islands to nearly zero but may incur some state-restoration latency when the circuit is used again.
Dynamic power is the power required to produce work, whereas static power is the cost of having the power on. We derive dynamic power consumption using a simple equation:
$$P_D = C V^2 f$$
where C is the capacitance at the node, V is the voltage at which the node switches, and f is the switching frequency. A common technique for optimizing dynamic power is to employ hardware accelerators to perform functions that would have otherwise been a software-intensive, power-consuming load on the CPU.
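To make the dynamic power equation concrete, the following sketch evaluates it for an illustrative operating point and shows the quadratic effect of voltage. The function name and the numeric values (1 nF of switched capacitance, 1 GHz) are assumptions chosen for round numbers, not measurements from any real SoC.

```python
def dynamic_power_watts(capacitance_farads: float,
                        voltage_volts: float,
                        frequency_hz: float) -> float:
    """Dynamic power P_D = C * V^2 * f, using the symbols defined in the text."""
    return capacitance_farads * voltage_volts ** 2 * frequency_hz


if __name__ == "__main__":
    # Hypothetical operating point: 1 nF switched capacitance at 1 GHz.
    c, f = 1e-9, 1e9
    nominal = dynamic_power_watts(c, 1.0, f)
    scaled = dynamic_power_watts(c, 0.9, f)
    print(f"Nominal (1.0 V): {nominal:.2f} W")
    print(f"Scaled  (0.9 V): {scaled:.2f} W ({scaled / nominal:.0%} of nominal)")
```

Because voltage enters as a square, a 10% voltage reduction trims dynamic power by about 19% at the same frequency, whereas a 10% frequency reduction trims it by only 10%.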
One commodity in abundance on next-generation SoCs is SRAM. It often feeds the on-chip processors’ enormous appetite for data bandwidth. SRAM is ideal in SoC designs because it is standard CMOS-compatible and requires no added steps during manufacturing. SRAM is an easy answer to the designer’s question of what to do with the enormous number of additional transistors that become available with each new process generation.
SRAMs provide caches for on-chip CPUs as well as scratch pads for holding data such as motion-estimation results for video codecs and still-image data. SRAMs offer a power-efficient alternative to operating out of external DRAM, especially during sleep mode, because external accesses to DRAM consume power in switching the I/O voltage. Architects might add more SRAM if it lends more capability or power efficiency to the design and there is silicon real estate to accommodate it.
Floor planning the CPU and associated SRAM is but one of the many tradeoffs the system architect makes. Another is budgeting the I/O pad count, regardless of whether the chip uses a wire-bond pad ring or flip-chip I/O. There is no value in using more pads than necessary, so architects strive to deliver the greatest function with the fewest pads. In DDR3 interfaces, architects will strive to use the narrowest configuration of the bus at the expense of pushing the physical-layer frequency to the maximum possible value and increasing the internal cache size.
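The bus-width-versus-frequency tradeoff can be sketched with a simple peak-bandwidth calculation. This is an idealized model, assuming peak bandwidth equals bus width in bytes times the transfer rate and ignoring protocol overhead; the function name and the example transfer rates are illustrative assumptions.

```python
def peak_bandwidth_mb_per_s(bus_width_bits: int, transfers_mt_per_s: float) -> float:
    """Idealized peak bandwidth in MB/s: (bus width in bytes) * (MT/s).

    Ignores refresh, bank conflicts, and other protocol overhead
    (simplification for illustration).
    """
    if bus_width_bits % 8 != 0:
        raise ValueError("bus width must be a whole number of bytes")
    return (bus_width_bits // 8) * transfers_mt_per_s


if __name__ == "__main__":
    wide = peak_bandwidth_mb_per_s(32, 800)      # wider bus, slower PHY
    narrow = peak_bandwidth_mb_per_s(16, 1600)   # half the data pads, twice the rate
    print(f"32-bit @ 800 MT/s:  {wide:.0f} MB/s")
    print(f"16-bit @ 1600 MT/s: {narrow:.0f} MB/s")
```

Halving the bus width halves the data pads, but matching the original peak bandwidth then requires doubling the transfer rate, which is the pressure on the physical layer that the text describes.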
To illustrate the tradeoff between consuming silicon real estate and adding pads, consider the simple case of eliminating the five I/O pads needed for an external EEPROM: one power, one ground, and three I/Os. At a 60-µm pitch, five extra pads lengthen one 10-mm side of a pad-limited 100-mm² die by approximately 0.3 mm. Thus, a 100-mm² die becomes 103 mm², a 3% increase in area and approximately a 6% yielded die-cost differential. By integrating the function on chip, the designer can achieve measurable savings in pad-limited designs. Just imagine the savings if the DDR3 DRAM interface can be reduced from 32 bits to 16 bits, a reduction of 30 or more pads.
The system architect must be able to easily add and edit I/O constraints to configure the sides and order of the pads as well as the layer and pitch for the pins. Place-and-route tools such as Mentor Graphics’ Olympus-SoC provide a port properties editor, which displays I/Os graphically and allows the architect to align and assign sides, layers, and pitches for the I/O pins (Fig. 3). Olympus-SoC can also infer, or derive, constraints based on an existing I/O pad (and partition pin) placement, providing a useful starting place for further constraint editing.
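The pad arithmetic in this example can be reproduced with a short sketch. It assumes a square, pad-limited die in which the extra pads stretch one side only; the function name and return layout are hypothetical, and the yielded-cost differential is not modeled because it depends on a yield model the text does not specify.

```python
def die_growth_from_pads(side_mm: float = 10.0,
                         extra_pads: int = 5,
                         pad_pitch_um: float = 60.0) -> tuple[float, float, float]:
    """Return (added edge length in mm, new area in mm^2, area increase in %).

    Assumes a square pad-limited die where the extra pads lengthen
    exactly one side (illustrative simplification).
    """
    added_length_mm = extra_pads * pad_pitch_um / 1000.0
    base_area = side_mm * side_mm
    new_area = (side_mm + added_length_mm) * side_mm
    increase_pct = (new_area - base_area) / base_area * 100.0
    return added_length_mm, new_area, increase_pct


if __name__ == "__main__":
    added, area, pct = die_growth_from_pads()
    print(f"Added edge length: {added:.1f} mm")
    print(f"New die area: {area:.0f} mm^2 (+{pct:.0f}%)")
```

Passing `extra_pads=30` to the same function gives a feel for the larger savings available from narrowing a DDR3 interface.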
As with the compute engines and memory, the SoC’s complement of peripherals is a tradeoff between marketing goals and engineering realities. Peripherals might include but are not limited to DDR3, Wi-Fi, MIPI, USB, HDMI, and PCI Express. The architect must weigh each function in terms of what user benefit the SoC will deliver given the silicon real estate and the I/O pads each function will consume.
One of the reasons that interfaces in general have been moving from parallel to serial is the need to reduce the number of I/O pads. DDR3 I/Os consume the greatest number of pads, followed by PCI Express and USB, and then HDMI and MIPI. These I/Os constrain the floor plan since they must be optimally located to ensure the highest signal integrity.
By the time the floor plan is completed, the system architect will have made a large number of tradeoffs, each driven by the imperative to deliver the greatest amount of functionality in the smallest silicon area while adding the fewest number of I/O pads. Furthermore, a design that fails to deliver a competitive feature set or does not deliver a competitive battery life could have severe consequences for a company targeting the highest-volume consumer applications.
Each new logic CMOS process generation will deliver twice as many transistors as the previous generation. The task for marketing and engineering will be to produce SoCs that provide entirely new functionality to address a new market, integrate existing functions to grab a larger share of an existing market, or slash die cost in the hope of capturing low-end market share, whether by displacing competitors or by attracting new customers with a lower price.
In developing this next-generation chip, the system architect must decide which IP blocks to add to enhance the user experience while maintaining strict cost and power budgets: compute engines, accelerators, the latest-generation peripherals, and/or more memory.