Multi-chip modules (MCMs) are electronic packages that contain multiple integrated circuits (ICs), semiconductor dies, and other discrete components on a single substrate (Fig. 1). These single-component, multifunction packages offer several advantages, including simpler installation and a smaller footprint on the printed-circuit board (PCB).
MCM packaging is used inside many higher-performance server systems. For example, the Hitachi MP6000 has 20 chips with some chips dissipating nearly 600 W for a total maximum power dissipation of 6.5 kW or 100 W/cm² (Fig. 2).
In the IBM System z9, the MCM is considered the heart of the server. It includes all the processor chips and L2 cache memory. It also features a 104-layer glass ceramic carrier bearing a total of 16 chips including single-core and dual-core processor chips. Eight chips in the MCM dissipate 640 W, and the total power dissipation is nearly 1 kW.
As can be expected, grouping multiple high-power chips in a single MCM poses considerable thermal and mechanical challenges. These include achieving and maintaining the thermal gaps given the close proximity, non-coplanarity, and tilt of the multiple chips, as well as chip and capacitor rework. They also include sealing the MCM to prevent dry-out of the thermal paste and corrosion of the controlled collapse chip connections (C4s) or flip chips, and maintaining the package’s mechanical integrity during assembly and over its operating life.
Three thermal management techniques are generally used to cool MCMs: a thermal conduction module with direct solder attach cooling (DiSAC), a dual-layer thermal interface material (TIM) design, and an MCM design with small gap technology (SGT) and a hermetic seal.
DiSAC Thermal Conduction Module
In the thermal conduction module, each chip is in physical contact with a water-cooled jacket, to which it is attached directly by solder (the DiSAC method). In earlier generations, the chips were mechanically separated from the heatsink by thermal grease and micro fins to reduce the stress on the C4s.
The DiSAC design causes nearly all the load induced by module distortion to be supported by the C4. Because all of the components are connected to solid materials with different thermal expansion coefficients, there is an increase in strain. As a result, C4 lifespan is estimated to be quite short.
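The link between mismatched expansion and C4 strain can be illustrated with the common first-order "distance to neutral point" estimate, in which the shear strain in a joint scales with the CTE mismatch, the temperature swing, and the joint's distance from the chip center, and inversely with the joint height. The sketch below is only a rough illustration under that simplified model, not the analysis used for the actual module. The function name and every input value are hypothetical.

```python
def c4_shear_strain(dnp_mm, cte_a_ppm, cte_b_ppm, delta_t_k, joint_height_mm):
    """First-order shear strain in a solder joint caused by CTE mismatch.

    dnp_mm          -- distance from the joint to the neutral point (chip center)
    cte_a_ppm/b_ppm -- expansion coefficients of the two joined bodies, in ppm/K
    delta_t_k       -- temperature swing, in kelvin
    joint_height_mm -- standoff height of the solder joint
    """
    mismatch_per_k = abs(cte_a_ppm - cte_b_ppm) * 1e-6
    return dnp_mm * mismatch_per_k * delta_t_k / joint_height_mm


# Illustrative inputs only (assumptions, not values from the article):
# an outer joint 10 mm from the chip center, a 60 K swing, a 0.1 mm joint.
well_matched = c4_shear_strain(10.0, 2.6, 3.0, 60.0, 0.1)
poorly_matched = c4_shear_strain(10.0, 2.6, 6.6, 60.0, 0.1)

print(f"well matched:   {well_matched:.4f}")
print(f"poorly matched: {poorly_matched:.4f}")
```

Because the estimate is linear in each input, a tenfold larger CTE mismatch gives a tenfold larger strain, and the joints farthest from the chip center see the most strain.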
With the help of finite element analysis and experimental work, the MCM structure can be adjusted to reduce strain, and the assembly process can be improved to reduce defects. The changes include reducing the contact area of the solder attachment, which makes the temperature on the chip more uniform, and adding C4 connections, which increases the footprint.
Besides the additional C4 connections, the outer C4 connections are reinforced. Also, the micro carrier (MCC) was changed from glass/copper to a tungsten structure, and the height of the MCC was increased (Fig. 3).
Dual-Layer TIM Thermal Design
The dual-layer TIM design is physically similar to the thermal conduction module, but does not use solder between the top of the chip and the jacket. Instead, the dual-layer design uses two different interface materials in combination with a heat spreader material.
The MCM design discussed here uses four high-power chips, each with two processors and integrated L2 cache. This results in a highly non-uniform power distribution, with regions exceeding 100 W/cm². To address the high power density, each chip uses an individual heat spreader bonded with adhesive. The thermal resistance of the bond line is minimized by using a very thin layer of thermally conductive adhesive, with an effective conductivity of 1.23 W/m·K.
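To see why the bond line must be very thin, consider one-dimensional conduction through the adhesive: the temperature drop equals the heat flux times the thickness divided by the conductivity. The following minimal sketch uses the 1.23 W/m·K conductivity and the 100 W/cm² flux quoted above. The bond-line thicknesses and the function name are hypothetical, chosen only to show the trend.

```python
def bond_line_delta_t(thickness_um, conductivity_w_mk, heat_flux_w_cm2):
    """Temperature drop (K) across a uniform layer under 1-D conduction."""
    thickness_m = thickness_um * 1e-6      # micrometers -> meters
    flux_w_m2 = heat_flux_w_cm2 * 1e4      # W/cm^2 -> W/m^2
    return flux_w_m2 * thickness_m / conductivity_w_mk


K_ADHESIVE = 1.23   # W/m.K, effective conductivity from the text
HOT_SPOT_FLUX = 100 # W/cm^2, peak power density from the text

# Assumed bond-line thicknesses, for illustration only.
for thickness_um in (5, 10, 25, 50):
    drop = bond_line_delta_t(thickness_um, K_ADHESIVE, HOT_SPOT_FLUX)
    print(f"{thickness_um:>3} um bond line -> {drop:5.1f} K drop")
```

Under these assumptions, a 10-µm bond line costs roughly 8 K at the hot spot, and the penalty grows in direct proportion to thickness.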
A silicon-carbide (SiC) heat spreader is the best choice, as it has a coefficient of thermal expansion (CTE) similar to that of the chips and a thermal conductivity of 275 W/m·K. Matching the CTEs prevents thermal stress problems when the module heats up. The heat spreader is optimized through both its thickness and the chip spacing. Each heat spreader is individually attached to maintain thin bond lines. The heat spreader is coupled to the copper hat via an adhesive thermal compound (Fig. 4).
Unlike solutions that use only an adhesive thermal compound (ATC), combining the adhesive thermal interface, the SiC heat spreader, and the ATC significantly reduces the thermal resistance of the ATC layer, because the heat is distributed over the spreader. The heat spreaders have more than twice the area of the chips. The combined thermal resistance of the composite structure is less than that of the ATC alone.
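A simple series-resistance comparison suggests how the composite stack can beat a single ATC layer. The sketch below treats each layer as a one-dimensional slab, ignores spreading resistance inside the SiC, and assumes the heat is spread evenly over the spreader before it crosses the ATC. The conductivities of the adhesive and SiC and the 2x area ratio come from the text. The chip area, layer thicknesses, and ATC conductivity are assumptions made only for illustration.

```python
def slab_resistance(thickness_m, conductivity_w_mk, area_m2):
    """1-D conduction resistance (K/W) of a uniform slab."""
    return thickness_m / (conductivity_w_mk * area_m2)


CHIP_AREA = 2e-4                # m^2 (2 cm^2), assumed
SPREADER_AREA = 2 * CHIP_AREA   # text: spreader has more than twice the chip area
K_ADHESIVE = 1.23               # W/m.K, from the text
K_SIC = 275.0                   # W/m.K, from the text
K_ATC = 3.0                     # W/m.K, assumed
T_ATC = 100e-6                  # m, assumed ATC layer thickness
T_SIC = 2e-3                    # m, assumed spreader thickness


def atc_only():
    """ATC layer sitting directly on the chip."""
    return slab_resistance(T_ATC, K_ATC, CHIP_AREA)


def composite(adhesive_thickness_m):
    """Adhesive bond line + SiC spreader + ATC over the larger spreader area."""
    return (slab_resistance(adhesive_thickness_m, K_ADHESIVE, CHIP_AREA)
            + slab_resistance(T_SIC, K_SIC, CHIP_AREA)   # conservative: chip area
            + slab_resistance(T_ATC, K_ATC, SPREADER_AREA))


print(f"ATC only:                 {atc_only():.3f} K/W")
print(f"composite, 5 um adhesive:  {composite(5e-6):.3f} K/W")
print(f"composite, 20 um adhesive: {composite(20e-6):.3f} K/W")
```

With these assumed values, the composite stack has the lower resistance when the adhesive bond line is 5 µm thick but loses its advantage at 20 µm, which is consistent with the emphasis on keeping the bond lines thin.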
MCM + SGT Techniques With Hermetic Seal
The last MCM design discussed here is cooled using a thermal paste in combination with SGT and a hermetic seal (Fig. 5). The non-silicone, oil-based thermal paste minimizes contamination concerns during chip rework.
The SGT design uses soldered pistons in the copper hat. The pistons are located over the higher-power chips. The paste gap between the chip and piston can be individually customized to a required level by reflowing the pistons during the assembly. The high thermal conductivity of the piston and cap allows effective spreading of the heat before it is conducted to a modular refrigeration unit (MRU).
After the pistons have been reflowed, the parts are removed and the effective ATC gap is measured to verify that the hat meets the required specifications. Thereafter, the hat is machined before the MRU is attached. The MRU uses a thin layer of oil as an interface material.
The MCM approach requires that chips can be reworked and replaced if they’re found to be electrically defective. Therefore, the chips are not underfilled. Even without any underfill, the well-matched coefficients of expansion of the glass ceramic substrate and the silicon chip allow for the required fatigue life of the C4 connections. Without underfill, however, the C4s are exposed to the ambient environment, and corrosion can occur. An additional concern is the drying and associated performance loss of the ATC when it’s exposed to the ambient environment.
To mitigate both the C4 corrosion and paste drying concerns, a hermetic seal is achieved with a ring of C-shaped cross-section (C-ring) inserted between the substrate and hat. A thin polymer cushion that couples the carrier to the steel base plate supports the C-ring force.
Complex design, encapsulation, and some measurement techniques are required for cooling MCMs that dissipate significant power levels. An MCM’s reliability depends not only on the effective junction temperature of the individual chips, but also on the mechanical and thermally induced mechanical strain. Even with sealing, thermal paste TIMs remain susceptible to degradation over the life of a product. The mechanism of thermal degradation is the apparent separation of the oil from the filler matrix.