Remember when thermal analysis meant getting your prototype back and deciding if you might need to throw in a couple of heatsinks and a fan for good measure? Try that approach now and you may find yourself up a creek without a paddle. After all, heat can hamper electrical performance and ultimately reduce mean time between failures.
Back in my engineering heyday, I never put much thought into thermal analysis because it just wasn’t necessary, and I know I’m not alone. But with semiconductors dissipating more power (and therefore heat) per unit area than ever, coupled with continued system shrinkage over time, more system engineers who don’t perform thermal analysis are winding up in hot water.
“A lot of functions that used to be spread across several components are now contained in a single component,” says Dave Rosato, lead product manager for Ansys. So now, the heat density is much greater for those SoC-type (system-on-a-chip) components.
“The rules of thumb that engineers used to design a board five and 10 years ago just don’t apply to today’s designs,” continues Rosato. “Years ago, the board was ignored as a heat transfer path. Now you must account for all heat transfer paths.”
The “simple solution” is to perform thermal analysis sooner in the design cycle. How soon? At the least, you should perform a rudimentary analysis just after the block diagram stage. You’ll need to download the datasheets for the components you plan to use and get a feel for future challenges from a thermal standpoint.
If that analysis points to potential trouble, you need to consider using some thermal-analysis simulation software and possibly even working with a materials company to determine if it can engineer something that will suit your design parameters.
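As a rough illustration of that first-pass check, here is a minimal Python sketch. It assumes each datasheet gives a junction-to-ambient thermal resistance (theta_JA) and a maximum junction temperature, and it applies the standard steady-state estimate T_J = T_A + P × theta_JA. The part names, power figures, and the 50°C ambient are hypothetical placeholders, not values from any real design, and a single-resistance estimate like this is only a screening tool.

```python
# Rough first-pass thermal check using datasheet numbers.
# All part names and values below are hypothetical placeholders.

def junction_temp_c(ambient_c, power_w, theta_ja_c_per_w):
    """Steady-state estimate: T_J = T_A + P * theta_JA."""
    return ambient_c + power_w * theta_ja_c_per_w


def thermal_margin_c(tj_max_c, ambient_c, power_w, theta_ja_c_per_w):
    """Headroom between the datasheet T_J limit and the estimate."""
    return tj_max_c - junction_temp_c(ambient_c, power_w, theta_ja_c_per_w)


# (name, power in W, theta_JA in degC/W, max T_J in degC), all hypothetical
PARTS = [
    ("regulator", 1.5, 40.0, 125.0),
    ("soc", 3.0, 25.0, 105.0),
]

AMBIENT_C = 50.0  # assumed worst-case air temperature inside the enclosure

for name, power_w, theta_ja, tj_max in PARTS:
    margin = thermal_margin_c(tj_max, AMBIENT_C, power_w, theta_ja)
    status = "ok" if margin > 0 else "needs attention"
    print(f"{name}: margin {margin:.1f} degC ({status})")
```

A part that shows little or negative margin here is a candidate for the closer look with simulation software described above.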
“DANGER, WILL ROBINSON!”
I own a laptop that recently stopped working because the fan integrated with the heatsink/heatpipe combination no longer gets powered correctly. Even with the case open and plenty of cool air all around, the unit won’t power up and the “Fan error” message appears before it even performs the typical power-on self-test (POST).
The laptop shuts down immediately when it senses the fan isn’t powered on. The assumption is that the average laptop user won’t pop the case open in a nice air-conditioned room, and thus, without the fan, the CPU would experience the often fatal “thermal runaway.” The downside to this approach is that my entire system is shot because the fan (or the underlying power source to the fan) isn’t working.
This is a good example of a laptop manufacturer deciding that under no circumstances is the CPU to ever run without forced air blowing on the attached heatsink. This design was engineered with these requirements because the laptop designers knew that improper thermal management meant imminent doom. In fact, Intel and AMD take this problem very seriously.
For example, “If the external thermal sensor detects a catastrophic processor temperature of 125°C (maximum), or if the THERMTRIP# signal is asserted, the VCC supply to the processor must be turned off within 500 ms to prevent permanent silicon damage due to thermal runaway of the processor,” says the January 2008 edition of the datasheet for Intel’s Core 2 Duo Processor.
“Maintaining the proper thermal environment is key to reliable, long-term system operation. A complete thermal solution includes both component- and system-level thermal management features,” according to the datasheet.
“To allow for the optimal operation and long-term reliability of Intel processor-based systems, the system/processor thermal solution should be designed so the processor remains within the minimum and maximum junction temperature (TJ) specifications and the corresponding thermal design power (TDP) value,” it notes.
“Caution: operating the processor outside these operating limits may result in permanent damage to the processor and potentially other components in the system,” the datasheet concludes.
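The shutdown rule quoted from the datasheet can be expressed as a small decision function. The Python sketch below is only an illustration of that rule, not Intel’s implementation: the function names are hypothetical, the active-low THERMTRIP# signal is reduced to a simple boolean, and treating a reading of exactly 125°C as a trip is an assumption.

```python
# Illustrative model of the quoted shutdown rule (not vendor code).
CATASTROPHIC_TEMP_C = 125.0   # from the quoted datasheet text
MAX_SHUTDOWN_DELAY_MS = 500   # from the quoted datasheet text


def must_cut_vcc(sensor_temp_c, thermtrip_asserted):
    """True if either trip condition in the quoted rule is met.

    thermtrip_asserted is a boolean stand-in for the THERMTRIP# signal.
    Tripping at exactly 125 degC is an assumption of this sketch.
    """
    return thermtrip_asserted or sensor_temp_c >= CATASTROPHIC_TEMP_C


def shutdown_in_time(detect_ms, vcc_off_ms):
    """True if VCC was removed within the 500 ms window after detection."""
    return (vcc_off_ms - detect_ms) <= MAX_SHUTDOWN_DELAY_MS


# Hypothetical event: trip detected at t = 1000 ms, VCC off at t = 1350 ms.
if must_cut_vcc(sensor_temp_c=127.0, thermtrip_asserted=False):
    print("cut VCC; in time:", shutdown_in_time(1000, 1350))
```

In a real system this decision is made in hardware or platform firmware; the sketch only shows that either condition alone is enough to force the shutdown.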