Part 2: Heatsink Design

Specification

The methods and code from Part 1 were used to find the maximum steady-state temperature of a CPU with various heatsink designs. The (2cm)² CPU outputs 50W. The aluminium heatsink was constrained to a (4cm)³ block above it. A fan was assumed to keep all air 0.1mm from the heatsink at 25°C. Heat transfer was modelled as conduction only, using the thermal conductivities of aluminium and air. A standard design specification was used, defined by a 2D cross-section. Each line of the specification file gave a solid rectangle (in the format x1 y1 x2 y2, the bottom-left and top-right corners). The other dimension was constant (extruded). Fins and gaps were allowed a minimum width of 2mm.
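The specification format above can be illustrated with a minimal Python sketch. It is not the author's Fortran reader: the function names `parse_spec` and `rasterise`, the choice of millimetres as the unit, and the three-block example are all assumptions made for illustration. It reads rectangles in the x1 y1 x2 y2 format and marks which cells of a square cross-section grid are solid.

```python
def parse_spec(text):
    """Parse lines of 'x1 y1 x2 y2' into a list of rectangle tuples (floats)."""
    rects = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        x1, y1, x2, y2 = map(float, fields[:4])
        rects.append((x1, y1, x2, y2))
    return rects


def rasterise(rects, n, size):
    """Return an n-by-n grid of booleans, True where a cell centre is solid.

    size is the side length of the square cross-section, in the same
    units as the rectangle coordinates (assumed here to be mm).
    """
    h = size / n  # cell width
    mask = [[False] * n for _ in range(n)]
    for j in range(n):
        yc = (j + 0.5) * h
        for i in range(n):
            xc = (i + 0.5) * h
            mask[j][i] = any(x1 <= xc <= x2 and y1 <= yc <= y2
                             for x1, y1, x2, y2 in rects)
    return mask


# Hypothetical three-block design in mm (not the supplied sample file).
example = """\
0 0 40 10
0 10 10 40
30 10 40 40
"""
rects = parse_spec(example)
mask = rasterise(rects, n=40, size=40.0)
solid_cells = sum(row.count(True) for row in mask)
```

Testing cell centres rather than cell corners avoids ambiguity where a rectangle edge coincides with a grid line.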

The supplied sample design was defined by just three blocks, as in the following image (but solid of course, not hollow).

Designs

The designs are introduced here along with visualisations and a summary of their results. Detailed results are given in the Results section.

The original design had a maximum temperature of 68.1°C. It also had a fairly small temperature range across its surface at steady state; see the cross section through its midpoint:

The next design tried had the same 10mm base, but with 5 fins of width 4mm, and 4mm gaps. The following slice plot shows that it had a much wider range of temperature than the original blocked design, and the maximum temperature (above the CPU) was much lower at 43.4°C.

It is not obvious whether reducing the size of the heatsink base would be beneficial. It would increase the surface area, but could perhaps reduce the capacity for heat flow to the outer fins. This was tested by reducing the base size of the last design from 10mm to 4mm. The maximum temperature was only 0.2°C lower than the previous case.

Taking the finned approach to its extreme, a design with 2mm fins, 2mm gaps and a 2mm base was tested. Again the CPU temperature was much reduced at 39.2°C, as we would expect from the approximate doubling of surface area.

The isosurface representing a temperature of 32°C was visualised as follows. It has a similar profile to the natural uniform case of Part 1:

Observe (in the last figure) that heat is not distributed laterally (across fins) as well as it is along the fins. The outer fins were practically irrelevant. On that basis a new design with horizontal "fins" as well as vertical ones (a grid cell structure) was tested. The maximum temperature found was indeed lower, at 37.4°C. In the following slice plot the temperature can be seen to be reasonably well distributed laterally.

An isosurface of 32°C was taken, equivalent to the one for the previous design. Clearly the temperature profile is now closer to the source and is less constrained to the fins.

For a more detailed look, the isosurface of 34°C was also taken, this being in a much smaller region. It shows the lateral connectivity between fins (only one rung up in this case). Note that the aspect ratio of the figure is misleading -- the base of the surface is roughly square over the CPU.

An obvious problem with the cell design is that it would be difficult for a fan to keep all the air in the "tubes" at a constant temperature, an assumption which is much more likely to be valid in the finned case. Here the simplifying assumptions start to break down and we would have to move to a better model to evaluate more efficient designs.

Results

The results for the original design on an 80³ grid matched those of another student to within 1°C, the difference being consistent with doing more iterations.

The 'residual' here represents the relaxation step size when the simulation ended. It is the sum of the absolute values of the grid cell differences, so it clearly depends on grid size. It may be that there is very little finite-scaling error between grids of 20³, 40³ and 80³, since the blocked structure is represented fully by the coarse finite difference. Rather, the differences in Table 1 may be due to premature stopping on the coarser grids.
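To make the grid-size dependence concrete, here is a small Python sketch of a residual of this kind. It assumes that the "grid cell differences" are the changes in each cell between two successive iterates. The function name `residual` and the uniform test grids are illustrative only and are not taken from the author's code.

```python
def residual(old, new):
    """Sum of absolute per-cell differences between two 3D grids (nested lists)."""
    total = 0.0
    for plane_old, plane_new in zip(old, new):
        for row_old, row_new in zip(plane_old, plane_new):
            for a, b in zip(row_old, row_new):
                total += abs(b - a)
    return total


def uniform_grid(n, value):
    """Build an n x n x n grid filled with a single value."""
    return [[[value] * n for _ in range(n)] for _ in range(n)]


# The same per-cell change on a grid with twice the resolution
# gives a residual about eight times larger, because there are 2^3 times
# as many cells contributing to the sum.
change = 1.0e-6
r20 = residual(uniform_grid(20, 25.0), uniform_grid(20, 25.0 + change))
r40 = residual(uniform_grid(40, 25.0), uniform_grid(40, 25.0 + change))
ratio = r40 / r20
```

With a summed residual, a fixed threshold is therefore a stricter per-cell criterion on a finer grid than on a coarser one.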

Table 1. Heatsink Simulation Results.
design                   grid size   max temperature (°C)   residual
10mm blocks (original)   40³         67.20                  1.0×10⁻⁴
                         80³         68.10                  2.6×10⁻⁴
4mm fins, 10mm base      40³         42.91                  1.0×10⁻⁴
                         80³         43.42                  2.2×10⁻³
4mm fins, 4mm base       40³         42.81                  1.0×10⁻⁴
                         80³         43.20                  1.8×10⁻⁴
2mm fins, 2mm base       40³         39.07                  1.0×10⁻⁴
                         80³         39.24                  1.0×10⁻⁴
2mm cells                40³         37.34                  1.0×10⁻⁴
                         80³         37.43                  1.0×10⁻⁴

Simulations were run until the residual (step size) reached 1.0×10⁻⁴, or until an iteration limit was reached with the residual reasonably close to that level. The time for a single iteration depended on the grid size and on the volume of aluminium in the design. The number of iterations to convergence seemed to depend largely on the volume or thickness of aluminium. The 80³ grids below were actually run on 4 CPUs in parallel, because of technical problems with running on a single CPU for longer than 20 minutes.
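The stopping rule can be sketched in Python as follows. This is a deliberately simplified stand-in for the real solver: it is 2D rather than 3D, it uses a uniform material with fixed boundary values, and the relaxation factor `omega=1.5` and the function name `relax` are assumptions. Only the structure of the loop (red-black sweeps, a summed step size, and a tolerance or iteration limit) reflects the description above.

```python
def relax(grid, omega=1.5, tol=1.0e-4, max_iter=100000):
    """Red-black SOR on the interior of a square 2D grid with fixed boundaries.

    Returns (iterations, residual), where residual is the sum of the
    absolute changes made to all cells during the final iteration.
    """
    n = len(grid)
    res = float("inf")
    for it in range(1, max_iter + 1):
        res = 0.0
        for colour in (0, 1):  # update "red" cells, then "black" cells
            for j in range(1, n - 1):
                for i in range(1, n - 1):
                    if (i + j) % 2 != colour:
                        continue
                    avg = 0.25 * (grid[j][i - 1] + grid[j][i + 1]
                                  + grid[j - 1][i] + grid[j + 1][i])
                    step = omega * (avg - grid[j][i])
                    grid[j][i] += step
                    res += abs(step)
        if res <= tol:
            return it, res  # converged to the requested tolerance
    return max_iter, res  # stopped at the iteration limit instead


# Illustrative problem: 25 degrees everywhere except one edge held at 50.
n = 10
grid = [[25.0] * n for _ in range(n)]
grid[0] = [50.0] * n
iterations, res = relax(grid)
```

Returning the final residual along with the iteration count makes it possible to report, as the tables here do, how close a run was to the tolerance when it hit the iteration limit.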

Table 2. Heatsink Simulation Timing.
design                   grid size   residual    iterations   CPU time (mm:ss)
10mm blocks (original)   40³         1.0×10⁻⁴    29384        01:11
                         80³         2.6×10⁻⁴    100000       36:55
4mm fins, 10mm base      40³         1.0×10⁻⁴    18232        01:00
                         80³         2.2×10⁻³    50000        26:47
4mm fins, 4mm base       40³         1.0×10⁻⁴    15726        00:47
                         80³         1.8×10⁻⁴    50000        24:00
2mm fins, 2mm base       40³         1.0×10⁻⁴    10981        00:32
                         80³         1.0×10⁻⁴    32842        17:52
2mm cells                40³         1.0×10⁻⁴    12557        00:51
                         80³         1.0×10⁻⁴    39371        27:13

Parallelisation

The same red-black updating method was used as in Part 1. The grid was divided among the CPUs along the uniform (extruded) dimension. Each CPU wrote its output to a separate file, and the files were then reassembled in Matlab. The maximum temperature, residual and visualisations were compared with the serial case to ensure consistency.
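One way to divide a grid along a single dimension is sketched below in Python. The helper names `slab_bounds` and `reassemble` are hypothetical, and the sketch says nothing about how the author's Fortran code exchanges boundary planes between neighbouring CPUs. It only shows the bookkeeping for splitting planes into contiguous slabs and joining the per-CPU outputs again in rank order.

```python
def slab_bounds(n_planes, n_cpus):
    """Split planes 0..n_planes-1 into contiguous slabs, one per CPU.

    Returns a list of half-open (start, stop) ranges whose sizes
    differ by at most one plane.
    """
    base, extra = divmod(n_planes, n_cpus)
    bounds = []
    start = 0
    for rank in range(n_cpus):
        stop = start + base + (1 if rank < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def reassemble(parts):
    """Concatenate per-CPU lists of planes into one list, in rank order."""
    whole = []
    for part in parts:
        whole.extend(part)
    return whole


# An 80-plane grid shared between 4 CPUs: 20 planes each.
planes = list(range(80))  # stand-ins for 2D planes of temperatures
bounds = slab_bounds(len(planes), 4)
parts = [planes[start:stop] for start, stop in bounds]
rebuilt = reassemble(parts)
```

In a decomposition of this kind each slab generally also needs a copy of its neighbours' adjacent planes, refreshed after each half-sweep, so that the red and black updates see current values across slab boundaries.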

As can be seen from Table 3, the speedup is better than ideal (superlinear): 6.38 on 4 CPUs and 11.64 on 8. This is probably due to better cache use on each processor with the smaller arrays. The serial time here was extrapolated from a shorter run of 10000 iterations.

Table 3. Parallel Efficiency for 2mm finned design, 80³ grid.
CPUs   CPU time (mm:ss)   elapsed time (mm:ss)   speedup
1      45:13              45:13
4      27:13              07:05                  6.38
8      30:00              03:53                  11.64
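The speedup column can be checked from the elapsed times with a few lines of Python. Speedup is taken here as serial elapsed time divided by parallel elapsed time, and parallel efficiency as speedup divided by the number of CPUs. The helper names are illustrative.

```python
def to_seconds(mmss):
    """Convert an 'mm:ss' string to a number of seconds."""
    minutes, seconds = mmss.split(":")
    return 60 * int(minutes) + int(seconds)


def speedup(serial, elapsed):
    """Ratio of serial elapsed time to parallel elapsed time."""
    return to_seconds(serial) / to_seconds(elapsed)


serial = "45:13"                    # extrapolated single-CPU time from Table 3
runs = {4: "07:05", 8: "03:53"}     # CPUs -> elapsed time from Table 3

for cpus, elapsed in runs.items():
    s = speedup(serial, elapsed)
    efficiency = s / cpus           # values above 1 indicate superlinear speedup
    print(cpus, round(s, 2), round(efficiency, 2))
```

Both efficiencies come out above 1, which is what "better than ideal" means in this context.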

Code

heatsink.f90      Main program.
sor_iterate.f90   Code for doing 3D SOR in serial and parallel.
globals.f90       Configuration, constants, variable declarations.
infin.f90         Initial conditions, output.



Felix Andrews