
December 06, 2006

Multi core or multi-processor?

I was just reading about Intel's new quad-core server processor and AMD's new 'quad core' CPU. The two companies take very different approaches, and I think there is room for both going forward.

Intel approach

Intel puts 4 cores in a single package. These cores share cache and a single bus interface to memory. This type of architecture is taken to an extreme with Sun's Niagara chips, which have 8 cores. We can expect processor makers to push the number of cores higher, but the design trade-offs will be lower clock speeds, a cache shared among the cores, and one bus interface carrying all traffic between those cores and memory.

AMD approach

AMD just shipped a dual-core CPU that is used in pairs, giving 4 cores in total. The advantage is that each pair of cores has its own cache, its own bus interface and its own memory. This looks like, and is, a NUMA (non-uniform memory access) design, and operating systems and applications may need to be tuned to run well on it. The benefit is two pipes rather than one between the CPU complex and memory, which can double aggregate memory throughput. The downside is that a CPU must cross the HyperTransport link between the two CPUs to reach memory managed by its twin, so remote memory is slower than local memory. Hence non-uniform.
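To make the "non-uniform" part concrete, here is a rough sketch of how the average memory latency depends on how often a core has to reach across to its twin's memory. The function name and the 60 ns / 100 ns figures are my own illustrative assumptions, not measurements of any AMD or Intel part.

```python
def average_latency_ns(local_ns, remote_ns, local_fraction):
    """Weighted average memory latency on a two-node NUMA system.

    local_ns:       latency when the memory belongs to the same CPU (assumed value)
    remote_ns:      latency when the access crosses the inter-CPU link (assumed value)
    local_fraction: share of accesses that hit local memory, between 0.0 and 1.0
    """
    if not 0.0 <= local_fraction <= 1.0:
        raise ValueError("local_fraction must be between 0.0 and 1.0")
    return local_fraction * local_ns + (1.0 - local_fraction) * remote_ns


if __name__ == "__main__":
    # Hypothetical latencies, chosen only to show the shape of the trade-off.
    for fraction in (1.0, 0.75, 0.5, 0.0):
        print(fraction, average_latency_ns(60.0, 100.0, fraction))
```

The closer local_fraction gets to 1.0, the more the machine behaves like two fast independent systems, which is why memory placement matters so much on this kind of design.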

Summary

Both designs have merit, and each will outperform the other in different types of benchmark. Intel's approach works better today in that current software runs well on a uniform architecture, but it will run out of oomph due to less cache per core and limited bus bandwidth. The AMD design requires NUMA support from the operating system and possibly from the applications too. The benefit is more cache per core as well as double the bandwidth to memory. Which is better? It depends. I hope we see both. I can see the NUMA approach offering higher performance, but only with the right kind of application and operating system, where NUMA is addressed from an application design perspective. That may be necessary for maximum performance. Operating systems can get much of the same effect by assigning each process to the memory of one processor and then scheduling that process's threads only on that processor.
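An application can also ask for this kind of placement itself. The sketch below restricts the current process to a chosen set of CPUs using Python's os.sched_setaffinity, which is only available on Linux and some other Unix systems. The helper name pin_to_cpus is hypothetical, and which CPU ids sit on which socket is something you would have to look up for your own machine. It only controls where threads run; it assumes the operating system then places the process's memory near those CPUs, which is the usual default on Linux but is not guaranteed.

```python
import os


def pin_to_cpus(cpus):
    """Restrict the calling process to the given CPU ids.

    Returns the set of CPUs the process may run on afterwards.
    Only CPUs the process is already allowed to use are considered.
    """
    if not hasattr(os, "sched_setaffinity"):
        raise NotImplementedError("CPU affinity is not exposed on this platform")
    allowed = os.sched_getaffinity(0)      # 0 means the calling process
    target = set(cpus) & allowed
    if not target:
        raise ValueError("none of the requested CPUs are available to this process")
    os.sched_setaffinity(0, target)
    return os.sched_getaffinity(0)


if __name__ == "__main__":
    # Assumption for illustration: CPUs 0 and 1 are the two cores of one socket.
    print(pin_to_cpus({0, 1}))
```

Threads started after the call inherit the same restriction, so a multi-threaded process stays on the one processor, which is the scheduling behaviour described above.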

Interesting times...


Comments

AMD chips also have a built-in memory controller, whereas Intel goes via the chipset to access memory.

AMD do have a real latency advantage, but Intel have mitigated this with a larger 4 MB cache and improved prefetch in their latest core.

Posted by: jonathan | Dec 7, 2006 10:05:29 AM
