|by MAC | November 3, 2008|
Microarchitecture Dissected #1
In 2007, Intel unveiled the Tick-Tock Model as a demonstration of the company's dedication to continued rapid technological innovation. The "tick" is a shrink of the previous architecture's manufacturing process (65nm --> 45nm --> 32nm) and the "tock" is a new architecture. Since Penryn was a shrink and slight improvement of the preceding Core architecture, it was time for a brand new architecture, and that is where Nehalem comes in. On a side note, the code-name Nehalem first appeared on Intel's long-term roadmap back in 2002, and back then it was claimed to be a future version of the NetBurst architecture used in the Pentium 4. We can all breathe a sigh of relief that Intel canned that idea long ago.
Yorkfield quad-core die on the left, new Nehalem quad-core die on the right.
On the left we have the current Intel "Yorkfield" quad-core die, which is effectively two dual-core dies mounted in one CPU package. Bloomfield-Nehalem, on the other hand, is a native quad-core design. We say 'Bloomfield-Nehalem' because the Nehalem architecture was designed to be dynamically scalable and there will be native hexa-core (Westmere) and octo-core (Beckton) models in the future.
Despite the huge and easily visible 12MB L2 cache, the Yorkfield die measures a relatively small 214 mm². In comparison, Nehalem comes in at 263 mm², a 23% size increase. Although Nehalem has a larger die, it actually has fewer transistors than a quad-core Yorkfield (731M vs. 820M). This is despite the fact that both are manufactured using the same 45nm High-K + metal gate transistor technology. The reason for this 'anomaly' is that Yorkfield has a lot of extremely transistor-dense L2 cache, while Nehalem has less cache but more components on the die (integrated memory controller, QPI link, etc.).
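For readers who want to check the arithmetic, here is a quick sketch in Python that uses only the die sizes and transistor counts quoted above. The dictionary and function names are our own, made up for illustration, and the density figure is a simple average over the whole die, so it says nothing about how densely any individual block is packed.

```python
# Figures quoted in the article: die area in mm^2, transistors in millions.
YORKFIELD = {"die_mm2": 214, "transistors_m": 820}
NEHALEM = {"die_mm2": 263, "transistors_m": 731}


def pct_change(old, new):
    """Percentage change going from old to new."""
    return (new - old) / old * 100


def density(chip):
    """Average transistor density in millions of transistors per mm^2."""
    return chip["transistors_m"] / chip["die_mm2"]


die_growth = pct_change(YORKFIELD["die_mm2"], NEHALEM["die_mm2"])
transistor_change = pct_change(YORKFIELD["transistors_m"], NEHALEM["transistors_m"])

print(f"Die area change:   {die_growth:+.1f}%")
print(f"Transistor change: {transistor_change:+.1f}%")
print(f"Yorkfield density: {density(YORKFIELD):.2f} M/mm^2")
print(f"Nehalem density:   {density(NEHALEM):.2f} M/mm^2")
```

The die area grows by roughly 23% while the transistor count drops by roughly 11%, so Nehalem's average density works out noticeably lower than Yorkfield's, which fits the cache-heavy versus logic-heavy explanation above.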
Now many of you are probably looking at the Nehalem die picture and saying "It is pretty, but what am I looking at exactly?" A valid question, so let's take a look at the Nehalem core's layout:
We are not actually going to start critiquing the pros and cons of this core layout (we aren't that knowledgeable), but it is still amazing that all four cores, a total of 256KB of L1 cache, a total of 1024KB of L2 cache, 8MB of L3 cache, one QuickPath interface, and the memory controller are on a single die.
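The L1 and L2 figures above are totals across all four cores. As a rough sketch, the Python below shows how they add up, assuming the per-core breakdown of 32KB instruction plus 32KB data L1 and 256KB of private L2, with the 8MB L3 shared by every core. That per-core split is our assumption rather than something stated on this page, and the names are purely illustrative.

```python
CORES = 4

# Assumed per-core cache sizes in KB (not stated on this page).
L1_INSTRUCTION_KB = 32
L1_DATA_KB = 32
L2_PER_CORE_KB = 256

# The L3 cache is shared by all cores rather than duplicated per core.
L3_SHARED_KB = 8 * 1024


def cache_totals(cores=CORES):
    """Return the total cache at each level, in KB, for a given core count."""
    return {
        "L1": cores * (L1_INSTRUCTION_KB + L1_DATA_KB),
        "L2": cores * L2_PER_CORE_KB,
        "L3": L3_SHARED_KB,
    }


totals = cache_totals()
for level, size_kb in totals.items():
    print(f"{level}: {size_kb} KB")
print(f"All levels: {sum(totals.values())} KB")
```

Under these assumptions the totals match the 256KB, 1024KB, and 8MB figures quoted above.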
Now as mentioned above, the Nehalem architecture is dynamically scalable, and that is because it was designed with modularity in mind. What this means is that Intel can create custom processors based on the needs of the market without having to design a brand new chip from scratch. They can add or remove cores and L3 cache, and they can change the number of QPI links, the number of memory channels, the type of memory supported, the power management, and even add integrated graphics. Therefore, Intel now has the ability to add new blocks to the core without having to go back to the drawing board and redesign the whole layout. Amazingly, they are only limited by how much stuff they can actually fit on one CPU package.
On the next page we will examine some of the more functional features that Intel has built into the Core i7 processors.