AMD Phenom X4 9750 Quad Core CPU Review
by AkG | August 26, 2008
Up Close and Personal

For a good portion of 2007, AMD was evangelizing to its faithful about the wonders the K10 architecture would bring “really soon now” and how it would bring a true “native” quad core to the desktop. Before we get into the nitty-gritty of what makes a Phenom a Phenom, let's dispel the myth that AMD was simply bull-headed about not putting out a cut ‘n’ paste job like the wonderful Intel Q6600 (which was nothing more than two dual-core dies linked together and put onto one package, but because it was one of the first quad cores on the market it was by default the best).

For a long while now AMD has had an integrated memory controller built right onto the die. This made for some amazingly low latency and high efficiency, but the downside is that to update the controller to take advantage of faster memory (e.g. DDR2-1066) AMD had to update the whole CPU. Worse still from a quad-core point of view, AMD couldn't just slap two X2s together and call it an X4, as the two memory controllers wouldn't play nice with each other. Intel's additional engineering resources (and its external North Bridge) allowed it to do this, but as AMD is a smaller company, a stopgap like that would more than likely have meant pulling people off the native quad core. Should AMD have done it anyway? This is open to debate, but it is too late now to cry over spilled milk.

Like the TLB erratum, that topic has been beaten to death, so we are just going to give you some of the important highlights of what makes the K10 (Phenom) a superior architecture compared to the older, venerable K8 (X2). The biggest, most obvious change is the sheer size of the K10: the new Phenom weighs in at a huge 463 million transistors and is an 11-metal-layer chip. This does add to the complexity of manufacturing the Phenom but does make it a solid-feeling chip in one's hands, one worthy of respect.
463 million transistors may seem like a lot, but it is still fewer than Intel is using; of course, most of those extra transistors in the Intel quads are taken up by the massive amount of cache and don't really count. AMD knows it cannot compete against Intel in the on-die cache arena, as Intel has more cash to spend (pardon the pun) and can simply outspend AMD on die shrinks (which free up room for more and more cache).

When you get beyond transistor counts and chip complexity, the big thing to remember is that when AMD's engineers sat down to design the K10 they didn't throw the baby out with the bathwater, so to speak: the X2s and X4s share some features yet are still very different CPUs. For example, both the older X2s and the newer Phenoms have 128KB of L1 cache per core. Some X2s only have 512KB of L2 cache per core, while others have 1024KB; this was very confusing for consumers, as you could easily get into a situation where two X2s running at the same clock speed were rated differently. To avoid confusion, all Phenoms come with 512KB per core. The biggest difference when it comes to cache is the inclusion of 2MB of relatively slow L3 cache shared by all cores.

One great improvement that was sorely needed was updating HyperTransport to version 3.0. HT 2 operated at a 5x multiplier of the 200MHz reference clock (1GHz); HT 3.0 runs at either a 9x multiplier (1.8GHz, also referred to as “18x”, since it is 9x up and 9x down at the same time) or a 10x multiplier (2GHz, or “20x”, twice the speed of HT 2). As we said, this extra boost was sorely needed: as anyone who has tried sticking a Phenom in an older HT 2-only motherboard knows, the four cores become starved for data under HT 2 bandwidth restrictions! The 9750 we are reviewing uses the 18x multiplier, as 20x is reserved for even faster models.
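The HyperTransport multiplier arithmetic above can be checked with a few lines of Python. This is only a rough sketch: the 200MHz reference clock and the 5x, 9x and 10x multipliers are the figures quoted in the text, while the names `ht_link_clock_mhz` and `speedup_over_ht2` are made up for this example, and no bandwidth figures are assumed.

```python
# Illustrative sketch: HyperTransport link clock = reference clock x multiplier.
# The 200 MHz reference clock and the multipliers are the figures quoted in the
# review; the function names are hypothetical and exist only for this example.

REFERENCE_CLOCK_MHZ = 200

HT_MULTIPLIERS = {
    "HT 2 (5x)": 5,
    "HT 3.0 (9x, a.k.a. 18x)": 9,
    "HT 3.0 (10x, a.k.a. 20x)": 10,
}


def ht_link_clock_mhz(multiplier, reference_mhz=REFERENCE_CLOCK_MHZ):
    """Return the link clock in MHz for a given multiplier."""
    return multiplier * reference_mhz


def speedup_over_ht2(multiplier, ht2_multiplier=5):
    """Return the ratio of a link clock to the 5x HT 2 link clock."""
    return ht_link_clock_mhz(multiplier) / ht_link_clock_mhz(ht2_multiplier)


if __name__ == "__main__":
    for label, mult in HT_MULTIPLIERS.items():
        print(f"{label}: {ht_link_clock_mhz(mult)} MHz, "
              f"{speedup_over_ht2(mult):.1f}x the HT 2 clock")
```

On these numbers the 9750's 9x link works out to 1800MHz, which is 1.8 times the HT 2 clock, while the 10x setting doubles it.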
As mentioned earlier, the “Northbridge”, or memory controller, is built right onto the die, and once again AMD took the opportunity to improve its performance by giving it native support for DDR2-1066. Also, unlike the older K8, which had a single 128-bit-wide controller, the K10 has two 64-bit-wide controllers which can either be “ganged” together to make a virtual 128-bit controller or left separate to operate independently of one another.

Another great improvement over the older architecture is SSE128 support. Whereas the older K8 could handle two 64-bit SSE operations per cycle, the newer K10 can handle two 128-bit SSE operations per cycle. On the surface this doesn't sound like a big deal, but you have to realize that when an older X2 is faced with a 128-bit SSE instruction it first has to split it into two 64-bit chunks and then run them. To put it another way, not only can the K10 handle more data, it is more efficient in how it handles it. SSE instructions are very prevalent in resource-intensive, time-constrained workloads (video encoding and decoding, for example) as they make things so much more efficient and quicker, and the inability to execute 128-bit SSE instructions at full width was a huge handicap for the older X2s when compared to Intel's C2D architecture. This is one area where AMD had to step up and improve or be relegated to low-end systems only.

This is just one of the numerous changes AMD made with regard to data handling; others include doubling the instruction fetch width from 16 bytes per cycle to 32 bytes. Combined, these changes should result in one slick, fast and (hopefully) competitive CPU. In general the K10 is an extremely well-designed architecture that unfortunately only catches up with Intel's older C2D quad line. Unlike the X2s, which devastated the competition (performance-wise) when they launched, the Phenom is more about evening things up than trying to surpass Intel.
This is actually a good thing, as Intel really did pull an Israeli rabbit out of its hat with the C2D. As we said earlier, AMD is a fighter, and when it needed to pull off an engineering miracle it dug in, took its hits and came out with an awfully impressive design which should only get better with time and a die shrink.
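To make the earlier SSE128 point a little more concrete, here is a toy Python model of a 128-bit packed add running on a 64-bit-wide unit versus a 128-bit-wide one. Everything in it is an assumption made for illustration (the `packed_add` function, the choice of four 32-bit lanes, counting “passes” instead of real clock cycles); it is not a description of how AMD actually implemented either core.

```python
# Toy model of a packed SSE add on four 32-bit lanes (128 bits in total).
# It is NOT a cycle-accurate model of K8 or K10; it only counts how many
# passes through an execution unit of a given width one instruction needs.

LANE_BITS = 32
VECTOR_BITS = 128


def packed_add(a, b, unit_width_bits):
    """Add two 4-lane vectors on a unit of the given width.

    Returns (result, passes), where passes is the number of chunks the
    instruction had to be split into to fit the unit.
    """
    lanes_total = VECTOR_BITS // LANE_BITS
    if len(a) != lanes_total or len(b) != lanes_total:
        raise ValueError("expected 4 lanes per operand")
    lanes_per_pass = unit_width_bits // LANE_BITS
    result = []
    passes = 0
    for start in range(0, lanes_total, lanes_per_pass):
        chunk_a = a[start:start + lanes_per_pass]
        chunk_b = b[start:start + lanes_per_pass]
        result.extend(x + y for x, y in zip(chunk_a, chunk_b))
        passes += 1
    return result, passes


if __name__ == "__main__":
    a, b = [1, 2, 3, 4], [10, 20, 30, 40]
    for name, width in (("64-bit unit (K8-style)", 64),
                        ("128-bit unit (K10-style)", 128)):
        result, passes = packed_add(a, b, width)
        print(f"{name}: result={result}, passes={passes}")
```

Both widths produce the same answer; the narrower unit simply needs two passes where the wider one needs a single pass, which is the efficiency gap described above.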