Feature Test: Memory Scaling
Back in mid-2003, Intel introduced the 875P 'Canterwood' chipset, which featured the company's first consumer-oriented implementation of the dual-channel memory interface. This configuration has served us well right up to the present day. However, with Nehalem, Intel have kicked things up a notch, unveiling a triple-channel memory configuration. With this teamed up with the new integrated DDR3 memory controller, we have very high expectations for the new combo.
<table align="center" border="0" bgcolor="#666666" cellpadding="5" cellspacing="1" width="735px">
<tr><td align="center" bgcolor="#cc9999" width="130"><b>Benchmark</b></td><td align="center" bgcolor="#cc9999" width="180"><b>Intel Core i7-965<br> - Triple Channel</b></td><td align="center" bgcolor="#cc9999" width="180"><b>Intel Core i7-965<br> - Dual Channel</b></td><td align="center" bgcolor="#cc9999" width="180"><b>Intel Core i7-965<br> - Single Channel</b></td></tr>
<tr><td align="center" bgcolor="#ececec" width="100"><b>3DMark Vantage: CPU Score</b></td><td align="center" bgcolor="#ececec" width="100">19546</td><td align="center" bgcolor="#ececec" width="100">19513</td><td align="center" bgcolor="#ececec" width="100">19484</td></tr>
<tr><td align="center" bgcolor="#ececec" width="100"><b>Valve Particle Simulation Benchmark</b></td><td align="center" bgcolor="#ececec" width="100">153</td><td align="center" bgcolor="#ececec" width="100">153</td><td align="center" bgcolor="#ececec" width="100">153</td></tr>
<tr><td align="center" bgcolor="#ececec" width="100"><b>Cinebench R10 1-CPU</b></td><td align="center" bgcolor="#ececec" width="100">4252</td><td align="center" bgcolor="#ececec" width="100">4240</td><td align="center" bgcolor="#ececec" width="100">4222</td></tr>
<tr><td align="center" bgcolor="#ececec" width="100"><b>Cinebench R10 Multi-CPU</b></td><td align="center" bgcolor="#ececec" width="100">18569</td><td align="center" bgcolor="#ececec" width="100">18368</td><td align="center" bgcolor="#ececec" width="100">18284</td></tr>
<tr><td align="center" bgcolor="#ececec" width="100"><b>x264 HD Benchmark</b></td><td align="center" bgcolor="#ececec" width="100">27.23 fps</td><td align="center" bgcolor="#ececec" width="100">27.21 fps</td><td align="center" bgcolor="#ececec" width="100">27.06 fps</td></tr>
<tr><td align="center" bgcolor="#ececec" width="100"><b>WinRAR 3.71 Compression (min:sec)</b></td><td align="center" bgcolor="#ececec" width="100">2:39</td><td align="center" bgcolor="#ececec" width="100">2:44</td><td align="center" bgcolor="#ececec" width="100">3:05</td></tr>
<tr><td align="center" bgcolor="#ececec" width="100"><b>SuperPI 1M</b></td><td align="center" bgcolor="#ececec" width="100">12.960s</td><td align="center" bgcolor="#ececec" width="100">13.061s</td><td align="center" bgcolor="#ececec" width="100">13.207s</td></tr>
<tr><td align="center" bgcolor="#ececec" width="100"><b>SuperPI 32M (min:sec)</b></td><td align="center" bgcolor="#ececec" width="100">12:04.755</td><td align="center" bgcolor="#ececec" width="100">12:08.356</td><td align="center" bgcolor="#ececec" width="100">12:57.356</td></tr>
</table>
We re-ran these benchmarks over and over again, because we were quite simply puzzled. However, as you can see, Nehalem is generally not memory bandwidth limited in these tests. In fact, the only time we noticed a significant performance difference between single- and triple-channel was during WinRAR compression, which is a completely memory-bandwidth-bound workload: the single-channel run took 3:05 against 2:39, roughly 16% longer. There is also an appreciable difference in SuperPI 32M (about 7%), but even there the difference is nowhere near as large as we would expect. This is a significant improvement over the Core microarchitecture, which would be hugely bottlenecked by a single-channel memory interface.
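To put the table in perspective, here is a minimal sketch that works out how much each result changes when dropping from triple- to single-channel. The figures are copied from the table; the helper names (`to_seconds`, `percent_change`) are our own and purely illustrative.

```python
# Figures copied from the table above as (triple-channel, single-channel).
# Timed tests are converted to seconds; for those, lower is better.

def to_seconds(stamp: str) -> float:
    """Convert 'm:ss' or 'm:ss.mmm' into seconds."""
    minutes, seconds = stamp.split(":")
    return int(minutes) * 60 + float(seconds)

def percent_change(triple: float, single: float) -> float:
    """Change from triple- to single-channel, as a percentage of triple."""
    return (single - triple) / triple * 100.0

results = {
    "3DMark Vantage CPU (score)": (19546, 19484),
    "Cinebench R10 Multi-CPU (score)": (18569, 18284),
    "WinRAR 3.71 (seconds)": (to_seconds("2:39"), to_seconds("3:05")),
    "SuperPI 32M (seconds)": (to_seconds("12:04.755"), to_seconds("12:57.356")),
}

for name, (triple, single) in results.items():
    print(f"{name}: {percent_change(triple, single):+.1f}%")
```

The scores move by about -0.3% and -1.5%, while the WinRAR and SuperPI 32M run times grow by about +16.4% and +7.3% respectively, which is why only those two stand out.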
Let's see if the synthetic tests can shed any light on this phenomenon:
In their native triple-channel memory configuration, the Core i7 processors produce some extremely impressive bandwidth numbers. In fact, the results demonstrate that at DDR3-1066 the Core i7 has nearly 100% more memory bandwidth than the previous Core 2 architecture running dual-channel DDR2-1066. Even in dual-channel mode, the Core i7 series have almost 90% more bandwidth than previous processors. In single-channel mode, bandwidth drops precipitously, but it is still equivalent to a dual-channel DDR2 interface. Clearly, Intel have done a tremendous job designing Nehalem's memory architecture.
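For reference, the theoretical ceiling of each configuration can be sketched with simple arithmetic, assuming the standard 64-bit (8-byte) DDR channel and the nominal 1066 MT/s transfer rate. These are peak figures only, not our measured results, and the function name is our own.

```python
BYTES_PER_TRANSFER = 8  # one 64-bit DDR channel moves 8 bytes per transfer

def peak_bandwidth_gb_s(mega_transfers_per_s: float, channels: int) -> float:
    """Theoretical peak in GB/s (10^9 bytes/s); measured results will be lower."""
    return mega_transfers_per_s * 1e6 * BYTES_PER_TRANSFER * channels / 1e9

for channels in (1, 2, 3):
    print(f"DDR3-1066 x{channels}: {peak_bandwidth_gb_s(1066, channels):.1f} GB/s")
```

This prints roughly 8.5, 17.1 and 25.6 GB/s for one, two and three channels, so each added channel raises the ceiling by the same amount even though the real-world benchmarks barely move.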
On the latency front, we also see some tremendous improvements. The integrated memory controller and triple-channel memory combine to produce the lowest latency numbers that we have ever seen. Interestingly, there is clearly a latency penalty in triple-channel mode, since latency is 15% lower in dual-channel mode.
What can we say? From a real-life perspective, the triple-channel memory configuration does not really appear to be necessary for the consumer market. Having said that, our sample of benchmarking applications was fairly limited for this test, so it is too early to make a definitive judgement. However, when you consider the mighty Everest results, it is clear that the triple-channel DDR3 memory and integrated memory controller are a tremendously potent combination, and we look forward to seeing some insane new DDR3 bandwidth world records in the coming weeks.