Intel Haswell i7-4770K & i5-4670K Review

SKYMTL

HardwareCanuck Review Editor
Staff member
Joined
Feb 26, 2007
Messages
12,840
Location
Montreal
Haswell has been on the minds of many enthusiasts from the moment it crawled onto Intel's roadmaps. This architecture is supposed to serve as a rugged and adaptable platform for both the mobile and desktop spaces over the next two years or so. In other words, Intel has a lot riding on Haswell and they've engineered it in such a way that its benefits will be spread across multiple product categories.

In many ways Haswell still represents an adherence to Intel’s longstanding ideal of being king of the desktop market. However, as consumers' needs have evolved, Intel has been forced to rethink everything from positioning to power consumption, thus affecting the very architecture itself. Build-it-yourself, upgradeable computers are slowly fading into an enthusiast-only segment and ultra portable computing, from tablets to Ultrabooks, is beginning to take over. This has necessitated a shift in ideologies which will ultimately have a significant impact upon the viability of certain form factors, and Haswell is the first step towards addressing these changing conditions.

INTEL-HASWELL-1.png

With Haswell, Intel’s famous “Tick, Tock” product cycle is taking its next logical step. Unlike the Sandy Bridge / Ivy Bridge transition last year, this time a new architecture is being introduced so many will expect performance improvements which weren’t possible with a simple generational refresh. Naturally, there are some other goodies being thrown in as Intel tries to add a feature set that balances the needs of mobile users with the demands of higher performance desktop clients.

The “Tick” part of this equation represents a move towards more advanced process technology using a refresh of existing architecture. You can count this as a proving ground before Intel introduces a new generation of products. For example, Ivy Bridge took the 22nm approach after Sandy Bridge’s 32nm outing.

The Tock meanwhile denotes a new architectural evolution which uses an existing and therefore properly refined manufacturing process. In this case Haswell (the tock of Intel’s current generation equation) uses the 22nm Tri Gate transistors which proved to be the lynchpin in Ivy Bridge’s evolutionary steps forward.

Haswell also incorporates a number of key processing features which will distinguish its design from that of its predecessors. However, this isn’t a new ground-up architectural revision that throws the baby out with the bath water. Rather, it takes many of Sandy Bridge’s basic design elements and combines them with IPC improvements and new instruction sets tailored for today’s computing environments. The real focus here has been on power consumption, something that has always been a concern for mobile and desktop users alike.

Remaining ahead in the x86 race is still key to Intel’s success. However, in order to achieve the efficiency demanded by certain emerging markets, it has become increasingly necessary to leverage secondary processing means to achieve an optimal balance between performance and power consumption. As we have been hearing from AMD for several generations now, leveraging graphics cores for highly parallelized workloads leads to performance gains without substantial power needs. By using this same approach, Intel has packed additional high performance instruction sets and expanded graphics capabilities alongside their x86 processing modules. When combined, these factors should allow Haswell to remain well ahead of AMD’s APUs in standard processing tasks while partially closing the rather large gap in graphics horsepower.

INTEL-HASWELL-6.png

Even though there will be a wide variety of desktop SKUs, Intel’s goals for Haswell reflect the market’s changing priorities. As such, this is very much a mobile-focused architecture which had the lion’s share of its development time prioritized towards improving battery life, performance per watt and a number of other items in an effort to improve platform efficiency.

On the surface, it may look like Haswell won’t make a noticeable impact in the desktop market. However, the aforementioned improvements have allowed Intel’s engineers to cram additional performance into lower TDP parts. This will allow better throughput at all levels but more importantly, it will lead to the creation of efficient, feature rich quasi-desktops which Intel calls “Adaptive All in One” PCs.

The A-AIO PC is where Intel believes the desktop market is heading this year and into the future. Currently, we’re seeing traditional stationary computing gradually shift towards a more integrated experience with a touch interface replacing the typical mouse / keyboard combination. While there has been a massive amount of pushback against this direction (look no further than Windows 8’s abysmal sales numbers), most consumers want to avoid the clutter and hassle of traditional desktops. Adaptive All in One PCs take this to the next level by combining the performance of a desktop into a thinner form factor that’s somewhat portable so it can be taken from one room of your house to another. For gaming, content creation and other tasks, standard desktops will stick around for the foreseeable future but A-AIO will provide yet another opening for Haswell’s more energy efficient modes.

INTEL-HASWELL-9.jpg

Since Haswell focuses on balancing power consumption with better per-thread performance, it should be the perfect choice for users who want to upgrade from older Core 2 Duo setups. Its platform also incorporates a number of key connectivity enhancements like native SATA 6Gbps, Thunderbolt and USB 3.0 support. However, Sandy Bridge and Ivy Bridge users may find very few reasons to actually upgrade their desktops.

As with the transition from Nehalem to Sandy Bridge, Haswell will require a new socket type (1150) which will necessitate the purchase of a motherboard alongside a CPU. However, the main reason for putting off an upgrade this time around may once again boil down to performance improvements, or a lack thereof.

Even though Haswell incorporates a number of features which will enhance (in some cases dramatically) per-thread performance, expectations have to be tempered since this won’t be backed up by clock speed differences. As we will see on the upcoming pages, the high end 4770K will run at the exact same frequencies as the outgoing 3770K so any boost in benchmark scores will be solely granted through the architectural improvements.

INTEL-HASWELL-2.png
INTEL-HASWELL-8.png

Even after substantial unveilings at IDF 2012 and IDF 2013, today doesn’t actually represent the full launch of Intel’s newest architecture. We’re allowed to talk about quad core parts while the dual core information will have to wait until June 3rd.

So what does this all mean for today’s review? Absolutely nothing since in it we will be looking at two different processors: the i7 4770K and the less expensive yet still extremely capable i5 4670K. With unlocked multipliers and a bevy of other overclocker-friendly features, both of these target the enthusiast market and are priced accordingly at $340 and $240 respectively. The only real difference is the i7 4770K’s ability to process up to eight concurrent threads while its i5 flavored sibling can only access four.

Regardless of your opinion of the desktop market’s current position, Haswell seems to incorporate all of the features one would expect within a modern CPU. Efficiency, graphics processing, extension support and many other areas have been addressed but will this be enough to satisfy everyone?
 
Inside the Haswell Architecture

Inside the Tock; Haswell's Architecture


In previous generations, Intel took a relatively modest approach towards evolving their processor architecture. While there were several key components added and improved during the transition between Nehalem and Sandy Bridge, the move from Sandy Bridge / Ivy Bridge to Haswell is being accomplished via what Intel calls “energy efficient performance” directives.

When distilled down into its various components, this methodology is allowing Haswell to process more instructions per clock than Sandy Bridge’s Ivy Bridge derivative without an increase in clock speeds or power consumption. This has been accomplished through a number of architectural changes, though the basic unified design premise from the previous generation has been carried over nearly untouched.

INTEL-HASWELL-16.jpg

Like Ivy Bridge, Haswell uses Intel’s advanced 22nm Tri-Gate 3D transistor technology, essentially stacking the transistor gates, pipelines and silicon substrate across three dimensions in order to save wafer space. Not only does this allow all of Haswell’s 1.4 billion transistors to be packed into a small 177mm² die but it also ensures that leakage is kept to a minimum, thus boosting efficiency.

There haven’t been any substantial changes to the die layout either. Haswell processors still have up to four x86 processing cores, each of which can run two concurrent threads (provided Hyper Threading is enabled) under the same roof as a Processor Graphics engine, System Agent and Display I/O Engine. Primary PCI-E lanes, video codec engines and the memory controller share the space as well. There is also up to 8MB of L3 cache which is dynamically shared between the processor’s cores and the graphics engine.

INTEL-HASWELL-13.png

Usually, when one thinks of efficiency, power consumption comes to mind but Intel has been trying to address on-chip efficiency as well. In other words, they set out to improve the communication effectiveness between Haswell’s independent processing stages. This was accomplished through better branch prediction, adding resources for improved single thread performance, optimizing bandwidth in various areas and higher amounts of parallelism, all without making substantial changes to the architecture’s primary building blocks.

The first step of this process was to add support for FMA3 instructions through a pair of FMA processing units, thus doubling peak double precision throughput when compared to Ivy Bridge. Support for AVX2 instructions was also added, and between the two Haswell can process up to 32 single precision FLOPs per cycle versus Ivy Bridge’s 16 FLOPs. On paper one might assume this won’t affect everyday applications but these new instruction sets could lead to a drastic performance increase in games and HPC-centric workloads.
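For readers who want to see where the 32 versus 16 FLOPs figure comes from, here is a rough sketch of the per-core arithmetic. The `VectorUnits` class and its field names are made up for illustration, and the Ivy Bridge entry assumes one 256-bit add unit plus one 256-bit multiply unit issuing every cycle; treat it as a back-of-the-envelope model rather than anything out of Intel's documentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VectorUnits:
    """Illustrative model of one core's peak floating point issue resources."""
    name: str
    units: int           # vector FP units that can issue every cycle
    vector_bits: int     # width of a vector register in bits
    flops_per_lane: int  # 2 for a fused multiply-add, 1 for a plain add or multiply

    def peak_flops_per_cycle(self, element_bits: int) -> int:
        # Lanes per register times units times operations per lane.
        lanes = self.vector_bits // element_bits
        return self.units * lanes * self.flops_per_lane


# Assumption: one 256-bit add unit and one 256-bit multiply unit per core.
ivy_bridge = VectorUnits("Ivy Bridge", units=2, vector_bits=256, flops_per_lane=1)
# A pair of 256-bit FMA units, each lane performing a multiply and an add.
haswell = VectorUnits("Haswell", units=2, vector_bits=256, flops_per_lane=2)

for core in (ivy_bridge, haswell):
    sp = core.peak_flops_per_cycle(32)  # single precision
    dp = core.peak_flops_per_cycle(64)  # double precision
    print(f"{core.name}: {sp} SP FLOPs/cycle, {dp} DP FLOPs/cycle")
```

These are theoretical peaks per core. Real code only gets close when it is vectorized and actually uses fused multiply-adds.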

Intel also added two more ports to Haswell’s URS (Unified Reservation Station), one of which includes a fourth ALU meant to streamline the first two ports’ workflow and a specialized Branch Unit which improves inter-stage communications. The second new port houses a dedicated Address Generation Unit for store operations, leaving ports 2 and 3 open primarily for loads.

INTEL-HASWELL-14.png

In order to cope with the potential deluge of compute performance improvements, Haswell’s back-end caching structure has been given a facelift. The load and store L1 bandwidth has been doubled which prepares the architecture for the higher throughput from the aforementioned AGU. With this taken into account, the communication structure between the L1 and unified L2 caches also needed some shoring up so Intel boosted its bandwidth to 64 bytes per cycle while the unified L2 TLB received a significant increase as well.

So what does this mean in plain English? While the organization and size of Haswell’s primary cache structure is identical to the previous generation, the improved bandwidth will help both legacy and new code perform at higher levels without the need for increased clock speeds or major architecture changes.
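To put the 64 bytes per cycle figure into more familiar units, the quick calculation below converts it into GB/s. It assumes the L1 to L2 path runs at core clock and borrows the i7 4770K's 3.5GHz base and 3.9GHz turbo frequencies quoted later in this review; the function name is just a placeholder.

```python
def bandwidth_gb_per_s(bytes_per_cycle: int, clock_ghz: float) -> float:
    """Peak bandwidth in GB/s: bytes moved per cycle times billions of cycles per second."""
    return bytes_per_cycle * clock_ghz


L2_TO_L1_BYTES_PER_CYCLE = 64  # figure quoted in the text

# Assumption: the cache path is clocked with the core.
for label, clock_ghz in (("base", 3.5), ("turbo", 3.9)):
    gbps = bandwidth_gb_per_s(L2_TO_L1_BYTES_PER_CYCLE, clock_ghz)
    print(f"{label} ({clock_ghz} GHz): {gbps:.1f} GB/s per core")
```

That works out to 224 GB/s per core at base clock, which is a theoretical ceiling rather than something a benchmark will actually report.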

INTEL-HASWELL-12.png

Haswell’s actual power efficiency features have gone through a few revisions too. Turbo frequencies have been fine tuned by using finer grain voltage adjustments while an optimized CPU to PCH link reduces the amount of power needed for communications between the two chips. Meanwhile, idle power has been substantially improved by incorporating new C-States, manufacturing process optimizations and enhanced power gating when the system is at idle.

With their newest architecture, Intel didn’t need to reinvent the wheel in order to optimize performance. Instead they decided to focus on optimizations which will improve IPC in some key areas while leaning on a mature manufacturing process and efficient on-chip communications to reduce power consumption. On paper at least, this should lead to a 10-20% improvement over Ivy Bridge.
 

Haswell’s GT1, GT2 & GT3 Graphics Engines


With the computing market gradually shifting to more 3D content and programs beginning to take advantage of GPU compute, Intel knew it was time to step up their graphics game. Sandy Bridge and Ivy Bridge were a good start since they incorporated advanced processing techniques with an architecture that could play basic games while working quite well in OpenCL environments.

Haswell continues this tradition but in some cases, it takes Intel’s graphics power to a whole new level in an entirely unexpected way. Instead of pairing higher end K-series and mobile MX-series processors with better graphics engines, it will be the mid tier and some low voltage SKUs which will receive them. This is a completely understandable move considering enthusiast-grade CPUs are typically used alongside discrete GPUs.

INTEL-HASWELL-23.jpg

Unlike in previous generations, Haswell’s Processor Graphics will be sectioned off into four distinct categories. At the lowest tier will be the Intel HD Graphics (GT1) which is laid out identically to Sandy Bridge’s architecture, but comes equipped with 10 Execution Units (EUs) / shaders (four more than on Ivy Bridge) plus a single Texture Unit. It will be used for entry level processors.

The HD4600, HD4400 and HD4200 (or GT2) processor graphics will be the most prevalent across a wide variety of CPUs and incorporate 20 EUs alongside two Texture Processors. This represents a slight upgrade over Ivy Bridge but the performance of each Execution Unit has been drastically improved, leading to some significant performance advantages for Haswell.

The real stars of this show will be the Iris 5100 and Iris Pro 5200 graphics processors. These provide a truly next generation approach for Intel by doubling up on the specifications of the GT2 HD4600, offering 40 EUs and a quartet of texture units. Clock speeds will also be a bit higher (though this may change if they're used in a low voltage system) and the Iris Pro 5200 adds up to 128MB of embedded fast access DRAM for improved performance. In some ways, Intel is betting big on Iris in an effort to blunt AMD’s aggressive moves into the mobile space.

All of these new and improved Processor Graphics Engines also incorporate support for DX11.1, OpenCL 1.2 and OpenGL 4.0, potentially opening up a new dimension of performance improvements.

INTEL-HASWELL-17.png

From an architectural perspective, not much has changed between the Processor Graphics of Sandy Bridge and the layout we now have within Haswell. All of the primary processing stages have remained identical and the communication pathways follow a very similar direction.

The front-end houses Geometry Engines (which includes the shaders and other fixed function units) along with a dedicated Command Streamer and setup section while the so-called sub-Slice houses the Shaders / EUs, instruction caches and various other samplers and data interconnects. Finally, the Slice Common includes items like the Rasterizer, L3 cache and Pixel Instruction module which are shared between the sub-Slices. This layout can now be scaled upwards to create additional graphics processor groups.

While similarities between Sandy Bridge and Haswell abound, Intel has incorporated several new elements into their latest design to aid with graphics workloads. First and foremost, the Command Streamer now has a dedicated resource handler which offloads processing calls that would normally be processed by the driver. In order to meet the needs of Iris and Iris Pro, the performance of front end fixed function stages has been doubled, essentially giving the architecture room to grow. Finally, the Texture Sampler within each sub-Slice can now handle four times more throughput. This has all been done in an effort to prepare Haswell’s Processor Graphics for its new scaling capabilities.

INTEL-HASWELL-21.png

As we already mentioned, the Iris 5000-series / GT3 uses substantially more shaders and Texture Units than its predecessors but the underlying architecture of lower-end parts remains the same. Intel simply added more Slices to their HD4000 parts to create a type of Frankenstein graphics processor. This was accomplished by literally grafting more Slices onto the PGU’s back-end fixed function units.

In layman’s terms a “Slice” consists of the sub-Slice and Slice Common and is a rough analog to AMD’s SIMD Array or NVIDIA’s SMX. Basically, it is a grouping of graphics compute elements into a cohesive processing stage which can act independently if need be. As AMD and NVIDIA do when creating new SKUs, the number of Slices can be scaled up and down based on the situation in which the processor finds itself.
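The scaling idea can be sketched in a few lines. The model below assumes each sub-Slice carries 10 EUs and one texture unit (inferred from the GT1 figures quoted earlier) and that a full Slice holds two sub-Slices; the class and constant names are hypothetical and only there to show how the GT1, GT2 and GT3 numbers line up.

```python
from dataclasses import dataclass

EUS_PER_SUBSLICE = 10      # assumption inferred from the GT1 figures in the text
SAMPLERS_PER_SUBSLICE = 1  # assumption: one texture unit per sub-Slice


@dataclass(frozen=True)
class GraphicsConfig:
    """Hypothetical description of a Processor Graphics tier."""
    name: str
    slices: int
    subslices_per_slice: int

    @property
    def execution_units(self) -> int:
        return self.slices * self.subslices_per_slice * EUS_PER_SUBSLICE

    @property
    def texture_units(self) -> int:
        return self.slices * self.subslices_per_slice * SAMPLERS_PER_SUBSLICE


configs = [
    GraphicsConfig("GT1", slices=1, subslices_per_slice=1),
    GraphicsConfig("GT2", slices=1, subslices_per_slice=2),
    GraphicsConfig("GT3", slices=2, subslices_per_slice=2),
]

for cfg in configs:
    print(f"{cfg.name}: {cfg.execution_units} EUs, {cfg.texture_units} texture units")
```

Under those assumptions the output matches the 10 / 20 / 40 EU and 1 / 2 / 4 texture unit counts listed for the three tiers.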

INTEL-HASWELL-22.png

Perhaps the most important addition to this graphics architecture is the way Haswell handles ring caching between its various processing stages. Instead of having Processor Graphics tied at the hip to the CPU’s needs, Intel has decoupled the caching hierarchy, allowing the graphics engine to have dedicated pipelines for independent access to the Last Level Cache. This should help improve scheduling, processing speed and data access. This so-called “ring” also helps the flow of data between the other processing stages and is independently overclockable at a level equal to or lesser than the cores.

In every conceivable way, Haswell’s new Processor Graphics Engines are an improvement over the previous generation’s often tepid offering. They include more shaders and higher clock speeds but the crowning achievement is their ability to scale upwards. While it looks like the revised HD4000 parts will post moderate gains, Iris and Iris Pro could help usher in a whole new era for Intel.
 

New Video Engine Features


With every new generation, Intel has done everything within their power to ensure their video playback features remain up to date. This has meant a gradual evolution of the ubiquitous Intel Media SDK which improves hardware accelerated high definition media playback alongside the implementation of dedicated processing engines for certain types of content. Haswell moves these elements a few steps further afield by attacking areas where the previous generations were found to be lacking.

INTEL-HASWELL-18.png

One of the newest additions to Haswell is the so-called Video Quality Engine or VQE which serves as a one-stop-shop for video processing functions. Essentially, this engine operates separately from the primary graphics stages but has the capability to add hardware-accelerated post processing effects into HD video content. In previous generations, these functions were processed through software, wasting valuable resources (and battery life) and potentially lowering system responsiveness. The VQE also adds support for Gamut Expansion, Intel’s Skin Tone Tuned Image Enhancement and Image Stabilization.

While Haswell’s Multi Format Codec isn’t necessarily new, some future-proofing features have been added such as hardware acceleration for Scalable Video Coding (SVC), MJPEG decoding and support for native MVC short format. Native support for 4K resolutions has been added as well with the HDMI output ready to handle up to 4096x2304 @ 24Hz while the DisplayPort 1.2 port can go up to 3840x2160 @ 60Hz.
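As a rough way of comparing those two 4K modes, the snippet below works out how many visible pixels each output has to push per second. It deliberately ignores blanking intervals and color depth, so these are not true link bandwidth numbers, and the helper name is only illustrative.

```python
def active_pixel_rate(width: int, height: int, refresh_hz: int) -> int:
    """Visible pixels drawn per second, ignoring blanking intervals."""
    return width * height * refresh_hz


hdmi_mode = active_pixel_rate(4096, 2304, 24)          # HDMI maximum quoted above
displayport_mode = active_pixel_rate(3840, 2160, 60)   # DisplayPort 1.2 maximum quoted above

print(f"HDMI 4096x2304 @ 24Hz:        {hdmi_mode / 1e6:.1f} Mpixels/s")
print(f"DisplayPort 3840x2160 @ 60Hz: {displayport_mode / 1e6:.1f} Mpixels/s")
print(f"DisplayPort / HDMI ratio:     {displayport_mode / hdmi_mode:.2f}x")
```

The DisplayPort mode moves a bit more than twice the pixels per second, which helps explain why 60Hz at 4K is only listed for that output.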

INTEL-HASWELL-20.png

The primary use of these new dedicated video processing stages is to dynamically lower the power consumption of Haswell when it’s processing HD content. These dedicated and highly optimized engines do take advantage of the efficient 22nm manufacturing process to increase efficiency but their primary use is to take some of the heavy lifting away from the Processor Graphics. In many cases, they will act in parallel, sharing the video processing load while improving performance and lowering processor usage.
 

Z87; Today’s Enthusiast Platform


As Intel has gradually moved from one generation to the next, their “Tock” cycle always brings about a platform change. While many rail at the necessity of changing their motherboard when upgrading to a brand new architecture, certain microcode, silicon, feature and package revisions make it unavoidable. In the case of Haswell, Intel has instituted a socket with 1150 pins, thus making these processors incompatible with past motherboard designs. On the positive side of the coin, 1150 will stick around for next year’s Broadwell so there is a possible upgrade path just over the horizon.

INTEL-HASWELL-26.jpg

A number of different Lynx Point chipsets have been created for 1150, most of which follow closely in the footsteps of Ivy Bridge’s Panther Point, more commonly known as the 7-series. The Z87 chipset will be the flagship product targeting enthusiasts with the B85 coming in at the opposite end of the spectrum due to its focus on delivering a secure environment for business clients. Between these two extremes lie a number of mainstream-oriented chipsets which include the H87, Q87 and Q85. For the purposes of this article, we will be focusing on the Z87.

Before we get too far into this section, it is important to mention that while Z87 does include a number of new features, the differences between it and Z77 aren’t extreme. Rather, Intel’s primary focus was on refinement, fine tuning several key areas to ensure higher levels of efficiency and better connectivity integration. For example, the new AHCI interface uses enhanced link management that incorporates support for native hot plugging. Native Command Queuing has been added as well, improving boot times and multi tasking performance.

INTEL-HASWELL-25.png

Other than some not-so-evident features, Intel has rolled a number of primary functionality changes into the 8-series motherboards. The most important for enthusiasts will likely be the addition of four native SATA 6Gbps ports for a total of six, which will allow them to create much larger SSD RAID arrays without having to fall back to SATA 3Gbps or third party interfaces. The number of USB 3.0 ports has also been upped to six and there’s native support for Thunderbolt, though most motherboard vendors will leave that I/O interface’s needs to a separate controller. More importantly, those USB ports are now controlled through the Extensible Host Controller Interface, providing a level of efficiency and performance that was unheard of previously.

Intel has finally removed legacy PCI support from all of their chipsets. However, this doesn’t mean a complete elimination of PCI slots since motherboard vendors can still dedicate PCI-E lanes towards PCI expansion compatibility.

INTEL-HASWELL-24.png

Not much has really changed with the Z87’s interfaces other than the aforementioned I/O upgrades. The Haswell processor still acts as a controller for the main external graphics interface, boasting up to 20 native PCI-E 3.0 lanes, though only 16 can be enabled on consumer chipsets. Some of these can also be dedicated towards native Thunderbolt support but that won’t happen all that often as enthusiast motherboards will likely dedicate all 16 lanes to graphics processing.

Either one external graphics card can communicate with the CPU through 16 dedicated lanes or they can be split into a pair of Gen 3 x8 interfaces (each with the bandwidth of a single Gen 2 x16 slot) for SLI or Crossfire support. Triple Crossfire is also possible though with the PCI-E lanes configured at x8/x4/x4. Alternately, eight of the lanes can be directed towards other purposes like providing bandwidth to a secondary Thunderbolt controller but this will also sacrifice multi GPU compatibility unless a PLX switch is used.

As with Z77, the Processor Graphics communicates with and ultimately outputs its display signals to the PCH via the FDI or Flexible Display Interface. This runs in parallel with the DMI interface, a link between the CPU and the PCH that features four lanes in each direction that can operate at speeds of up to 2 GB/s. This results in 4 GB/s of aggregate bandwidth if both upstream and downstream lanes are used to their theoretical maximum.
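The lane math above is easy to check. The sketch below uses the standard per-lane signalling rates and line encodings for PCI-E Gen 2 (5 GT/s, 8b/10b) and Gen 3 (8 GT/s, 128b/130b), and assumes the four DMI lanes behave like Gen 2 lanes. The function and table names are placeholders, and packet or protocol overhead is not modelled.

```python
# Per-lane signalling rate in GT/s and line-encoding efficiency per generation.
PCIE_GENERATIONS = {
    2: (5.0, 8 / 10),     # 8b/10b encoding
    3: (8.0, 128 / 130),  # 128b/130b encoding
}


def link_bandwidth_gb_per_s(generation: int, lanes: int) -> float:
    """Usable bandwidth in one direction, in GB/s, before protocol overhead."""
    transfer_rate, efficiency = PCIE_GENERATIONS[generation]
    return transfer_rate * efficiency * lanes / 8  # 8 bits per byte


gen3_x8 = link_bandwidth_gb_per_s(3, 8)
gen2_x16 = link_bandwidth_gb_per_s(2, 16)
# Assumption: DMI's four lanes signal like PCI-E Gen 2 lanes.
dmi_one_way = link_bandwidth_gb_per_s(2, 4)

print(f"Gen 3 x8:  {gen3_x8:.2f} GB/s per direction")
print(f"Gen 2 x16: {gen2_x16:.2f} GB/s per direction")
print(f"DMI x4:    {dmi_one_way:.2f} GB/s per direction, {2 * dmi_one_way:.2f} GB/s aggregate")
```

A Gen 3 x8 link lands within a couple of percent of a Gen 2 x16 slot, and the DMI result lines up with the 2 GB/s per direction and 4 GB/s aggregate numbers quoted above.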

Additional PCI-E 2.0 ports are available through the PCH but these are typically used for communications with secondary / third party controllers rather than for graphics.

There are however some minor changes in regards to multiple display outputs and connectivity features. Instead of triple screen support being run through the chipset, all signals are now processed and directed by the onboard processor graphics, eliminating latency and improving display quality. In addition, Intel’s Smart Connect and Rapid Start technology support has been added to the Z87 chipset.

INTEL-HASWELL-27.jpg

For all of the testing in this review we will be using an MSI Z87-G45 Gaming motherboard which represents a mid-tier Haswell companion that is well priced at just $159 yet comes with a litany of features. Triple Crossfire support via the aforementioned x8/x4/x4 lane configuration (standard SLI and Crossfire support at x8/x8 has also been included), specialized high precision USB and PS/2 gaming ports and onboard Killer Ethernet have all been added.

INTEL-HASWELL-29.jpg

By far the most interesting feature here is the integrated mSATA port, a perfect companion for Intel’s new 525-series drives. In most situations this port will likely be used for a cache drive for improved system responsiveness but it can also be used as a primary drive connector.
 
Intel’s Desktop Haswell Lineup Revealed

Intel’s Quad Core Desktop Haswell Lineup Revealed


For Intel, Haswell represents an opportunity to refresh their entire lineup with new, more efficient offerings. Due to the architectural changes, these new processors will prove to be a boon for the notebook market but the desktop segment certainly hasn’t been sold short.

Initially Intel will be launching no fewer than thirteen different SKUs which are spread across multiple product categories. The K-series will remain atop the stack and target enthusiasts with unlocked multipliers and high clock speeds. Low voltage S and T series processors have been carried over as well with the T-series being specifically geared towards situations which require an ultra low TDP processor.

INTEL-HASWELL-7.jpg

While Intel’s logos have received subtle changes, Haswell’s branding will closely follow the path set by other Core-series products. The i3, i5 and i7 designations have remained in place along with an indication of whether or not a certain SKU supports the vPro feature set. There is a minor exception this time around: at least on the desktop, dual core parts won’t be launching just yet as Ivy Bridge i3 and Pentium processors will remain around to flesh out Intel’s entry level offerings. Some dual core SKUs equipped with Hyper Threading will be available soon but we can’t talk about those just yet.

INTEL-HASWELL-40.jpg

The only real difference between the 2013 Haswell lineup and previous generations is the inclusion of a dedicated R-Series. These parts place a desktop CPU in a BGA package, which won’t allow for upgrades through a common socket. Rather, BGA processors will be fused directly with their associated motherboard. This has allowed Intel to incorporate both the CPU die and the chipset into a common package, potentially boosting interconnect speeds. It has also allowed for a consolidation of processing cores, graphics capabilities, primary chipset functions and in some cases eDRAM onto one package.

Another hallmark of these BGA processors will be the incorporation of Iris 5000-series processor graphics. As we saw in the last few pages, Iris brings a massive performance increase to the table, particularly when embedded DRAM is used with the Pro iteration.

For now, these BGA processors focus on the system integrator and small form factor desktop markets but expect Broadwell, Skylake and other architectures to further adopt the BGA package for desktop uses.

INTEL-HASWELL-3.png

Sitting atop Intel’s Haswell lineup is the i7 4770K processor which operates at a base frequency of 3.5GHz with the capability to hit 3.9GHz when the right situations present themselves. These are actually the exact same frequencies as the i7 3770K reached and indeed, many of the specifications you see above closely mirror those found on Ivy Bridge processors. With the exception of the BGA-based i7 4770R, even the 8MB of L3 cache has been carried over to every Haswell 4 core, 8-thread processor.

With a price of about $339 when purchased in 1000 unit quantities, the 4770K also happens to be the most expensive Haswell CPU currently available, despite its lack of vPro, TXT and other advanced features. It does however offer a slightly higher maximum graphics frequency.

Looking at these figures, we’ve gone for more than two years without a substantial increase in processor frequencies. While AMD’s new Richland processors will be easily able to hit the 4GHz mark, Intel’s modern day architectures have thus far struggled to surpass that barrier. The exact reason for this stagnation is up for debate but it seems like Intel’s engineers have been continually erring on the side of efficiency rather than enhanced performance.

The rest of Intel’s i7 4700-series lineup contains the non-K branded i7 4770, which operates at a slightly lower base clock but incorporates vPro, and the 4765T, a processor that can process up to eight threads but has a TDP of just 35W. Finally, the low wattage category is rounded out by the i7 4770S and i7 4770T, both of which allow for better performance than the 4765T. In order to meet their strict TDP requirements, all of these low voltage variants operate at substantially lower clock speeds than the 4770K and are thus targeted at completely different markets.

INTEL-HASWELL-4.png

Going slightly down-market we come to the i5 4600-series, a group of quad core processors which ship without Hyper Threading and tend to be substantially less expensive than the 8-threaded i7’s. Once again a K-series SKU is the headliner with the unlocked i5 4670K. This $242 CPU operates at nearly the same frequencies as its i7 sibling but it incorporates only 6MB of L3 cache while maintaining the HD4600 processor graphics from higher end products. As with the i7 4770K, it doesn’t come equipped with vPro, TXT or Intel’s SIPP technology.

The other processors in this segment follow the same formula as the higher end bracket with a non K-series CPU mirroring the specifications of the i5 4670K and low voltage T/S products rounding out the lineup. The only things missing from the i5 are an R-series part and an ultra low voltage 35W processor similar to the i7 4765T. As you might expect, all of these are priced substantially lower than i7’s while the 4670K commands a $30 premium over its next closest sibling.

INTEL-HASWELL-5.png

Bringing up the rear of Intel’s Haswell lineup is a pair of additional i5 processors, the 4570 and 4570S. These are both quad core CPUs without Hyper Threading which primarily target system integrators, and they’ll soon be backstopped by a dual core, quad thread SKU. As with most other desktop parts, they incorporate Intel’s HD4600 processor graphics along with 6MB of L3 cache and support for up to 1600MHz memory speeds.
 


Test Setups & Methodology


For this review, we have prepared a number of different test setups, representing many of the popular platforms at the moment. As much as possible, the test setups feature identical components, memory timings, drivers, etc. Aside from manually selecting memory frequencies and timings, every option in the BIOS was at its default setting.

INTEL-HASWELL-300.jpg

For all of the benchmarks, appropriate lengths are taken to ensure an equal comparison through methodical setup, installation, and testing. The following outlines our testing methodology:

A) Windows is installed using a full format.

B) Chipset drivers and accessory hardware drivers (audio, network, GPU) are installed.

C) To ensure consistent results, a few tweaks are applied to Windows 7 and the NVIDIA control panel:
  • UAC – Disabled
  • Indexing – Disabled
  • Superfetch – Disabled
  • System Protection/Restore – Disabled
  • Problem & Error Reporting – Disabled
  • Remote Desktop/Assistance – Disabled
  • Windows Security Center Alerts – Disabled
  • Windows Defender – Disabled
  • Screensaver – Disabled
  • Power Plan – High Performance
  • V-Sync – Off

D) Windows Update is then run until all available updates are installed.

E) All programs are installed and then updated.

F) Benchmarks are each run three to eight times, and unless otherwise stated, the results are then averaged.

G) All processors had their energy saving options / C-states enabled.
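For anyone repeating these tests at home, step F can be sketched in a few lines of Python. This is only an illustration of the averaging step, not the tooling used for this review: the `summarize_runs` helper and the sample scores are hypothetical.

```python
from statistics import mean, stdev


def summarize_runs(scores):
    """Average repeated benchmark runs and report how much they varied."""
    # The methodology calls for three to eight runs per benchmark.
    if not 3 <= len(scores) <= 8:
        raise ValueError("expected between three and eight runs")
    avg = mean(scores)
    spread = stdev(scores)  # sample standard deviation
    return {
        "runs": len(scores),
        "mean": avg,
        "stdev": spread,
        "cv_percent": 100 * spread / avg,  # run-to-run variation in percent
    }


# Hypothetical scores for illustration only, not results from this review.
print(summarize_runs([7.91, 7.95, 7.89]))
```

Reporting the spread alongside the average makes it easier to judge whether a small gap between two processors is real or just run-to-run noise.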
 

System Benchmarks: AIDA64 / Cinebench r11.5

System Benchmarks


In this section, we will be using a combination of synthetic benchmarks which stress the CPU and system in a number of different domains. Most of these tests are easy to acquire or are completely free to use so anyone reading this article can easily repeat our tests on their own systems.

To vary the results as much as possible, we have chosen a selection of benchmarks which focus upon varied instruction sets (SSE, SSE3, 3DNow!, AVX, etc.) and different internal CPU components like the floating point units and general processing stages.



AIDA64 Extreme Edition


AIDA64 uses a suite of benchmarks to determine general performance and has quickly become one of the de facto standards among end users for component comparisons. While it may include a great many tests, we used it for general CPU testing (CPU ZLib / CPU Hash) and floating point benchmarks (FPU VP8 / FPU SinJulia).


CPU ZLib Benchmark

This integer benchmark measures combined CPU and memory subsystem performance through the public ZLib compression library. The CPU ZLib test uses only the basic x86 instructions but is nonetheless a good indicator of general system performance.
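For a rough feel of what a compression throughput test involves, here is a minimal single-threaded sketch using Python's standard `zlib` module. It is not AIDA64's code: the payload, the compression level and the `zlib_throughput` helper are assumptions chosen for illustration, and AIDA64 itself spreads its work across every available thread.

```python
import os
import time
import zlib


def zlib_throughput(payload, level=6, repeats=5):
    """Return the best observed compression throughput in MB/s."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        zlib.compress(payload, level)
        best = min(best, time.perf_counter() - start)
    return len(payload) / best / 1e6


# Assumed 2 MiB payload mixing repetitive text with random bytes.
payload = (b"0123456789abcdef" * 1024 + os.urandom(16384)) * 64
print(f"{zlib_throughput(payload):.1f} MB/s")
```

Taking the fastest of several passes reduces the influence of background activity, which is the same reason the Windows tweaks listed in the methodology are applied.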

INTEL-HASWELL-41.jpg



CPU Hash Benchmark

This benchmark measures CPU performance using the SHA1 hashing algorithm defined in the Federal Information Processing Standards Publication 180-3. The code behind this benchmark method is written in Assembly. More importantly, it uses the MMX, MMX+/SSE, SSE2, SSSE3 and AVX instruction sets, allowing for increased performance on supporting processors.
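A comparable SHA1 measurement can be approximated with Python's standard `hashlib` module. Treat this as a sketch under stated assumptions rather than a stand-in for AIDA64's hand-written Assembly: the buffer size and the `sha1_throughput` helper are hypothetical, the run is single-threaded, and which instruction sets end up being used depends on the library `hashlib` is built against.

```python
import hashlib
import time


def sha1_throughput(size_bytes=16 * 1024 * 1024, repeats=3):
    """Return the best observed SHA1 hashing throughput in MB/s."""
    data = bytes(size_bytes)  # zero-filled buffer; SHA1 speed does not depend on content
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        hashlib.sha1(data).digest()
        best = min(best, time.perf_counter() - start)
    return size_bytes / best / 1e6


# Sanity check against the standard "abc" test vector before timing anything.
assert hashlib.sha1(b"abc").hexdigest() == "a9993e364706816aba3e25717850c26c9cd0d89d"
print(f"{sha1_throughput():.1f} MB/s")
```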

INTEL-HASWELL-40.jpg

RESULTS: The first results of this review are very much in line with expectations. While the 4770K and 4670K remain close to their predecessors in a basic x86-based test like ZLib, their enhanced AVX instruction sets allow for substantially better performance in the Hash benchmark.



FPU VP8 / SinJulia Benchmarks

AIDA’s FPU VP8 benchmark measures video compression performance using the Google VP8 (WebM) video codec Version 0.9.5 and stresses the floating point unit. The test encodes 1280x720 resolution video frames in 1-pass mode at a bitrate of 8192 kbps with best quality settings. The content of the frames is generated by the FPU Julia fractal module. The code behind this benchmark method utilizes MMX, SSE2 or SSSE3 instruction set extensions.

Meanwhile, SinJulia measures the extended precision (also known as 80-bit) floating-point performance through the computation of a single frame of a modified "Julia" fractal. The code behind this benchmark method is written in Assembly, and utilizes trigonometric and exponential x87 instructions.
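To give an idea of the kind of math involved, the sketch below renders one small frame of a sine-based Julia-style fractal in Python. Several things here are assumptions: the iteration z = c·sin(z), the constant `c`, the bailout value and both function names are illustrative guesses rather than AIDA64's actual formula, and Python floats are 64-bit doubles, so this does not exercise the 80-bit x87 path the real test targets.

```python
import cmath
import math


def escape_time(z, c, max_iter=50, bailout=50.0):
    """Count iterations of z = c * sin(z) before the imaginary part blows up."""
    for i in range(max_iter):
        if abs(z.imag) > bailout:
            return i
        z = c * cmath.sin(z)
    return max_iter


def sin_julia_frame(width=64, height=48, c=complex(1.0, 0.3), max_iter=50):
    """Compute a single frame as a grid of escape times."""
    frame = []
    for py in range(height):
        imag = -2.0 + 4.0 * py / (height - 1)
        row = []
        for px in range(width):
            real = -math.pi + 2.0 * math.pi * px / (width - 1)
            row.append(escape_time(complex(real, imag), c, max_iter))
        frame.append(row)
    return frame


frame = sin_julia_frame()
print(len(frame), "rows x", len(frame[0]), "columns")
```

Every pixel needs a chain of sine and exponential evaluations, which is why this style of workload leans so heavily on the floating point unit.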


INTEL-HASWELL-42.jpg

RESULTS: Here we see some odd results in the VP8 benchmark but once again, they are easily explainable. Both Haswell processors incorporate huge advances to their Floating Point throughput and instruction set processing efficiency, allowing them to surge ahead of all competitors.



CineBench r11.5 64-bit


The latest benchmark from MAXON, Cinebench R11.5 makes use of all your system's processing power to render a photorealistic 3D scene using various algorithms to stress all available processor cores. The test scene contains approximately 2,000 objects containing more than 300,000 total polygons and uses sharp and blurred reflections, area lights and shadows, procedural shaders, antialiasing, and much more. This particular benchmark can measure systems with up to 64 processor threads. The result is given in points (pts). The higher the number, the faster your processor.

INTEL-HASWELL-43.jpg

RESULTS: From a raw processing perspective, the i7 4770K holds a substantial lead over its predecessor while the i5 4670K lines up nearly perfectly with the 3570K. Against AMD processors, the only one that comes remotely close to Intel's newest CPUs is the FX 8350 and even that can't match the i7's performance.
 
System Benchmarks: Civ V / PCMark 7

System Benchmarks (pg.2)



Civilization V: Gods & Kings Unit Benchmark


Civilization V includes a number of benchmarks which run on the CPU, GPU or a combination thereof. The Unit Benchmark simulates thousands of units and actions being generated at the same time, and stresses multi core CPUs, system memory and the GPU. We give the non-rendered score below as it is more pertinent to overall CPU performance within the application.

INTEL-HASWELL-45.jpg

RESULTS: Once again Haswell's architectural improvements are able to shine with both CPUs posting great results, equalling the performance of more expensive Sandy Bridge E CPUs and simply leaving AMD's products in the dust.



PCMark 7


PCMark 7 is the latest iteration of Futuremark’s system benchmark franchise. It generates an overall score based upon system performance with all components being stressed in one way or another. The result is posted as a generalized score. We also give the Computation Suite score as it isolates the CPU and memory within a single test, without the influence of other components.

INTEL-HASWELL-46.jpg

INTEL-HASWELL-47.jpg

RESULTS: While the CPUs play a large part in PCMark's performance metric, it looks like the Z87 platform's enhancements are being brought to the forefront in the total score. With that being said, the CPU-centric computational benchmark once again shows both processors putting in an extremely strong showing.
 
System Benchmarks: 3DMark (CPU) / WPrime

System Benchmarks (pg.3)



3DMark06 CPU


While 3DMark06 may be a slightly older synthetic benchmark, its CPU test still allows for multi threaded performance evaluations within a gaming environment. It effectively removes the GPU from the equation, generating a CPU-centric score.

INTEL-HASWELL-59.jpg



WPrime


wPrime is a leading multithreaded benchmark for x86 processors that tests processor performance by calculating square roots with a recursive call of Newton's method for estimating functions. It uses f(x) = x² - k, where k is the number whose square root is being taken, and starts from an estimate of k/2, repeating until Sgn(f(x)/f'(x)) no longer equals that of the previous iteration. It then calls the estimation method iteratively a set number of times to increase the accuracy of the result, and confirms that n(k)² = k to ensure the calculation was correct. This is repeated for all numbers from 1 to the requested maximum, making for a highly multi-threaded workload. Below are the scores for the 32M and 1024M benchmarks.
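The procedure described above can be sketched in Python as follows. This is a reading of the description rather than wPrime's actual source: it uses a plain loop instead of recursion, runs on a single thread, and the `newton_sqrt` name, the `extra_iterations` count, the iteration cap and the verification tolerance are all assumptions.

```python
import math


def newton_sqrt(k, extra_iterations=4, max_iterations=200):
    """Estimate sqrt(k) with Newton's method on f(x) = x^2 - k."""
    if k <= 0:
        raise ValueError("k must be positive")
    x = k / 2.0  # initial estimate
    prev_sign = 0.0
    for _ in range(max_iterations):
        step = (x * x - k) / (2.0 * x)  # f(x) / f'(x)
        sign = math.copysign(1.0, step)
        # Stop once the sign of f(x)/f'(x) differs from the previous iteration.
        if step == 0.0 or (prev_sign and sign != prev_sign):
            break
        next_x = x - step
        if next_x == x:  # no further progress possible at double precision
            break
        x = next_x
        prev_sign = sign
    # A fixed number of additional refinement passes to tighten the estimate.
    for _ in range(extra_iterations):
        x -= (x * x - k) / (2.0 * x)
    return x


# Repeat for every number from 1 to the requested maximum, verifying each result.
maximum = 10_000
for k in range(1, maximum + 1):
    root = newton_sqrt(k)
    assert math.isclose(root * root, k, rel_tol=1e-12)
print("verified", maximum, "square roots")
```

Since each value of k is independent of the others, the outer loop can be split across as many threads as a processor offers, which is what makes this such a scalable workload.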

INTEL-HASWELL-49.jpg

INTEL-HASWELL-50.jpg

RESULTS: The 3DMark06 and WPrime scores are headlined by more of the same dominance from the Haswell siblings but there is a small wrench thrown into the works. For whatever reason, the i5 3570K beat the i5 4670K in WPrime's 32M test over and over again (something which shouldn't happen) yet the situation was reversed in the 1024M benchmark. We'll chalk this up to a ghost in the system since it wasn't repeated with the i7 4770K vs 3770K.
 