Haswell’s GT1, GT2 & GT3 Graphics Engines
With the computing market gradually shifting to more 3D content and programs beginning to take advantage of GPU compute, Intel knew it was time to step up their graphics game. Sandy Bridge and Ivy Bridge were a good start since they incorporated advanced processing techniques with an architecture that could play basic games while working quite well in OpenCL environments.
Haswell continues this tradition but, in some cases, it takes Intel’s graphics power to a whole new level in an entirely unexpected way. Instead of pairing the higher-end K-series and mobile MX-series processors with better graphics engines, Intel will give those engines to the mid-tier and some low voltage SKUs. This is a completely understandable move considering enthusiast-grade CPUs are typically used alongside discrete GPUs.
Unlike in previous generations, Haswell’s Processor Graphics will be sectioned off into four distinct categories. At the lowest tier will be the Intel HD Graphics (GT1) which is laid out identically to Sandy Bridge’s architecture, but comes equipped with 10 Execution Units (EUs) / shaders (four more than on Ivy Bridge) plus a single Texture Unit. It will be used for entry level processors.
The HD4600, HD4400 and HD4200 (or GT2) processor graphics will be the most prevalent across a wide variety of CPUs and incorporate 20 EUs alongside two Texture Processors. This represents a slight upgrade over Ivy Bridge but the performance of each Execution Unit has been drastically improved, leading to some significant performance advantages for Haswell.
The real stars of this show will be the Iris 5100 and Iris Pro 5200 graphics processors. These provide a truly next generation approach for Intel by doubling up on the specifications of the GT2 HD4600, offering 40 EUs and a quartet of texture units. Clock speeds will also be a bit higher (though this may change if they're used in a low voltage system) and the Iris Pro 5200 adds up to 128MB of embedded fast access DRAM for improved performance. In some ways, Intel is betting big on Iris in an effort to blunt AMD’s aggressive moves into the mobile space.
All of these new and improved Processor Graphics Engines also incorporate support for DX11.1, OpenCL 1.2 and OpenGL 4.0, potentially opening up a new dimension of performance improvements.
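To put the tiers described above side by side, here is a minimal Python sketch that records the EU and Texture Unit counts quoted in this article and compares them. The class and function names are our own, and "GT3e" is used here only as shorthand for the Iris Pro 5200 variant with embedded DRAM; none of this is an Intel tool or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GraphicsTier:
    """One Haswell Processor Graphics tier, using the figures quoted in the article."""
    name: str
    execution_units: int
    texture_units: int
    edram_mb: int = 0  # only Iris Pro 5200 carries embedded DRAM (up to 128MB)


# Keys are illustrative labels; "GT3e" is shorthand for the eDRAM-equipped part.
TIERS = {
    "GT1": GraphicsTier("HD Graphics", 10, 1),
    "GT2": GraphicsTier("HD4200 / HD4400 / HD4600", 20, 2),
    "GT3": GraphicsTier("Iris 5100", 40, 4),
    "GT3e": GraphicsTier("Iris Pro 5200", 40, 4, edram_mb=128),
}


def eu_ratio(higher: str, lower: str) -> float:
    """Return how many times more EUs the `higher` tier has than the `lower` tier."""
    return TIERS[higher].execution_units / TIERS[lower].execution_units


if __name__ == "__main__":
    for label, tier in TIERS.items():
        print(f"{label}: {tier.name}, {tier.execution_units} EUs, "
              f"{tier.texture_units} texture unit(s), {tier.edram_mb}MB eDRAM")
    print("GT3 vs GT2 EU ratio:", eu_ratio("GT3", "GT2"))
```

The sketch only compares unit counts. It says nothing about clock speeds or per-EU throughput, which also differ between tiers and generations.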
From an architectural perspective, not much has changed between the Processor Graphics of Sandy Bridge and the layout we now have within Haswell. All of the primary processing stages have remained identical and the communication pathways follow a very similar direction.
The front-end houses the Geometry Engines (which include the shaders and other fixed function units) along with a dedicated Command Streamer and setup section, while the so-called sub-Slice houses the Shaders / EUs, instruction caches, samplers and data interconnects. Finally, the Slice Common includes items like the Rasterizer, L3 cache and Pixel Instruction module which are shared between the sub-Slices. This layout can now be scaled upwards to create additional graphics processor groups.
While similarities between Sandy Bridge and Haswell abound, Intel has incorporated several new elements into their latest design to aid with graphics workloads. First and foremost, the Command Streamer now has a dedicated resource handler which offloads processing calls that would normally be processed by the driver. In order to meet the needs of Iris and Iris Pro, the performance of front end fixed function stages has been doubled, essentially giving the architecture room to grow. Finally, the Texture Sampler within each sub-Slice can now handle four times more throughput. This has all been done in an effort to prepare Haswell’s Processor Graphics for its new scaling capabilities.
As we already mentioned, the Iris 5000-series / GT3 uses substantially more shaders and Texture Units than its predecessors but its underlying architecture remains the same as that of the lower-end parts. Intel simply added more Slices to their HD4000 parts to create a type of Frankenstein graphics processor. This was accomplished by literally grafting more Slices onto the GPU’s back-end fixed function units.
In layman’s terms a “Slice” consists of the sub-Slice and Slice Common and is a rough analog to AMD’s SIMD Array or NVIDIA’s SMX. Basically, it is a grouping of graphics compute elements into a cohesive processing stage which can act independently if need be. As AMD and NVIDIA do when creating new SKUs, the number of Slices can be scaled up and down based on the situation in which the processor finds itself.
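The Slice scaling described above can be modelled in a few lines of Python. This is a sketch under stated assumptions: we assume each sub-Slice holds 10 EUs and one texture sampler, a split chosen because it reproduces the GT1, GT2 and GT3 totals quoted in this article. The class name and the per-tier Slice counts are illustrative rather than taken from Intel documentation.

```python
from dataclasses import dataclass

# Assumptions picked so the totals match the article's 10 / 20 / 40 EU figures.
EUS_PER_SUB_SLICE = 10
SAMPLERS_PER_SUB_SLICE = 1


@dataclass(frozen=True)
class GraphicsConfig:
    """A graphics engine expressed as Slices, each made of sub-Slices plus a Slice Common."""
    slices: int
    sub_slices_per_slice: int

    @property
    def sub_slices(self) -> int:
        return self.slices * self.sub_slices_per_slice

    @property
    def execution_units(self) -> int:
        return self.sub_slices * EUS_PER_SUB_SLICE

    @property
    def texture_units(self) -> int:
        return self.sub_slices * SAMPLERS_PER_SUB_SLICE


# Hypothetical layouts: scaling up means adding sub-Slices, then whole Slices.
GT1 = GraphicsConfig(slices=1, sub_slices_per_slice=1)
GT2 = GraphicsConfig(slices=1, sub_slices_per_slice=2)
GT3 = GraphicsConfig(slices=2, sub_slices_per_slice=2)

if __name__ == "__main__":
    for label, cfg in (("GT1", GT1), ("GT2", GT2), ("GT3", GT3)):
        print(f"{label}: {cfg.slices} Slice(s), {cfg.execution_units} EUs, "
              f"{cfg.texture_units} texture unit(s)")
```

Under these assumptions, GT3 is just the GT2 layout with a second Slice attached, which is the "grafting" approach the article describes.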
Perhaps the most important addition to this graphics architecture is the way Haswell handles ring caching between its various processing stages. Instead of having Processor Graphics tied at the hip to the CPU’s needs, Intel has decoupled the caching hierarchy, allowing the graphics engine to have dedicated pipelines for independent access to the Last Level Cache. This should help improve scheduling, processing speed and data access. This so-called “ring” also helps the flow of data between the other processing stages and can be overclocked independently, to a speed equal to or lower than that of the cores.
In every conceivable way, Haswell’s new Processor Graphics Engines are an improvement over the previous generation’s often tepid offering. They include more shaders and higher clock speeds but the crowning achievement is their ability to scale upwards. While it looks like the revised HD4000 parts will post moderate gains, Iris and Iris Pro could help usher in a whole new era for Intel.