GF114: GF104 on Steroids
To create the GTX 560 Ti, NVIDIA took basically the same route as they did with the rest of the 500-series: they refined an existing architecture. In this case, it was the GF104 core that underwent a metamorphosis, but the overall changes weren’t as drastic as those we saw in the transition from GF100 to GF110.
GF104 already incorporated full-speed FP16 texture filtering, additional optimizations to its z-cull efficiency and more configurable shared memory and L1 cache. All of these were “missing” from GF100 but added in the GTX 580 and GTX 570. This meant that NVIDIA’s job of converting the GF104 into a new “refresh” line of GF114-based products was relatively simple and straightforward.
Even though the original GF104 already included many of the architectural features that eventually made their way into the current generation of high-end GPUs, NVIDIA still refined certain aspects to create the GF114. Much like with GF110, one of NVIDIA’s main focuses was to increase performance per watt. So once again the transistor layout was rearranged so that more of the faster, higher leakage transistors were placed on the critical rendering paths instead of being used for periphery tasks. Meanwhile, the slower, low leakage transistors were placed where speed wasn’t a primary concern.
Strategically distributing the transistors in this way still allows for a small speed-up in overall rendering performance but since the GF104 was already “fine tuned” in this way, the effect won’t be as apparent as with the GF110.
The differences between the GF110 and GF114 layouts start with the Streaming Multiprocessor which houses the CUDA cores, Texture Units, Polymorph Engine, Warp Schedulers, Load / Store units, SFUs and their associated cache hierarchies. Let’s start at the top and make our way down.
Instead of each Warp Scheduler feeding a single Dispatch Unit, the GF114 makes use of a 2:1 ratio between the dispatch units and the schedulers, while the number of Special Function Units per SM has doubled. However, the main changes to the SM come with the number of CUDA cores and texture units each one houses. Instead of the usual 32 cores per SM, the GF114 uses a structure which allows for 48 cores along with 16 load / store units and 8 Special Function Units. The real difference lies with the texture units: the GF110 cards have four per SM, while NVIDIA equipped the GF114 with eight TMUs per SM.
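To make the per-SM comparison concrete, here is a minimal Python sketch that tabulates the figures quoted above. The class and field names are purely illustrative, and the GF110 SFU count of four is inferred from the statement that the GF114 doubles it to eight.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SMConfig:
    """Per-SM resources as quoted in the text (names are illustrative)."""
    cuda_cores: int
    texture_units: int
    sfus: int

    def cores_per_texture_unit(self) -> float:
        # How many CUDA cores share each texture unit within one SM.
        return self.cuda_cores / self.texture_units


# GF110's SFU count is an inference: GF114 has 8 and is said to have doubled it.
GF110_SM = SMConfig(cuda_cores=32, texture_units=4, sfus=4)
GF114_SM = SMConfig(cuda_cores=48, texture_units=8, sfus=8)

for name, sm in (("GF110", GF110_SM), ("GF114", GF114_SM)):
    print(f"{name}: {sm.cuda_cores} cores, {sm.texture_units} TMUs, "
          f"{sm.cores_per_texture_unit():.0f} cores per TMU")
```

On these numbers, each GF114 texture unit serves six CUDA cores versus eight on the GF110, so the GF114 SM carries proportionally more texturing hardware.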
Most of you will likely remember that NVIDIA’s move to the GF110 architecture allowed a core with every SM enabled to be introduced without a significant increase in power consumption. The refinements to GF104 have essentially led the development of its successor down the same path. The GTX 560 Ti, the highest-end SKU containing a GF114 core, will have a total of 384 CUDA cores and 64 texture units, up from the 336 cores and 56 TMUs of the GTX 460 1GB. In addition, the TDP savings from rationalizing the transistor layout allow the core to operate at a higher clock frequency as well.
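The totals above follow directly from the per-SM figures given earlier (48 cores and 8 texture units per SM). As a rough check, the sketch below derives the number of enabled SMs on each card from its core count; the SM counts are a derivation from the quoted numbers rather than something stated outright, and the function and variable names are made up for illustration.

```python
CORES_PER_SM = 48  # per-SM CUDA core count quoted for GF104/GF114
TMUS_PER_SM = 8    # per-SM texture unit count quoted for GF104/GF114


def enabled_sms(total_cores: int) -> int:
    """Derive the enabled SM count from a card's total CUDA cores."""
    sms, remainder = divmod(total_cores, CORES_PER_SM)
    if remainder:
        raise ValueError("core count is not a whole number of SMs")
    return sms


# (total CUDA cores, total texture units) as quoted in the text
cards = {"GTX 560 Ti": (384, 64), "GTX 460 1GB": (336, 56)}

for name, (cores, tmus) in cards.items():
    sms = enabled_sms(cores)
    # The quoted TMU total should match the derived SM count.
    assert sms * TMUS_PER_SM == tmus
    print(f"{name}: {sms} SMs, {cores} cores, {tmus} TMUs")

core_gain = cards["GTX 560 Ti"][0] / cards["GTX 460 1GB"][0] - 1
print(f"Core count increase: {core_gain:.1%}")
```

By this arithmetic the GTX 560 Ti runs eight SMs against the GTX 460 1GB's seven, which works out to roughly 14% more CUDA cores and texture units before any clock speed increase is counted.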
While the GF114 may not have as many architectural changes as its bigger brothers, it is quite obvious that NVIDIA has focused upon improving the design in several key areas. Naturally, they’re hoping this will be enough to compete head to head against some tough competition.