NVIDIA GeForce GTX 480 Review

Author: Michael "SKYMTL" Hoenig
Date: March 25, 2010
Product Name: NVIDIA GeForce GTX 480

In-Depth GF100 Architecture Analysis (Core Layout)

The first stop on this whirlwind tour of the GeForce GF100 is an in-depth look at what makes the GPU tick: the core layout, and how NVIDIA has designed it to be the fastest graphics core on the planet.

Many people incorrectly believed that the Fermi architecture was primarily designed for GPU computing applications and that very little thought was given to its graphics processing capabilities. This couldn't be further from the truth, since the computing and graphics capabilities were determined in parallel and the result is a brand new architecture tailor-made to live in a DX11 environment. Basically, NVIDIA needed to apply what they had learned from past generations (G80 & GT200) to the GF100.

What you are looking at above is the heart and soul of any GF100 card: the core layout. While we will go into each section in a little more detail below, from the overall view we can see that the main functions are broken up into four distinct groups called Graphics Processing Clusters or GPCs, which are then broken down again into individual Streaming Multiprocessors (SMs), raster engines and so on. To make matters simple, think of it this way: in its highest-end form, a GF100 will have four GPCs, each of which is equipped with four SMs, for a total of 16 SMs. Within each of these SMs are 32 CUDA Cores (the shader processors of past generations) for a total of 512 cores. However, the current GTX 480 and GTX 470 cards make do with slightly fewer cores (480 and 448 respectively), while we are told there will be a 512 core version in the near future.
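The core counts above are simple multiplication, and a short Python sketch makes the relationship explicit. The per-unit figures come straight from the paragraph; the function names are purely illustrative, and the number of active SMs on each card is inferred by dividing the quoted core count by 32 rather than taken from an NVIDIA spec sheet.

```python
# Figures quoted in the text: 4 GPCs, 4 SMs per GPC, 32 CUDA cores per SM.
GPCS = 4
SMS_PER_GPC = 4
CORES_PER_SM = 32


def total_sms() -> int:
    """Number of SMs on a fully enabled GF100 die."""
    return GPCS * SMS_PER_GPC


def total_cores(active_sms: int) -> int:
    """CUDA cores available when `active_sms` SMs are enabled."""
    if not 0 <= active_sms <= total_sms():
        raise ValueError("active_sms must be between 0 and the full SM count")
    return active_sms * CORES_PER_SM


def active_sms_for(cores: int) -> int:
    """Infer how many SMs a card has enabled from its core count."""
    sms, remainder = divmod(cores, CORES_PER_SM)
    if remainder:
        raise ValueError("core count is not a whole number of SMs")
    return sms


if __name__ == "__main__":
    print("Full die:", total_cores(total_sms()), "cores")
    for card, cores in (("GTX 480", 480), ("GTX 470", 448)):
        print(card, "->", active_sms_for(cores), "SMs enabled")
```

By this arithmetic the GTX 480 runs with 15 of the 16 SMs enabled and the GTX 470 with 14.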

On the periphery of the die is the GigaThread Engine along with the memory controllers. The GigaThread Engine performs the somewhat thankless duty of reading the CPU's commands over the host interface and then fetching data from the system's main memory. The data is then copied onto the framebuffer of the graphics card itself before being passed along to the designated engine within the core. Meanwhile, the GF100 incorporates six 64-bit GDDR5 memory controllers for a combined 384-bit interface. The massive amount of bandwidth created by a 384-bit GDDR5 memory interface should provide extremely fast access to the card's onboard memory and help eliminate the bottlenecks seen in past generations.
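To put a rough number on that bandwidth, the sketch below multiplies the bus width by an effective data rate. The controller count and width are from the text; the 3696 MT/s rate in the usage example is not from this page. It is the commonly quoted effective GDDR5 rate for the GTX 480 and should be treated as an assumption, as should the function names.

```python
# From the text: six 64-bit GDDR5 memory controllers.
CONTROLLERS = 6
BITS_PER_CONTROLLER = 64


def bus_width_bits() -> int:
    """Combined memory interface width in bits."""
    return CONTROLLERS * BITS_PER_CONTROLLER


def peak_bandwidth_gb_s(data_rate_mt_s: float) -> float:
    """Theoretical peak bandwidth in GB/s (1 GB = 1e9 bytes).

    data_rate_mt_s is the effective transfer rate per pin in
    megatransfers per second; it is an input, not a value from the article.
    """
    bytes_per_transfer = bus_width_bits() / 8
    return bytes_per_transfer * data_rate_mt_s * 1e6 / 1e9


if __name__ == "__main__":
    assumed_rate = 3696.0  # assumption: commonly quoted GTX 480 effective rate
    print(bus_width_bits(), "bit bus")
    print(round(peak_bandwidth_gb_s(assumed_rate), 1), "GB/s peak")
```

With that assumed rate the result lands at roughly 177 GB/s, a theoretical peak that real workloads will not sustain.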

Each Streaming Multiprocessor holds 32 CUDA cores along with 16 load / store units, which allow memory operations for 16 threads to be handled per clock. Above these we see the Warp Schedulers along with their associated dispatch units, which issue groups of 32 concurrent threads (called warps) to the cores.
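One consequence of those two numbers is that a full warp cannot have all of its load / store addresses handled in a single clock. The back-of-the-envelope model below only divides threads by units; it ignores memory latency, caching and how the schedulers actually interleave warps, and the function name is made up for illustration.

```python
import math

# From the text: a warp is 32 threads, and each SM has 16 load / store units.
THREADS_PER_WARP = 32
LOAD_STORE_UNITS_PER_SM = 16


def clocks_for_warp(units: int, threads: int = THREADS_PER_WARP) -> int:
    """Minimum clocks for `units` parallel units to each serve one thread
    per clock until every thread in the warp has been served."""
    if units <= 0:
        raise ValueError("units must be positive")
    return math.ceil(threads / units)


if __name__ == "__main__":
    print("Clocks per warp on the load / store units:",
          clocks_for_warp(LOAD_STORE_UNITS_PER_SM))
```

Under this simplified view, a warp's memory instruction occupies the load / store units for two clocks.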

Finally, closer to the bottom of the SM is the L1 / L2 cache, Polymorph Engine and the four texture units. In total, the maximum number of texture units in this architecture is 64, which should come as a surprise considering the outgoing GT200 architecture supported up to 80 TMUs. However, NVIDIA has implemented a number of improvements to the way the architecture handles textures, which we will go into in a later section. Suffice it to say that the texture units are now integrated into the SM, rather than having multiple SMs addressing a common texture cache.

Independent of the SM structure are six dedicated partitions of eight ROP units for a total of 48 ROPs, as opposed to the 32 units of the GT200 architecture. Also different from the GT200 layout is that instead of backing directly onto the memory bus, the ROPs interface with the shared L2 cache, which provides a quick interface for data storage.
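The texture unit and ROP totals from the last two paragraphs can be tallied the same way, along with the change relative to GT200. Every figure here is one quoted in the text; the helper names are illustrative only, and unit counts alone say nothing about per-unit throughput.

```python
# GF100 figures quoted in the text.
SMS = 16
TEXTURE_UNITS_PER_SM = 4
ROP_PARTITIONS = 6
ROPS_PER_PARTITION = 8

# GT200 figures quoted in the text for comparison.
GT200_TEXTURE_UNITS = 80
GT200_ROPS = 32


def gf100_texture_units() -> int:
    return SMS * TEXTURE_UNITS_PER_SM


def gf100_rops() -> int:
    return ROP_PARTITIONS * ROPS_PER_PARTITION


def percent_change(new: int, old: int) -> float:
    """Percentage change from `old` to `new` (negative means fewer units)."""
    return 100 * (new - old) / old


if __name__ == "__main__":
    tmus, rops = gf100_texture_units(), gf100_rops()
    print("Texture units:", tmus, f"({percent_change(tmus, GT200_TEXTURE_UNITS):+.0f}% vs GT200)")
    print("ROPs:", rops, f"({percent_change(rops, GT200_ROPS):+.0f}% vs GT200)")
```

On raw counts, GF100 gives up a fifth of GT200's texture units while gaining half again as many ROPs.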
