Quantcast
 


NVIDIA GeForce GTX 480 Review

Author: Michael "SKYMTL" Hoenig
Date: March 25, 2010
Product Name: Michael "SKYMTL" Hoenig
 
Share |

In-Depth GF 100 Architecture Analysis (Core Layout)


The first stop on this whirlwind tour of the GeForce GF100 is an in-depth look at what makes the GPU tick as we peer into the core layout and how NVIDIA has designed this to be the fastest graphics core on the planet.

Many people incorrectly believed that the Fermi architecture was primarily designed for GPU computing applications and very little thought was given to the graphics processing capabilities. This couldnít be further from the truth since the computing and graphics capabilities were determined in parallel and the result is a brand new architecture tailor made to live in a DX11 environment. Basically, NVIDIA needed to apply what they had learned from past generations (G80 & GT200) to the GF100.


What you are looking at above is the heart and soul of any GF100 card: the core layout. While we will go into each section in a little more detail below, from the overall view we can see that the main functions are broken up into four distinct groups called Graphics Processing Clusters or GPCs which are then broken down again into individual Streaming Multiprocessors (SMs), raster engines and so on. To make matters simple, think of it way: in its highest-end form, a GF100 will have four GPCs, each of which is equipped with four SMs for a total of 16 SMs broken up into groups of four. Within each of these SMs are 32 CUDA Cores (or shader processors from past generations) for a total of 512 cores in total. However, the current GTX 480 and GTX 470 cards make do with slightly less cores (480 and 448 respectively) while we are told there will be a 512 core version in the near future.

On the periphery of the die is the GigaThread Engine along with the memory controllers. The GigaThread Engine performs the somewhat thankless duty of reading the CPUís commands over the host interface and then fetching data from the systemís main memory bank. The data is then copied over onto the framebuffer of the graphics card itself before being passed along to the designated engine within the core. Meanwhile, the GF100 incorporates a total of six 64-bit GDDR5 memory controllers for a total of 384-bits. The massive amount of bandwidth created by a 384-bit GDDR5 memory interface will provide extremely fast access to the system memory and eliminate any bottlenecks seen in past generations.


Each Streaming Multiprocessor holds 32 CUDA cores along with 16 load / store units which allows for a total of 16 threads per clock to be processed. Above these we see Warp Schedulers along with the associated dispatch units which process 32 concurrent threads (called Warps) to the cores.

Finally, closer to the bottom of the SM is the L1 / L2 cache, Polymorph Engine and the four texture units. In total, the maximum number of texture units in this architecture is 64 which should come as a surprise considering the outgoing GT200 architecture supported up to 80 TMUs. However, NVIDIA has implemented a number of improvements with the way the architecture handles textures which we will go into in a later section. Suffice to say that the texture units are now integrated into the SP without having multiple SPs addressing a common texture cache.


Independent of the SM structure is six dedicated partitions of eight ROP units for a total of 48 ROPs as opposed to the 32 units from the GT200 architecture. Also different from the GT200 layout is that instead of backing up directly into the memory bus, the ROPs interface with the shared L2 cache which provides a quick interface for data storage.
 
 
 

Latest Reviews in Video Cards
August 26, 2015
Its small, powerful and a bit costly; AMD's R9 Nano is almost here. In this quick preview we take a look at its impressive specs, potential performance and cutting edge design....
August 24, 2015
After almost two years on the market, we have decided to re-benchmark the performance of two GPUs: the R9 290X and GTX 780 Ti. Both have been at the center of controversy but where do they stand now?...
August 19, 2015
Priced at just $159, NVIDIA's new GTX 950 looks like a great deal but with AMD's R7 370 priced lower and the GTX 960 just $40 more, can its performance meet expectations?...