ASUS GeForce GTX 465 1GB Review

Author: Michael "SKYMTL" Hoenig
Date: May 30, 2010
Product Name: ASUS GeForce GTX 465 1GB Voltage Tweak Edition
 

In-Depth GF100 Architecture Analysis (Core Layout)


The first stop on the whirlwind tour of the GeForce GF100 is an in-depth look at what makes the GPU tick as we peer into the core layout and how NVIDIA has designed it.

Many people incorrectly believed that the Fermi architecture was designed primarily for GPU computing applications, with very little thought given to its graphics processing capabilities. This couldn’t be further from the truth: the computing and graphics capabilities were developed in parallel, and the result is a brand new architecture tailor-made to live in a DX11 environment. Basically, NVIDIA needed to apply what they had learned from past generations (G80 and GT200) to the GF100.


What you are looking at above is the heart and soul of any GF100 card: the core layout. While we will go into each section in a little more detail below, the overall view shows the main functions broken up into four distinct groups called Graphics Processing Clusters (GPCs), which are in turn broken down into individual Streaming Multiprocessors (SMs), raster engines and so on. To keep matters simple, think of it this way: in its highest-end form, a GF100 will have four GPCs, each equipped with four SMs, for a total of 16 Streaming Multiprocessors. Within each of these SMs are 32 CUDA Cores (the shader processors of past generations) for a grand total of 512 cores. However, the current GTX 480 and GTX 470 cards make do with slightly fewer cores (480 and 448 respectively) while the GTX 465 cuts things down even more.
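The core counts above are simple multiplication, and a quick sketch makes the relationship between cores and SMs explicit. It uses only the figures quoted in this section; the function names are our own, and the assumption that cut-down cards lose whole SMs at a time is inferred from the fact that 480 and 448 both divide evenly by 32.

```python
# Core-count arithmetic for GF100, using only figures quoted in the text.
CORES_PER_SM = 32   # CUDA cores per Streaming Multiprocessor
SMS_PER_GPC = 4     # SMs per Graphics Processing Cluster
GPCS = 4            # GPCs on a fully enabled GF100


def full_die_cores() -> int:
    """CUDA cores on a fully enabled GF100."""
    return GPCS * SMS_PER_GPC * CORES_PER_SM


def enabled_sms(cuda_cores: int) -> int:
    """SMs implied by a core count, assuming whole SMs are disabled."""
    if cuda_cores % CORES_PER_SM != 0:
        raise ValueError("core count is not a multiple of 32")
    return cuda_cores // CORES_PER_SM


if __name__ == "__main__":
    print("Full GF100:", full_die_cores(), "cores")
    for name, cores in (("GTX 480", 480), ("GTX 470", 448)):
        print(name, "->", enabled_sms(cores), "of 16 SMs enabled")
```

Under that assumption, the GTX 480 works out to 15 active SMs and the GTX 470 to 14.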

On the periphery of the die is the GigaThread Engine along with the memory controllers. The GigaThread Engine performs the somewhat thankless duty of reading the CPU’s commands over the host interface and then fetching data from the system’s main memory bank. The data is then copied over onto the framebuffer of the graphics card itself before being passed along to the designated engine within the core. Meanwhile, in its fullest form the GF100 incorporates six 64-bit GDDR5 memory controllers for a combined 384-bit interface. The massive amount of bandwidth created by a 384-bit GDDR5 memory interface will provide extremely fast access to the card’s onboard memory and should alleviate the memory bottlenecks seen in past generations.
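For those who like to see the math, the bus width and the theoretical bandwidth it implies can be sketched as follows. The controller count and width come from the paragraph above; the 4.0 Gbps per-pin data rate is purely a placeholder for illustration and is not the specification of any card discussed here.

```python
# Memory interface arithmetic. The data rate passed in below is a
# hypothetical placeholder, not a figure taken from this review.
CONTROLLER_WIDTH_BITS = 64  # each GDDR5 memory controller is 64 bits wide


def bus_width_bits(controllers: int) -> int:
    """Total memory bus width for a given number of enabled controllers."""
    return controllers * CONTROLLER_WIDTH_BITS


def peak_bandwidth_gb_s(bus_bits: int, data_rate_gbps: float) -> float:
    """Theoretical peak bandwidth in GB/s.

    bus_bits / 8 gives bytes moved per transfer; data_rate_gbps is the
    effective per-pin rate in gigabits per second.
    """
    return bus_bits / 8 * data_rate_gbps


if __name__ == "__main__":
    width = bus_width_bits(6)
    print("Bus width:", width, "bits")
    print("Peak at an assumed 4.0 Gbps:", peak_bandwidth_gb_s(width, 4.0), "GB/s")
```

Disabling a controller narrows the bus by 64 bits at a time, so bandwidth on cut-down cards scales with both the number of active controllers and the memory speed.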


Each Streaming Multiprocessor holds 32 CUDA cores along with 16 load / store units, which allow memory addresses to be calculated for 16 threads per clock. Above these we see the Warp Schedulers along with their associated dispatch units, which issue groups of 32 concurrent threads (called warps) to the cores.
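To put the warp and load / store figures in perspective, here is a deliberately simplified sketch. It assumes work is always grouped into full 32-thread warps and that the 16 load / store units each service one thread per clock; real scheduling is far more involved, and the function names are our own.

```python
import math

WARP_SIZE = 32          # threads per warp
LOAD_STORE_UNITS = 16   # load / store units per SM


def warps_needed(threads: int) -> int:
    """Warps required to cover a thread count (the last may be partly empty)."""
    return math.ceil(threads / WARP_SIZE)


def load_store_clocks_per_warp() -> int:
    """Clocks for one full warp to pass through the load / store units,
    assuming one thread per unit per clock (a simplification)."""
    return math.ceil(WARP_SIZE / LOAD_STORE_UNITS)


if __name__ == "__main__":
    print("Warps for 100 threads:", warps_needed(100))
    print("Load / store clocks per warp:", load_store_clocks_per_warp())
```

In this simple model, a full warp needs two passes through the load / store units since there are half as many units as threads in a warp.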

Finally, closer to the bottom of the SM are the L1 cache, the Polymorph Engine and the four texture units. In total, the maximum number of texture units in this architecture is 64, which should come as a surprise considering the older GT200 architecture supported up to 80 TMUs. However, NVIDIA has implemented a number of improvements to the way the architecture handles textures, which we will go into in a later section. Suffice to say that the texture units are now integrated into each SM, rather than having multiple SMs address a common texture cache.


Independent of the SM structure are six dedicated partitions of eight ROP units for a total of 48 ROPs, as opposed to the 32 units of the GT200 architecture. Also different from the GT200 layout is that instead of backing directly onto the memory bus, the ROPs interface with the shared L2 cache, which provides a quick interface for data storage.
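The texture unit and ROP totals can be tallied the same way, which also shows how each compares to the GT200. All figures are the ones quoted in this section; the helper names are our own, and raw unit counts say nothing about per-unit efficiency.

```python
# Fixed-function unit totals for a full GF100 versus GT200,
# using only the counts quoted in the text.
SMS = 16
TEXTURE_UNITS_PER_SM = 4
ROP_PARTITIONS = 6
ROPS_PER_PARTITION = 8

GT200_TMUS = 80
GT200_ROPS = 32


def gf100_tmus() -> int:
    """Texture units on a fully enabled GF100."""
    return SMS * TEXTURE_UNITS_PER_SM


def gf100_rops() -> int:
    """ROP units on a fully enabled GF100."""
    return ROP_PARTITIONS * ROPS_PER_PARTITION


def percent_change(new: int, old: int) -> float:
    """Change from old to new, as a percentage of old."""
    return (new - old) * 100 / old


if __name__ == "__main__":
    print("TMUs:", gf100_tmus(), "vs", GT200_TMUS,
          "(", percent_change(gf100_tmus(), GT200_TMUS), "% )")
    print("ROPs:", gf100_rops(), "vs", GT200_ROPS,
          "(", percent_change(gf100_rops(), GT200_ROPS), "% )")
```

On paper that is 20% fewer texture units but 50% more ROPs than the GT200.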
 
 
 
