
NVIDIA’s GeForce GF100 Under the Microscope


SKYMTL, HardwareCanuck Review Editor
Using the Compute Architecture for Gaming


When NVIDIA made CUDA available back at the beginning of 2007, the GPU computing sector was still very much in its infancy. Now, nearly three years later, the trend has caught on and many companies are looking to GPUs as a faster, more efficient way to process many of their computational tasks. This technology has gradually made its way into gaming as well, and a GPU can now be used to calculate anything from character physics to the way NPCs interact with their environment.
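To give a feel for what this kind of offloaded work looks like, here is a minimal CUDA sketch of an embarrassingly parallel task a game engine might hand to the GPU: advancing a large batch of particles by one physics timestep. Everything here (kernel name, particle count, timestep) is our own illustrative assumption, not code from NVIDIA or any shipping game.

```cuda
// Illustrative only: advance a million particles by one Euler step.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void integrate(float3 *pos, float3 *vel, float dt, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;

    vel[i].y -= 9.81f * dt;        // apply gravity to the velocity
    pos[i].x += vel[i].x * dt;     // explicit Euler integration
    pos[i].y += vel[i].y * dt;
    pos[i].z += vel[i].z * dt;
}

int main()
{
    const int n = 1 << 20;         // one million particles (made-up figure)
    float3 *pos, *vel;
    cudaMalloc(&pos, n * sizeof(float3));
    cudaMalloc(&vel, n * sizeof(float3));
    cudaMemset(pos, 0, n * sizeof(float3));
    cudaMemset(vel, 0, n * sizeof(float3));

    int block = 256;
    int grid = (n + block - 1) / block;
    integrate<<<grid, block>>>(pos, vel, 1.0f / 60.0f, n);  // one 60fps frame
    cudaDeviceSynchronize();

    cudaFree(pos);
    cudaFree(vel);
    printf("integrated %d particles\n", n);
    return 0;
}
```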

GF100-27.jpg

With the dawn of DX11, we have seen the introduction of DirectCompute's compute shaders, which offer a way to implement advanced image processing or smarter AI with a minimum of resources. There has also been a lot of talk about the adoption of OpenCL in the market, with several companies actively developing physics APIs and other programs that will use this open language to harness the massively parallel architecture of modern GPUs. NVIDIA's first goal of increasing the number of applications which use GPUs from a compute perspective was a success. They now have to follow that up with continued support for all GPGPU standards.

Even though there has been a lot of discussion about NVIDIA trumpeting their own proprietary PhysX engine, we have to remember that they also support open programming languages such as OpenCL in addition to DX11's DirectCompute. Indeed, they are actively supporting the Bullet team and helping them debug their OpenCL-based physics.

GF100-28.jpg

DX11's DirectCompute is a topic we could talk about for ages, but for the sake of brevity let's just say that it allows developers to add new features into games without a massive performance impact. Additional image processing such as depth of field (seen above) and custom blurs becomes possible, while hybrid rendering for shadow maps, OIT (Order Independent Transparency) and other techniques can give increased realism to scenes as well.
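To make the image-processing angle concrete, below is a CUDA stand-in (not actual DirectCompute/HLSL) for the simplest building block of these effects: a naive horizontal box blur. A real depth of field pass would vary the blur radius per pixel based on scene depth; the frame size and radius here are made-up values.

```cuda
// A CUDA analogue of the kind of post-process pass a DX11 compute
// shader might run; purely a sketch, not code from any real game.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void boxBlurH(const float *in, float *out, int w, int h, int r)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= w || y >= h) return;

    float sum = 0.0f;
    int count = 0;
    for (int dx = -r; dx <= r; ++dx) {   // average neighbours along the row
        int xx = x + dx;
        if (xx >= 0 && xx < w) { sum += in[y * w + xx]; ++count; }
    }
    out[y * w + x] = sum / count;
}

int main()
{
    const int w = 1920, h = 1080;        // illustrative frame size
    float *in, *out;
    cudaMalloc(&in,  w * h * sizeof(float));
    cudaMalloc(&out, w * h * sizeof(float));
    cudaMemset(in, 0, w * h * sizeof(float));

    dim3 block(16, 16);
    dim3 grid((w + block.x - 1) / block.x, (h + block.y - 1) / block.y);
    boxBlurH<<<grid, block>>>(in, out, w, h, 8);   // radius 8 blur pass
    cudaDeviceSynchronize();

    cudaFree(in);
    cudaFree(out);
    printf("blurred a %dx%d frame\n", w, h);
    return 0;
}
```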

GF100-29.jpg

Using the compute power of the GPU also allows for additional animations, such as the hair and water demos we saw earlier, alongside physics effects. The result can be anything from realistic particle movement to banners whipping in the wind.

GF100-30.jpg

Dark Void is a game where many of these concepts will come to fruition, helped along in this case by NVIDIA's PhysX. The demo we saw was nothing short of outstanding, with sparks behaving in an eerily realistic way by bouncing off surfaces before collecting on the ground, and NPCs flailing around as they are picked up and thrown. Even the main character's jetpack trails volumetric smoke that interacts with its environment. None of this would be possible without high-level physics processing, and the demo ended up being a stunning example of in-game physics. We highly recommend you try it out, especially if you have an NVIDIA card.
 

Compute Performance on the GF100


We have talked again and again about the efficiency that comes with the incorporation of several new technologies into the GF100, but its strength in GPU compute applications stems from its HPC roots. When it comes to processing large parallel data sets, nothing distinguishes the GF100 more than its ability to run concurrent kernels.

GTC-21.jpg

The GigaThread hardware thread scheduler is the piece of the puzzle that allows for concurrent kernel execution. In a serial execution scenario, each kernel has to wait for the one before it to finish before it can begin. With concurrent kernels, however, the GPU can be utilized more efficiently since different kernels from the same application context can run at the same time. Basically, concurrent kernels mean multiple workloads can be processed simultaneously, which not only frees up resources but also allows for quicker processing of things like in-game physics and AI.
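For the programmers in the audience, this is the behaviour exposed through CUDA streams. The sketch below is our own minimal illustration (trivial stand-in kernels, made-up sizes), not NVIDIA demo code: two independent kernels are launched into separate streams, which a Fermi-class scheduler is free to overlap while older hardware runs them back to back.

```cuda
// Two independent workloads in separate streams; think of them as
// stand-ins for, say, a physics pass and an AI pass.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void busyKernel(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float v = data[i];
        for (int k = 0; k < 1000; ++k)   // artificial work
            v = v * 1.000001f + 0.5f;
        data[i] = v;
    }
}

int main()
{
    const int n = 1 << 16;
    float *a, *b;
    cudaMalloc(&a, n * sizeof(float));
    cudaMalloc(&b, n * sizeof(float));

    cudaStream_t s1, s2;
    cudaStreamCreate(&s1);
    cudaStreamCreate(&s2);

    // Kernels launched into different streams may execute concurrently
    // on hardware that supports it; otherwise they simply serialize.
    busyKernel<<<(n + 255) / 256, 256, 0, s1>>>(a, n);
    busyKernel<<<(n + 255) / 256, 256, 0, s2>>>(b, n);
    cudaDeviceSynchronize();

    cudaStreamDestroy(s1);
    cudaStreamDestroy(s2);
    cudaFree(a);
    cudaFree(b);
    printf("done\n");
    return 0;
}
```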

NVIDIA has stated that their PhysX 3.0 update will take advantage of concurrent kernels.

GF100-32.jpg

This all boils down to a significant increase in compute performance over the previous generation of cards. We also have to remember that this will not only benefit game features but will also have a significant impact on Folding@home GPU performance.
 

Performance Preview


Now that we have gone extensively through the architectural aspects and feature set of the GF100, it's time to see what happens when the rubber hits the road. Up to this point, benchmarks involving NVIDIA's upcoming cards were impossible to come by, which is understandable since any leaks would have opened the competition's eyes to what was slowly but steadily approaching them. While the following benchmarks show only a brief glimpse of what the GF100 is capable of, it should be mentioned that all of them were run in front of our eyes in real time on hardware that is still in its beta stage. These numbers should improve as NVIDIA dials in clock speeds and the beta drivers mature to a point where they are ready for release.


PLEASE READ:

As you look at the charts below, you will notice that we added in "simulated" HD 5870 1GB results, and there is a story behind this. NVIDIA was very forthcoming with us and revealed the specifications of the test system they were running: a stock Intel i7 960, 6GB of 1600MHz memory and an ASUS Rampage II Extreme motherboard running Windows 7. While we weren't able to copy the exact same system, we did our best to replicate it by using the same CPU and OS while estimating the memory latencies and using a Gigabyte X58-UD5 motherboard. The result? Our stock GTX 285 performed within 2% of the results NVIDIA showed us, so we are confident in the accuracy of our numbers.


Far Cry 2 DX10


While Far Cry 2 was released some time ago, it is still considered an extremely demanding game, especially at the Ultra High DX10 settings NVIDIA was using. They used the built-in Ranch Small timedemo, which tends to give a good approximation of performance within the game itself. These tests were run on an NVIDIA demo system, but the GTX 285 results were backed up by benchmarks we ran on our own test system (see above), consisting of a standard boot, setting up the options and running the Ranch Small benchmark. AI was enabled as well.


1920 x 1200

GF100-42.jpg

The first indications seem to be that NVIDIA is definitely on the right track with the GF100, since it performs far above and beyond the older GTX 285. When you compare it to the current single-GPU champ, the HD 5870 1GB, there just isn't any competition at all, and we are positive that even a highly overclocked HD 5870 won't come anywhere near the GF100 we were shown.


2560 x 1600

GF100-43.jpg

In the past, we found that performance at this resolution in Far Cry 2 DX10 ran into a fair amount of framebuffer limitation. However, NVIDIA has worked hard to ensure that as much graphics data as possible stays on-die within the cache before finally being handed off to memory, which should carry huge benefits in games like this one.
 
Dark Void


Dark Void is an upcoming shooter which not only looks great but also takes advantage of some pretty wild PhysX effects. While we will be taking a closer look at this game and its performance at a later date, NVIDIA wanted to show it off in front of journalists prior to its release. For the results you see below, NVIDIA used the built-in benchmark that makes heavy use of in-game physics and the engine’s long draw distances.

GF100-44.jpg

This is one test where the GPU compute roots of the GF100 can really come into play by efficiently processing PhysX and rendering the scene at the same time. Performance is once again far beyond anything the GTX 285 can accomplish which should make things interesting come release.



Performance: Early Impressions


While it was good to finally see some concrete results from the GF100 / Fermi architecture, there are a number of questions still bouncing around inside our heads. First and foremost: exactly what type of card was inside the test systems? When asked, NVIDIA merely stated that it was the card that would be shipping on launch day. If some people are to be believed, yield issues mean the only card we will see on launch day is a slightly cut-down 448SP version. Our contacts further cemented this by saying that whichever product is first available will be largely dictated by the yields coming out of the TSMC foundries.

Naturally, the cards we benchmarked weren't equipped with anything above 512SPs since that is the maximum layout this architecture allows. If we assume the performance we saw came from a beta, underclocked version of a 512SP GF100 running alpha-stage drivers, this is going to be one hell of a graphics card. On the other hand, if NVIDIA was using 448SP cards for these tests, the true potential of the GF100 is simply mind-boggling. Coupled with robust compute power and an architecture specifically designed for the rigors of a DX11 environment, it could be a gamer's wet dream come true.

While the glimpse into the GF100’s performance window was narrow, NVIDIA showed enough to get us excited to see the final product. Things in the GPU arena are sure to heat up come spring.
 

Touching on NVIDIA Surround / 3D Vision Surround


During CES, NVIDIA unveiled their answer to ATI's Eyefinity multi-display capability: 3D Vision Surround and NVIDIA Surround. These two "surround" technologies share common ground, but in some ways their prerequisites and capabilities sit at opposite ends of the spectrum. We should also mention straight away that both will become available once the GF100 cards launch and will support bezel correction from the outset.


NVIDIA Surround

GF100-45.jpg

Not to be confused with NVIDIA's 3D Vision Surround, the standard Surround moniker allows three displays to be fed concurrently via an SLI setup. Yes, you need an SLI system in order to run three displays at the same time, but the good news is that NVIDIA Surround is compatible with both older GTX 200-series cards and the upcoming GF100-series parts. It can spread an image across three screens of up to 2560 x 1600 each and allows a mixture of monitors to be used as long as they all support the same resolution.

SLI is needed because both the GT200-series and GF100 cards are only capable of having a pair of display outputs active at the same time. In addition, if you want to drive three monitors at reasonably high detail levels, you'll need some serious horsepower, and that's exactly what a dual or triple card system gives you.

This does tend to leave out the people who may want to use three displays for professional applications but that’s where NVIDIA’s Quadro series comes into play.


3D Vision Surround

GF100-46.jpg

We all know by now that immersive gaming has been taken to new levels by both ATI, whose HD 5000-series can game on up to three monitors at once, and NVIDIA's own 3D Vision, which offers stereoscopic viewing within games. NVIDIA has now combined these two techniques under the 3D Vision Surround banner, bringing stereo 3D to surround gaming.

This is the mac-daddy of display technologies and it is compatible with SLI setups of upcoming GF100 cards as well as the older GT200-series. The reasoning behind this is pretty straightforward: you need a massively powerful system to render and output what amounts to six high-resolution 1920 x 1080 images (two to each of the three 120Hz monitors). Be aware as well that all three monitors MUST be of the same make and model in order to ensure uniformity.
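As a rough back-of-the-envelope check (our own figures, assuming 60 frames per eye per second on each 120Hz panel, before any anti-aliasing or overdraw):

3 monitors × 2 eye views × (1920 × 1080) pixels × 60 frames per eye per second ≈ 746 million pixels per second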


All in all, we saw NVIDIA's 3D Vision Surround in action and, while it was extremely impressive to say the least, we can't offer any further thoughts until we have done more testing of our own.
 

The Reintroduction of NVIDIA's Power Packs


Some of you may not remember this, or it may have gone completely unnoticed, but back in April of last year NVIDIA released what they called their "Graphics Plus Power Pack #3". This package of demos, screensavers and applications was meant to showcase GPGPU technology at its finest while opening users' eyes to the possibilities residing within their GPUs. With the release of the GF100, NVIDIA will be introducing a new list of interactive demos and applications. In this section we take a look at two programs we're sure will appeal to many of you.


SuperSonic Sled

Video: http://www.youtube.com/v/EIB412tJ7Rc

The SuperSonic Sled, as NVIDIA calls it, is a real-time demo that showcases nearly everything from DX11's compute and shader code to real-time physics simulations to 3D Vision. While the video above shows only a small portion of the demo, we can tell you that a viewer is able to move the camera and add obstructions in real time on the GF100 architecture, and it looks simply stunning to boot. We're looking forward to being able to manipulate it ourselves. For the time being though, here are a few more screenshots from a later part of the demo showcasing the controls you will have access to for steering the sled.

GF100-37.jpg
GF100-38.jpg



NVIDIA RTD

GF100-39.jpg

At the GPU Technology Conference, NVIDIA demoed what they called real-time ray tracing by showing how quickly and accurately the GF100 was able to analyse and render a scene in real time. It was impressive to say the least, and the engine it used has now been ported to a demo featuring dozens of cars which can be rendered for screenshots. This isn't what we would call "fun" per se, but it does a good job of showing what the architecture can do, so its inclusion will be welcome.

GF100-40.jpg
 

GeForce GF100: Initial Impressions


Looking back on the pages and pages of text and images in this article, it is very easy to feel totally overwhelmed. The GF100 architecture is a complicated beast, but we sincerely hope all of this information puts to rest some of the questions many of you have been asking for the better part of a year now. However, there should be little doubt in your minds that the next few months will still be filled with speculation about the GF100 and its derivatives.

Before we go on, it is important to mention that while NVIDIA was extremely open about the architecture and potential capabilities of the GF100, there's a paper as long as my arm listing items they couldn't talk about. On things like memory allotments, power consumption, availability, name (no, it won't be the GF100), clock speeds, core count on retail cards and even die size, NVIDIA's reps pled the fifth. While this may give rise to some off-color comments, eye rolling and snide jokes from the usual people in the peanut gallery, we actually understand why they chose not to show their hand just yet. In this high-stakes game of GPU poker, the less the competition knows about your hand the better, and like any good poker player, NVIDIA isn't giving anything away. It should also be mentioned that we aren't providing you with any pictures of the card simply because the heatsink and PCB layout aren't finalized yet.

One of the major mysteries was, and still is, the specifications of the GPUs used in the test systems. When asked, NVIDIA simply grinned and stated that the performance showcased what consumers can expect come release day. Based on what we hear about yields and TSMC's manufacturing issues, we highly doubt a 512 core GF100 will be available when the architecture sees the light of day in the retail market. That's not to say there will never be a 512 core GF100, since we know there will be one at some point. However, if the card in the test system we benchmarked was truly not the highest-end GF100, gamers will be in for a real treat in the near future.

When it comes to architecture, it is refreshing to see a chip that has been built from the ground up to cater to next-generation APIs. Whether they like to admit it or not, ATI basically took DX11 features, popped them onto a more powerful version of their HD 4000-series and called it a day. However, they were first to market and made ridiculous profits while NVIDIA sat back and designed a whole new architecture around the goal of offering the best possible DX11 performance. Who will have the last laugh? Only time will tell, but ATI does have a strong lead.

GTC-20.jpg

The increased rendering efficiency does seem to go a long way towards making the GF100 a viable solution for the future, but let's put this into context for a minute. This is one complicated GPU and, as such, it has proven extremely hard to produce in sufficient quantities, as evidenced by its absence from the current marketplace. Meanwhile, the overall complexity and integrated cache contribute to that massive 3 billion transistor count, which gives cause for concern on the power consumption front as well. So, while we know more than we did last month about the GF100, some fine tuning will still be needed on NVIDIA's part before it's ready for primetime. What we can say is that we are confident the 512 core GF100 should consume under 300W under load.

On the plus side, the GF100’s architecture really does seem to be a departure from the GT200 in terms of scalability. As was discussed in the Modular Architecture section, NVIDIA is aiming to have a chip that can efficiently scale down for appropriate markets. This will eventually lead to a long line of GF100 derivatives and finally replace the endlessly renamed G92-based cards of yesteryear. To make matters even better, this new class of cards will fall under the umbrella of NVIDIA’s Unified Driver Architecture program so they will be able to continue releasing one driver for all of their products.

Price is of course a concern for many people, and considering the size, complexity and development overhead of a chip such as the one that graces the GF100, it won't be cheap. NVIDIA has said numerous times that it will be cost-competitive with similarly-performing ATI solutions, but that isn't too helpful considering the huge grey area between the $400 HD 5870 and the near-$700 HD 5970.

What we showed you today is just the tip of the iceberg when it comes to the GeForce GF100. Actual performance with final clock speeds and proper drivers remains the million dollar question, but from what we have seen, there's a lot to be excited about. While it may be what we call "late", the GF100 sure looks ready to take on everything the competition has to offer when it is released in a few months' time.



 