Refreshed Memory Caching & A New Video Engine
Refreshed Memory Caching
As resolutions continue to increase and DX12 promises to enhance GPU-to-CPU communication, the graphics card’s onboard memory will play an increasingly important role in overall game performance. With that in mind, the GTX 980’s 4GB of 7Gbps memory on a 256-bit interface may look a bit anemic and anything but “next gen” at first glance. On paper at least, it provides less bandwidth than the GTX 780 Ti’s layout, but there’s far more going on behind the scenes than first meets the eye.
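For reference, the on-paper gap comes straight from the usual peak bandwidth arithmetic: bus width multiplied by the per-pin data rate, divided by eight to convert bits to bytes. The sketch below is only an illustration. The GTX 780 Ti's 384-bit, 7Gbps configuration is an assumption brought in from that card's published specifications, and the helper name is hypothetical.

```python
def peak_bandwidth_gb_s(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Theoretical peak memory bandwidth in GB/s.

    bus_width_bits: width of the memory interface in bits
    data_rate_gbps: effective per-pin data rate in Gbps
    """
    return bus_width_bits * data_rate_gbps / 8


# GTX 980: 256-bit interface at 7Gbps (figures from the text above)
gtx_980 = peak_bandwidth_gb_s(256, 7.0)

# GTX 780 Ti: 384-bit interface at 7Gbps (assumed from its published specs)
gtx_780_ti = peak_bandwidth_gb_s(384, 7.0)

print(f"GTX 980:    {gtx_980:.0f} GB/s")     # 224 GB/s
print(f"GTX 780 Ti: {gtx_780_ti:.0f} GB/s")  # 336 GB/s
```

These are theoretical peaks only; they say nothing about how efficiently each architecture uses the bandwidth it has.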
In creating GM204, NVIDIA has thoroughly revised their memory subsystem rather than raising memory speeds or moving towards a wider interface. Focusing on core architectural improvements over raw power was born of necessity: a larger 384-bit or 512-bit interface would have taken up a significant amount of die space, while GDDR5 modules operating at speeds higher than 7Gbps aren’t available yet.
While the core’s caching and memory hierarchy has gone largely unchanged from GK110, NVIDIA’s engineers have instituted a number of features that allow GM204 to utilize its available bandwidth more efficiently. These enhancements begin with a new third generation lossless data compression algorithm that is applied as the core’s data is written out to memory. Additional bandwidth savings are achieved when secondary processing stages like the core’s Texture Units later read data stored within the memory buffer, since that data is transferred in its compressed form.
In order to enhance output bandwidth and memory subsystem efficiency, NVIDIA is actually using multiple layers of compression. In principle this approach works well in some instances (like on anti-aliased surfaces), but basic compression methods struggle to cope with lower levels of AA and non-AA situations. This is where the delta color compression (DCC) mode comes into play.
First instituted in Fermi, DCC quickly calculates the difference between each pixel within a data block and its adjacent neighbor, and then attempts to pack those difference values into the smallest data packet possible. The effectiveness of this method is largely determined by the range of delta patterns the algorithm is able to encode, which is why, in this third generation, NVIDIA has added more delta compression “choices”. This leads to less data being flagged as uncompressible and sent through the rendering pipeline in raw, uncompressed form, hogging bandwidth.
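To make the idea a bit more concrete, here is a toy sketch of delta encoding on a single 8-bit color channel. It is not NVIDIA's implementation: the block size, the available bit-width "choices" and every function name are assumptions picked purely for illustration. It stores the first pixel in full, then tries to fit every neighbor-to-neighbor difference into the narrowest available width, falling back to raw storage when nothing fits.

```python
from typing import Optional, Sequence


def bits_needed(delta: int) -> int:
    """Bits required to hold a signed delta in two's complement."""
    if delta >= 0:
        return delta.bit_length() + 1
    return (-delta - 1).bit_length() + 1


def choose_mode(block: Sequence[int], modes: Sequence[int]) -> Optional[int]:
    """Return the narrowest bit width in `modes` that fits every delta,
    or None if the block has to be stored raw."""
    deltas = [b - a for a, b in zip(block, block[1:])]
    widest = max((bits_needed(d) for d in deltas), default=1)
    fitting = [m for m in sorted(modes) if m >= widest]
    return fitting[0] if fitting else None


def compressed_bits(block: Sequence[int], modes: Sequence[int],
                    raw_bits: int = 8) -> int:
    """Size of the block in bits after attempting delta compression."""
    mode = choose_mode(block, modes)
    if mode is None:
        return raw_bits * len(block)              # uncompressible: stored raw
    return raw_bits + mode * (len(block) - 1)     # anchor pixel + packed deltas


# A smooth gradient: neighboring pixels differ by only a few levels
gradient = [100, 102, 105, 109, 112, 118, 121, 125]

few_choices = compressed_bits(gradient, modes=(2,))        # only a 2-bit mode
more_choices = compressed_bits(gradient, modes=(2, 4, 6))  # extra modes added

print(few_choices)   # 64: no mode fits, block stays raw
print(more_choices)  # 36: the 4-bit mode fits every delta
```

With only the hypothetical 2-bit mode the gradient is rejected and stays at its full 64 bits, while adding wider modes lets the same block shrink to 36 bits. That is the general effect described above: more compression choices mean fewer blocks end up being stored raw.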
Alongside the compression algorithm improvements, core caching and general data access have been addressed as well. As a result, the GM204 core is able to reduce the number of bytes that have to be fetched from local memory and thus reduce memory bandwidth overhead.
Practically speaking, NVIDIA claims up to a 25% improvement in efficiency when compared directly against Kepler’s capabilities, which gives Maxwell’s 7Gbps memory the equivalent of 9.6Gbps of effective bandwidth in some situations. It should be interesting to see whether this will help in higher resolution scenarios, where lower bandwidth cards tend to struggle.
Maxwell’s New Video Engine Explained
NVIDIA’s Kepler architecture had an extremely robust video engine and NVENC encoder, as evidenced by its capability to pre-process and stream high definition content to NVIDIA’s SHIELD. However, in a market that’s increasingly gravitating towards 4K resolutions alongside technologies like G-SYNC and DisplayPort’s Adaptive Sync, some revisions were in order to make sure Maxwell was natively able to support upcoming display features.
One of the primary additions to this new video engine is its capability to provide an extreme amount of output bandwidth. It boasts full support for upcoming 5K (5120×3200) resolutions at 60Hz, and up to four 4K MST displays can be driven from a single card. These capabilities also open the door to 120Hz and 144Hz 4K panels. In addition, GM204-based products will be the first to boast native support for HDMI 2.0, which brings with it 4K / 60Hz capabilities along with an enhanced HD audio backbone. The Embedded DisplayPort (eDP) protocol has been rolled into Maxwell as well. To accommodate these formats, the GTX 900-series uses a single dual-link DVI output, three DisplayPort 1.2 connectors and a single HDMI 2.0 port.
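To put those output modes in perspective, the raw pixel data rate of a display mode can be estimated as width × height × refresh rate × bits per pixel. The sketch below assumes 24 bits per pixel and ignores blanking intervals and link encoding overhead, so real link requirements will be higher than these figures. The function name is hypothetical.

```python
def pixel_rate_gbps(width: int, height: int, refresh_hz: int,
                    bits_per_pixel: int = 24) -> float:
    """Raw pixel data rate in Gbps, excluding blanking and encoding overhead."""
    return width * height * refresh_hz * bits_per_pixel / 1e9


modes = {
    "4K @ 60Hz": (3840, 2160, 60),
    "4K @ 144Hz": (3840, 2160, 144),
    "5K @ 60Hz": (5120, 3200, 60),
}

for name, (w, h, hz) in modes.items():
    print(f"{name}: {pixel_rate_gbps(w, h, hz):.2f} Gbps")
```

Even before overhead, a 5K / 60Hz signal carries roughly twice the pixel data of 4K / 60Hz, which gives some sense of how much output bandwidth the new engine has to handle.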
Further helping things along is an updated NVENC encoder that now includes H.265 support and features H.264 encoding that’s roughly 2.5x faster than Kepler’s. Not only will this allow for 4K video capture at 60Hz within ShadowPlay (an option which will be added immediately to NVIDIA’s application), but H.265’s compression improvements have far-reaching ramifications for home video streaming. With H.265, a user could technically stream a 4K video from their gaming PC to a next generation SHIELD portable device and then pass that signal on to their 4K HDTV. Alternatively, the encoding efficiency could also lead to drastically reduced lag times when using NVIDIA’s GameStream technology.