Intel's Optane & DC P4800X; A Deep Dive

SKYMTL

HardwareCanuck Review Editor
Staff member
Joined
Feb 26, 2007
Messages
12,840
Location
Montreal
In July of 2015 Intel and Micron's joint venture IMFT (Intel Micron Flash Technologies) announced a breakthrough in non-volatile memory storage: 3D XPoint (pronounced 'Three Dee Cross Point'), a technology that promised to replace NAND-based memory and take non-volatile memory to the next level. The initial promise was a new memory type with the latency and performance of Random Access Memory (aka volatile memory), drastically improved endurance compared to NAND memory, <i>and</i> better density than RAM. At the time it seemed like the holy grail everyone was looking for.

<div align="center"><img src="http://images.hardwarecanucks.com/image/akg/Storage/optane/3dxp4.jpg" border="0" alt="" /></div>

Those original announcements boldly proclaimed that 3D XPoint would be '1000 times' faster than NAND, have '1000 times' the endurance of NAND, and offer '10 times' the density of RAM. It was a veritable shot across the bow of non-volatile memory manufacturers and volatile memory manufacturers alike.

Recently we were invited to a tech conference at Intel's offices in Folsom, California where this promise would finally be brought to fruition. As with everything storage related, time did march on and as such what we heard… did not quite live up to the initial 2015 announcements. This is however the start of a new era where the very boundary lines between volatile and non-volatile begin to blur. An era where RAM is no longer the undisputed king of performance, and no longer the only answer for increasing overall I/O requests per second and lowering server latency. Yes, it is indeed one bright future for Intel and a less rosy one for those traditional RAM manufacturers.

<div align="center"><img src="http://images.hardwarecanucks.com/image/akg/Storage/optane/evo.jpg" border="0" alt="" /></div>

Now that the NDA is up we can give you a sneak peek at this technology and give you some insight into Intel's plan for the future. This is a future where Optane Technology not only rewrites the requirements for a high-performance server, but also what a typical home computer needs to be equipped with in order to be considered "good enough".

As the NDA is still not lifted on Intel's incoming salvo for the consumer marketplace, today's article will focus solely on what Intel's first generation Optane can offer enterprise consumers… and boy we were impressed. Read on as we explain exactly what 'Optane Technology' is and what it promises to offer the Enterprise market. But first, a bit of a warm up backgrounder for everyone.


Volatile vs Non-Volatile Memory


Historically speaking, memory has always come in two separate and distinct groups: Volatile Memory and Non-Volatile Memory. Typical Volatile Memory is Random Access Memory, and DDR4 SDRAM is the latest and greatest for servers.

<div align="center"><img src="http://images.hardwarecanucks.com/image/akg/Storage/optane/dram.jpg" border="0" alt="" /></div>

DDR4 SDRAM or 4th generation Double Data Rate Synchronous Dynamic Random-Access Memory (which is a mouthful and why it is usually abbreviated as DDR4 or even just 'RAM') is a technology that dates back to the 1970s. While the speed and density have vastly increased over the years, at its heart 'RAM' is based upon a simple idea: storing a single bit of data in a capacitor. This capacitor can either be charged or discharged and this is what separates a binary 0 from a 1. However, as it relies upon capacitors the charge leaks away over a relatively short period of time, making the data stored in the capacitor corrupt and unusable.

This leakage happens all the time, even when the server is powered up, which is why DRAM has to be refreshed constantly and is a big reason why servers make use of special 'ECC' DDR4 RAM. This special form of DDR4 utilizes a low-level Error Correction Code to ensure that any data retrieved from the RAM is actually what was supposed to be stored. Basically this server-grade memory has an additional IC 'chip' on each 'stick' that stores a parity check that the system can use to ensure that what the rest of the RAM is sending back to the host is what it was supposed to be transmitting.
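The parity check idea can be illustrated with a toy example. The sketch below uses a Hamming(7,4) code purely for brevity; that choice and the function names are assumptions made for illustration, and real ECC modules protect much wider words (typically 64 data bits with 8 check bits).

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits [d1, d2, d3, d4] into a 7-bit codeword."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # parity over codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over codeword positions 4, 5, 6, 7
    # Codeword layout, positions 1..7: p1 p2 d1 p3 d2 d3 d4
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword):
    """Return the 4 data bits, correcting one flipped bit if present."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    # The syndrome is the 1-based position of the bad bit, or 0 if clean.
    syndrome = s1 + 2 * s2 + 4 * s3
    if syndrome:
        c[syndrome - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]


stored = hamming74_encode([1, 0, 1, 1])
stored[5] ^= 1  # simulate a single bit flipping inside the memory
recovered = hamming74_decode(stored)
print(recovered)
```

The extra check bits are the price paid for being able to repair a single flipped bit on the fly, which is the same trade-off the additional chip on an ECC stick makes.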

All of this is why RAM is considered volatile, or 'short term' memory. On the positive side, RAM is incredibly fast memory storage. Its durability is also unimaginably high as very little damage is done by changing the state of each capacitor, and each Dual In-Line Memory Module (i.e. the 'stick' that the DDR4 RAM 'chips' are placed upon) has a fairly high density of up to 512GB.

This is actually why non-volatile memory came about: no-one wanted to have to feed in all the data needed every time someone rebooted the server. Historically the non-volatile memory of choice was Hard Disk Drives (and to a lesser extent Tape Drives) but in the enterprise, it has become synonymous with Solid State Drives. Now granted, a Solid State Drive is in itself just a catch-all term for the various forms of this medium available, but for the most part it means NAND floating gate transistor-based 'flash' memory.

<div align="center"><img src="http://images.hardwarecanucks.com/image/akg/Storage/optane/nand.jpg" border="0" alt="" /></div>

NAND or "Not AND" floating gate transistors store data by trapping an electrical charge in each cell's floating gate, so that when a voltage is applied the cell conducts differently depending on what data is 'stored' in it. Since that charge stays trapped, the data is non-volatile and does not require a constant electrical source to keep its state. This means when power to the server is (temporarily) removed the data is safe and sound. The downside is that compared to 'RAM' the speed at which it can respond to I/O requests is snail slow.

Compared to RAM, or even to mechanically fragile Hard Disk Drives, NAND storage also wears out quickly, with total drive write ratings in the Petabyte or even Terabyte range. This is why Solid State Drives with a 'RAM Stick' interface never caught on and why RAM is still the memory speed king of the computing world. But this is where Optane steps gracefully into the equation.
 
Say Hello to Intel's Optane and the DC P4800X



<div align="center"><img src="http://images.hardwarecanucks.com/image/akg/Storage/optane/solution.jpg" border="0" alt="" /></div>

Optane technology is the answer to four issues that Intel has identified with existing Enterprise systems. These issues are loosely grouped as 'Endurance', 'Latency', 'IOPS Performance', and 'Consistency'. Some of these four building blocks do overlap, as performance and consistency go hand in hand, but Optane Technology sets out to answer all four at the same time. In order to do this Intel has leveraged existing standards and combined them with new technology to create something entirely new.

Those are the background details of Optane Technology, and they set the lens through which this technology has to be viewed and judged. So exactly what is Optane Technology and what does it consist of?

<div align="center"><img src="http://images.hardwarecanucks.com/image/akg/Storage/optane/spec.jpg" border="0" alt="" /></div>

At its heart resides NVMe and its greatly improved Host Controller Interface communications protocol. This PCIe-based interface removes the 'middleman' or Platform Controller Hub by the simple expedient of moving part of the storage controller duties onto the non-volatile memory device and the rest directly to the CPU.

In the simplest terms NVMe devices talk directly to the CPU via the PCIe bus and need not take any side trips through secondary controllers. As we have shown numerous times in the past, this removal of both the SATA communication protocol and the PCH dramatically improves overall performance, reduces latency and generally makes a system much more responsive. Put another way, it allows a server to get more done in less time by making the CPU wait less on I/O requests to/from the non-volatile memory. This is why NVMe based storage has become the de-facto standard for high performance servers.

<div align="center"><img src="http://images.hardwarecanucks.com/image/akg/Storage/optane/nvdimm.jpg" border="0" alt="" /></div>

Upon this solid foundation, Intel has built the ground floor of their Optane Technology by using a second generation NVMe controller. At this time details on exactly what this controller consists of are few and far between. What is known is what it can accomplish via Intel's upcoming Optane DC P4800X drive. However, its true performance may indeed be bottlenecked by using a 4 lane PCIe interface, and future NVDIMM form-factor iterations could provide much higher performance.
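To put the four-lane concern in rough numbers, here is a back-of-the-envelope sketch. It assumes a PCIe 3.0 link (8 GT/s per lane with 128b/130b encoding); the link generation is not stated above, so treat it, and the function name, as assumptions. The result is only the theoretical ceiling before any protocol overhead.

```python
# Assumed link parameters (PCIe 3.0); not taken from Intel's briefing.
TRANSFERS_PER_SECOND = 8e9      # 8 GT/s per lane
ENCODING_EFFICIENCY = 128 / 130  # 128b/130b line encoding


def pcie_ceiling_gb_per_s(lanes):
    """Theoretical one-direction bandwidth in GB/s for a given lane count."""
    bits_per_second = lanes * TRANSFERS_PER_SECOND * ENCODING_EFFICIENCY
    return bits_per_second / 8 / 1e9


x4 = pcie_ceiling_gb_per_s(4)
x16 = pcie_ceiling_gb_per_s(16)
print(f"x4 ceiling:  {x4:.2f} GB/s")
print(f"x16 ceiling: {x16:.2f} GB/s")
```

Under these assumptions a four-lane device tops out just under 4 GB/s, which gives a sense of why a memory-bus form factor could leave more headroom.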

<div align="center"><img src="http://images.hardwarecanucks.com/image/akg/Storage/optane/3dxp5.jpg" border="0" alt="" /></div>

To harness the full potential of the NVMe interface and their next generation controller Intel has paired this duo with truly next generation non-volatile memory. Intel was extremely coy on specifics but what is known is that Optane Technology doesn't use typical NAND solid state memory and uses 3D XPoint non-volatile memory technology instead.

3D XPoint does not rely upon NAND floating gate transistors and rather uses an entirely new way of storing data in a 3D Matrix. Most likely, instead of using a transistor gate, the properties of the cell material itself change and remain changed long enough to be considered non-volatile. How long the cells will stay changed before 'reverting' back to their base state is unknown, but at this time the NAND we are all familiar with is rated in mere months for data stability, so in order to be merely as good as TLC 3D NAND Intel does not have to clear that high a bar.

<div align="center"><img src="http://images.hardwarecanucks.com/image/akg/Storage/optane/3dxp.jpg" border="0" alt="" /></div>

Much like 3D NAND, 3D XPoint builds 'up' and not just in two dimensions like old-school planar NAND. This allows the density of a next generation 3D XPoint IC to be downright massive in comparison to what was available back in early 2015. The recent advent of 3D NAND is however why Intel was unable to keep their promise of improved density and '1000x the endurance' of NAND. Both have continued to evolve and use smaller fabrication nodes so that was impossible. Basically, when comparing SLC 3D XPoint density to 3D TLC NAND density, 3D XPoint is not going to offer the improvements it could have if it had been launched before 3D TLC NAND memory.

<div align="center"><img src="http://images.hardwarecanucks.com/image/akg/Storage/optane/bar.jpg" border="0" alt="" /></div>

What 3D XPoint does, however, is improve endurance well beyond what modern NAND can offer. Some of this is due to the memory technology used, but a lot has to do with the fact that 3D XPoint is SLC in nature. Put another way, instead of trying to cram two or three bits of data inside one cell, 3D XPoint only stores one bit. This means a cell is either 'on' or 'off'. This alone allows for much greater endurance compared to common NAND implementations and also greater responsiveness.

<div align="center"><img src="http://images.hardwarecanucks.com/image/akg/Storage/optane/endurance.jpg" border="0" alt="" /></div>

As for specifics on endurance, Intel is rather tight lipped as 3D XPoint memory is immature right now. However, simple extrapolation from the Intel Optane DC P4800X specifications nets us a ballpark number. Specifically, the 3D XPoint that will be available to enterprise consumers can handle 30 drive writes per day, every day for five years and then some.

A conservative extrapolation would place 3D XPoint endurance at approximately 50,000 cycles versus the 3,000 to 5,000 cycles modern MLC NAND is rated for – or a 'mere' 10x endurance improvement. This is a far cry from the 3 to 5 <i>million</i> cycles it would need to meet those initial claims but Intel is adamant that as the technology matures this endurance will increase and not decrease like NAND memory has.
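The extrapolation is simple enough to reproduce. The sketch below is a rough check under one loud assumption: every full drive write costs each cell exactly one cycle, ignoring write amplification and over-provisioning. The variable and function names are illustrative only.

```python
DRIVE_WRITES_PER_DAY = 30  # DC P4800X rating quoted above
WARRANTY_YEARS = 5
DAYS_PER_YEAR = 365

NAND_CYCLES_LOW = 3_000    # modern MLC NAND, low end
NAND_CYCLES_HIGH = 5_000   # modern MLC NAND, high end


def lifetime_drive_writes(writes_per_day, years):
    """Full drive writes over the rated life (assumed one cycle per cell each)."""
    return writes_per_day * DAYS_PER_YEAR * years


xpoint_cycles = lifetime_drive_writes(DRIVE_WRITES_PER_DAY, WARRANTY_YEARS)
gain_vs_best_nand = xpoint_cycles / NAND_CYCLES_HIGH
gain_vs_worst_nand = xpoint_cycles / NAND_CYCLES_LOW
cycles_needed_for_1000x = (1000 * NAND_CYCLES_LOW, 1000 * NAND_CYCLES_HIGH)

print(xpoint_cycles)            # 54750
print(gain_vs_best_nand)        # 10.95
print(gain_vs_worst_nand)       # 18.25
print(cycles_needed_for_1000x)  # (3000000, 5000000)
```

Rounding 54,750 down to 50,000 and comparing against the 5,000-cycle end of the NAND range is what produces the conservative '10x' figure.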

<div align="center"><img src="http://images.hardwarecanucks.com/image/akg/Storage/optane/1000.jpg" border="0" alt="" /></div>

Future predictions aside, this is a first-generation device so Intel is most likely being incredibly conservative in their estimates. A good portion of Intel's claimed '1000x' endurance improvement likely comes from real world usage scenarios and not the unrealistic method used to rate NAND.

This is a possible explanation, as the controller is going to be incredibly gentle on the 3D XPoint memory compared to NAND thanks to what Intel calls 'Write in Place Technology'. Write in place technology is a non-destructive write process that takes a page from RAM and is <i>byte</i> addressable and not block addressable. Thus a mere eight cells need to change state to update a byte, instead of writing, or erasing, an entire block at a time.
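A rough sketch of why that granularity matters follows. The NAND page and block sizes used here are hypothetical round numbers chosen for illustration, not figures from Intel or from any specific drive, and the helper names are made up.

```python
BITS_PER_BYTE = 8
NAND_PAGE_BYTES = 16 * 1024  # hypothetical NAND page size
NAND_PAGES_PER_BLOCK = 256   # hypothetical pages per erase block


def bits_touched_byte_addressable(n_bytes):
    """Write in place: only the bits of the bytes being updated change."""
    return n_bytes * BITS_PER_BYTE


def bits_touched_page_addressable(n_bytes, page_bytes=NAND_PAGE_BYTES):
    """Page-based media must program whole pages, however small the update."""
    pages = -(-n_bytes // page_bytes)  # ceiling division
    return pages * page_bytes * BITS_PER_BYTE


xpoint_bits = bits_touched_byte_addressable(1)
nand_bits = bits_touched_page_addressable(1)
erase_block_bytes = NAND_PAGE_BYTES * NAND_PAGES_PER_BLOCK

print(xpoint_bits)               # 8
print(nand_bits)                 # 131072
print(nand_bits // xpoint_bits)  # 16384
print(erase_block_bytes)         # 4194304
```

With these assumed sizes a one-byte update programs over sixteen thousand times more bits on page-based media, before even counting the block erase that eventually has to follow.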

Could this actually be the secret to how 3D XPoint can claim such low latency? Possibly, since each cell does not need a slow transistor and can be written to or read from almost directly. This is actually where the 'XPoint' of the name comes into play, as woven through the entire '3D block' of memory are cross point wire connectors. Above and below each cell is a positive and negative wire. In each cell is a selector that actually controls the cell state. These wires talk directly to each cell's connector and allow for much finer grain control when compared to NAND, which only reads at a large block cluster level. This difference improves overall data transmission efficiency, which nets a higher internal communication speed and a lower latency for completing I/O requests.
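As a mental model only, the cross point idea can be pictured as a grid in which picking one upper wire and one lower wire singles out exactly one cell. The class below is a toy illustration with made-up names; it says nothing about how Intel's selector or cell material actually works.

```python
class CrossPointLayer:
    """Toy model: one layer of cells addressed by a (row wire, column wire) pair."""

    def __init__(self, rows, cols):
        self.cells = [[0] * cols for _ in range(rows)]

    def write_bit(self, row, col, value):
        # Energizing one row wire and one column wire selects a single cell.
        self.cells[row][col] = value & 1

    def read_bit(self, row, col):
        return self.cells[row][col]

    def write_byte(self, row, first_col, byte):
        # A byte occupies eight neighboring cells along one row wire.
        for i in range(8):
            self.write_bit(row, first_col + i, (byte >> i) & 1)

    def read_byte(self, row, first_col):
        return sum(self.read_bit(row, first_col + i) << i for i in range(8))


layer = CrossPointLayer(rows=4, cols=16)
layer.write_byte(row=2, first_col=8, byte=0xA5)
value = layer.read_byte(row=2, first_col=8)
untouched = layer.read_byte(row=2, first_col=0)
print(hex(value))
```

Nothing outside the eight addressed cells is disturbed, which is the property that sets this apart from page and block based access.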

<div align="center"><img src="http://images.hardwarecanucks.com/image/akg/Storage/optane/3dxp3.jpg" border="0" alt="" /></div>

Utilizing this structure, performance not only skyrockets but internal housekeeping is greatly simplified. It allows the NVMe controller to waste fewer cycles on these necessary tasks while still ensuring that there are always free cells ready to be written to. This incredible granularity is also why the Intel Optane DC P4800X can boast such massive improvements in shallow <i>and</i> deep queue depth performance, as each write request requires so little overhead compared to NAND.

In this, Intel has set expectations high as they claim a 35x improvement in deep queue depth performance compared to the monstrously powerful DC P3700 <i>and</i> up to a 10x improvement in shallow queue depth performance over that self-same model. That is rather impressive when you consider the DC P3700 was arguably the best enterprise storage device, up until now.

Now to throw a little bit of cold water on the parade. This improvement in overall performance is not 1000x that of the best NAND based memory storage devices, nor is it a 1000x improvement in latency. Once again Intel was a tad late in bringing 3D XPoint to market and in all likelihood the NVMe protocol was never designed with NAND in mind. Rather it was meant for 3D XPoint from the get go. This does explain some of the more curious details in the NVMe protocol, and many an expert did wonder aloud about how NAND would ever be able to fully harness such specifications without running into massive engineering issues. In either case, it is why the real world latency improvement is 'only' 20-25 times that of say an NVMe NAND based DC P3700, and the real world performance is 'only' up to 77 times that of the DC P3700. Yes that is massive and we doubt many will be overly upset with such generational improvements, but it is not a thousand times greater.

That however is not being entirely fair to Intel's Optane Technology as we have saved the best for last. This new technology may use NVMe as its foundation, and its ground floor may be 3D XPoint, but Intel has added an entire new level to the Optane infrastructure. Unlike past generations of hardware that were pretty much developed independently of other developments, Intel's Optane Technology department worked hand in hand with other Intel development teams. The end result of this co-operation is Intel Memory Drive Technology. This new technology is the secret weapon in Intel's Optane Technology's arsenal and solves not only the latency issue but also the performance issue. The end result is a device that can take on DDR4 RAM roles. We will go over this in the next section.
 
Memory Drive Technology: Solving the Latency Issue



<div align="center"><img src="http://images.hardwarecanucks.com/image/akg/Storage/optane/block.jpg" border="0" alt="" /></div>

When dealing with massive numbers of I/O requests per second the largest bottleneck is not the CPU or even the size of the bus; it is the latency of the memory that is used. In the past non-volatile memory (aka 'NAND storage') had latency that was measured in microseconds (µs), whereas volatile memory (aka 'RAM') has latency that is measured in nanoseconds (ns). For example a top of the line Intel Data Center P3700 NVMe SSD has a real world latency of about two hundred thousand nanoseconds (200µs), whereas ECC RAM has a real world latency of about two hundred nanoseconds to complete a simple memory access request at low I/O levels and one thousand nanoseconds under heavy loads.

<div align="center"><img src="http://images.hardwarecanucks.com/image/akg/Storage/optane/lat.jpg" border="0" alt="" /></div>

In basic terms this means that RAM has a latency that is about 1,000 times lower than NAND based Solid State Storage. Intel's Optane DC P4800X has a latency of under 10,000 nanoseconds (10µs) or only 10-50 times slower than RAM. On its own that is still a massive difference. However, latency is only half the equation. The other half is memory management and load management. This is where the true secret to Optane's revolutionary leap forward happens.
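For readers who want to check the ratios, the comparison works out as follows. This is a quick sketch using only the round figures quoted in this section; the variable names are illustrative.

```python
NS_PER_US = 1_000

nand_ssd_ns = 200 * NS_PER_US  # DC P3700: about 200 µs
optane_ns = 10 * NS_PER_US     # DC P4800X: under 10 µs (upper bound used)
ram_light_load_ns = 200        # ECC RAM at low I/O levels
ram_heavy_load_ns = 1_000      # ECC RAM under heavy load

nand_vs_ram = nand_ssd_ns / ram_light_load_ns
optane_vs_ram_best = optane_ns / ram_heavy_load_ns
optane_vs_ram_worst = optane_ns / ram_light_load_ns
nand_vs_optane = nand_ssd_ns / optane_ns

print(nand_vs_ram)          # 1000.0
print(optane_vs_ram_best)   # 10.0
print(optane_vs_ram_worst)  # 50.0
print(nand_vs_optane)       # 20.0
```

The last figure also lines up with the roughly 20x latency improvement over the DC P3700 discussed earlier.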

<div align="center"><img src="http://images.hardwarecanucks.com/image/akg/Storage/optane/lat2.jpg" border="0" alt="" /></div>

The typical Integrated Memory Controller inside a processor has not changed all that much from generation to generation. The reason is that CPU engineers did not need to refine this controller, as nothing was able to even come close to its transient abilities. This means its load balancing and out of order execution are rather primitive and basic compared to modern 'storage' controllers. Once again this is because it did not need to be refined, and the largest performance boost in recent memory was placing it directly on the CPU. RAM is simply the fastest external memory going and only on-core (L1 at ~0.5ns or L2 at ~7ns) memory is faster, so improving this area of the CPU was down on the list of priorities. Intel's Optane engineers on the other hand did have to expend the effort and had to make it a priority to cope properly with Optane.

<div align="center"><img src="http://images.hardwarecanucks.com/image/akg/Storage/optane/perf.jpg" border="0" alt="" /></div>

In the most simplistic terms possible, a typical Optane implementation does more with less thanks in part to its next generation NVMe controller and more elegant Non-Volatile Memory Host Controller Interface communications protocol. This protocol is cutting edge and has boosted NAND performance to heights Intel never foresaw when they started to talk about 3D XPoint. However, most of the reason Intel can claim that the Intel Optane DC P4800X offers 80 percent or more of the performance of RAM, without even being close to it in latency, is not NVMe. Rather it is Intel Memory Drive Technology.

Memory Drive Technology (MDT) is a brand-new technology that resides between the BIOS and the OS and can make the OS believe that a drive like the DC P4800X is indeed memory and not storage. This however is only the tip of the iceberg. In addition to being seen as 'RAM' this new technology can load balance in real-time and even re-organize incoming I/O requests so that the CPU is not waiting as long as it would when accessing the IMC and the RAM bus. This is what allows the Intel Optane DC P4800X's real world performance to be very similar to that of RAM.

<div align="center"><img src="http://images.hardwarecanucks.com/image/akg/Storage/optane/ram.jpg" border="0" alt="" /></div>

During the recent Intel Optane conference, Intel used the example of a 675GB MySQL database. Maximizing the performance of such a database usually means loading the entirety of it into memory and letting the server churn through the requests without ever accessing the 'slower' non-volatile storage. To show this level of performance Intel equipped a dual Xeon E5-2699 v4 server with 768GB of DDR4 ECC RAM and DC P3700 SSDs, then ran the test. The result was 1,077 transactions per second. Then they removed 512GB of RAM, replaced this expensive memory with four 375GB Optane DC P4800X drives and reran the test.

Since the database could no longer reside in memory it had to use the Optane drives. The end result? 870 transactions per second or 80.8% of the performance. If this reflects actual real world performance that is indeed not too bad when you consider high density ECC DDR4 RAM costs over ten US dollars per gigabyte and the Intel Optane DC P4800X will be in the four dollar (USD) per gigabyte range.
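The arithmetic behind that comparison can be sketched as below. It takes the Optane result as 870 transactions per second, the only reading consistent with the quoted 80.8%, and it treats the per-gigabyte prices as the rough figures given above (the RAM price is a lower bound, so the real gap would be wider). The variable names are illustrative.

```python
ram_only_tps = 1077        # 768GB DDR4 ECC configuration
optane_config_tps = 870    # after swapping 512GB of RAM for Optane

ram_installed_gb = 768
ram_removed_gb = 512
optane_drives = 4
optane_drive_gb = 375

ram_usd_per_gb = 10.0      # "over ten dollars": treated as a lower bound
optane_usd_per_gb = 4.0    # "four dollar range": approximate

relative_performance = optane_config_tps / ram_only_tps
ram_remaining_gb = ram_installed_gb - ram_removed_gb
optane_total_gb = optane_drives * optane_drive_gb
price_ratio_per_gb = optane_usd_per_gb / ram_usd_per_gb

print(f"{relative_performance:.1%}")  # 80.8%
print(ram_remaining_gb)               # 256
print(optane_total_gb)                # 1500
print(price_ratio_per_gb)             # 0.4
```

In other words, the second configuration kept only 256GB of DRAM, added 1,500GB of Optane capacity at roughly 40 percent of the per-gigabyte price, and gave up about a fifth of the transaction rate.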

There is of course another caveat. Intel MDT will only work with Xeon CPUs and requires a motherboard capable of doing this magic. It is also most likely very limited in the operating systems it can work with. Thus older generation hardware need not apply, nor should home enthusiasts get their hopes up about it working with vanilla Windows 10. Linux/Unix on the other hand may indeed gain a few converts if Intel MDT works out in the real world as well as it did in Intel's testbeds.
 
Summarizing the Next Steps



As you can see Intel is extremely excited about Optane Technology and based on their briefing, they have every right to be. It may not be the devastating 1000X improvement that was promised two years ago, but it still is a paradigm shifting technology. The benefits that come with the ability to swap out expensive DDR4 ECC memory for less expensive Optane memory really cannot be overstated, nor can its promises of drastically reduced latency.

These two promises alone make Optane a serious alternative for a bevy of Enterprise related roles. These roles range from database servers to cloud storage solutions, and nearly every server in between may indeed benefit from a timely upgrade to an Optane DC P4800X. We are not alone in thinking that this could indeed be a paradigm shift, as Intel already has Alibaba, Dell, HP, Lenovo, Microsoft, MySQL, and even VMware waving the Optane flag.

<div align="center"><img src="http://images.hardwarecanucks.com/image/akg/Storage/optane/last.jpg" border="0" alt="" /></div>

For our readers who are involved in the Enterprise sector of the market we strongly recommend paying close attention to Intel in the coming months. For SOHO and even home users we will investigate another Optane solution in the coming weeks. For now though, the Optane DC P4800X may look like an interesting idea but one that you can never see being useful for a broad range of situations. Nothing could be further from the truth, as Intel has a long-standing tradition of servicing the Enterprise first and then releasing consumer-orientated products based upon the Enterprise technology later. One need not look back further than the Intel 750 for an example of this methodology at work.

Until a '750'-style model is released Optane should still be on everyone's radar. This technology is not just for servers and may indeed pay dividends in home environments. Promises of being 3-10 times faster than a DC P3700 can easily translate into a future promise of an Intel 750 successor that is <i>also</i> three to ten times faster than its predecessor.

With that thought in mind, and in conclusion, a famous quote by John F. Kennedy comes to mind when thinking about Intel's Optane Technology. That quote is not "The future's so bright, I got to wear shades", but rather "Change is the law of life. And those who look only to the past or present are certain to miss the future". This is precisely what Intel is looking to accomplish. They did not look at past or present successes when creating Optane Technology. Rather, they envisioned a future that questioned the very definition of what separates volatile from non-volatile memory. Let's all hope that their vision of the future is indeed proven correct by real world usage of the DC P4800X along with other upcoming Optane technology-based products.
 