Fast NVMe drives blur the line between storage and memory


Much of what we call hardware is actually software, firmware, that uses the underlying hardware for speed and, importantly, a stable operating environment. The engineering dilemma is a juggling act: can we build a product faster a) by running easily updated software on a commercial processor; b) by running leaner, faster firmware on a commercial processor; or c) by building a dedicated hardware engine that runs the protocol on bare metal for maximum speed?

Engineering challenge

Each option involves a different level of investment, which generally rules out the expensive option of a dedicated hardware engine. But the true potential of each approach is hard to gauge, because engineers and academics typically work from heuristics or from simulations that attempt to model real-world conditions. And simulations are only as good as their design assumptions and test workloads.

PCIe 6.0

NVMe (non-volatile memory express), the latest and greatest storage interconnect, runs primarily over PCIe, a standard undergoing rapid evolution. The PCIe specification is currently at 5.0, good for roughly 64 GB/s over an x16 link. PCIe 6.0, expected to launch next year, doubles that to roughly 128 GB/s. But today, even PCIe 5.0 compliant products are rare.
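
For readers keeping the units straight: GT/s is the raw per-lane signaling rate, while usable bandwidth depends on lane count and line encoding. Here's a back-of-the-envelope sketch in C, using nominal figures and ignoring packet and FLIT overhead, of where numbers like 64 GB/s and 128 GB/s come from:

```c
/* Nominal PCIe per-direction bandwidth from the per-lane signaling rate.
   Figures ignore packet/FLIT overhead, so treat them as upper bounds. */
#include <stdio.h>

int main(void) {
    struct { const char *gen; double gts; double encoding; } pcie[] = {
        {"PCIe 3.0", 8.0,  128.0 / 130.0},  /* 128b/130b encoding */
        {"PCIe 4.0", 16.0, 128.0 / 130.0},
        {"PCIe 5.0", 32.0, 128.0 / 130.0},
        {"PCIe 6.0", 64.0, 1.0},            /* PAM4 + FLIT mode; treat encoding as ~1:1 */
    };
    int lanes = 16;

    for (int i = 0; i < 4; i++) {
        /* each transfer moves one bit per lane; divide by 8 for bytes */
        double gbs = pcie[i].gts * lanes * pcie[i].encoding / 8.0;
        printf("%s x%d: %2.0f GT/s per lane -> ~%3.0f GB/s per direction\n",
               pcie[i].gen, lanes, pcie[i].gts, gbs);
    }
    return 0;
}
```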

With each increase in PCIe performance, the tradeoffs change. What is the optimal queue depth? How do different memory technologies, such as PRAM or MRAM, affect interleaving strategies? If a controller's FPGA clock rate is pushed up for performance, how does that affect timing in a multi-core processor? Optimizing for, or even just taking advantage of, the higher performance requires research.
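
One way to see why the optimal queue depth keeps moving: by Little's Law, the concurrency needed to keep a device busy is roughly throughput times latency. The sketch below assumes a 10-microsecond 4K read latency and rough x4 link bandwidths, both illustrative numbers rather than measurements:

```c
/* Little's Law sketch: queue depth ~ throughput x latency.
   Latency and link bandwidths below are illustrative assumptions, not measurements. */
#include <stdio.h>

int main(void) {
    double latency_s = 10e-6;      /* assume 10 us average 4K read latency */
    double block_bytes = 4096.0;
    struct { const char *link; double gbs; } links[] = {
        {"PCIe 3.0 x4",  4.0},     /* rough per-direction payload bandwidth, GB/s */
        {"PCIe 4.0 x4",  8.0},
        {"PCIe 5.0 x4", 16.0},
        {"PCIe 6.0 x4", 32.0},
    };

    for (int i = 0; i < 4; i++) {
        double iops  = links[i].gbs * 1e9 / block_bytes;  /* 4K IOPS needed to fill the link */
        double depth = iops * latency_s;                  /* Little's Law: L = X * W */
        printf("%s: ~%.1fM 4K IOPS to saturate -> queue depth ~%.0f\n",
               links[i].link, iops / 1e6, depth);
    }
    return 0;
}
```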

NVMe controllers present a particularly complex case of these tradeoffs. How can engineers untangle this mess of architectures, protocols, interconnects, clock rates, workloads, and strategies?

OpenExpress

At last week’s USENIX Annual Technical Conference, a paper from the Korea Advanced Institute of Science and Technology (KAIST) proposes an answer: OpenExpress, “a fully hardware automated framework that has no software intervention to process concurrent NVMe requests while supporting scalable data submission, rich outstanding I/O command queues, and submission/completion queue management.”
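
The submission and completion queues in that quote are the heart of the NVMe host interface: the host posts commands into a submission queue and rings a doorbell, and the controller posts results into a paired completion queue. Below is a drastically simplified model of such a queue pair; real NVMe uses 64-byte submission entries, 16-byte completion entries, and doorbell registers in the controller's PCIe BAR, and the field names here are mine, not the spec's:

```c
/* A drastically simplified model of an NVMe submission/completion queue pair.
   The layout is illustrative only, not the NVMe spec's command format. */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define QUEUE_DEPTH 64

struct sq_entry {          /* stand-in for a submission queue entry */
    uint8_t  opcode;       /* 0x02 = NVM read, 0x01 = NVM write */
    uint16_t command_id;   /* echoed back in the matching completion */
    uint64_t lba;
    uint32_t nblocks;
};

struct cq_entry {          /* stand-in for a completion queue entry */
    uint16_t command_id;
    uint16_t status;
    uint8_t  phase;        /* flips each wrap so the host can spot new completions */
};

struct queue_pair {
    struct sq_entry sq[QUEUE_DEPTH];
    struct cq_entry cq[QUEUE_DEPTH];
    uint16_t sq_tail;      /* host produces commands here, then rings the SQ doorbell */
    uint16_t cq_head;      /* host consumes completions here, then rings the CQ doorbell */
};

/* Host side: enqueue a read command; on real hardware the final step would be
   an MMIO write of sq_tail to the submission queue doorbell register. */
static void submit_read(struct queue_pair *qp, uint16_t cid, uint64_t lba, uint32_t nblocks)
{
    struct sq_entry *e = &qp->sq[qp->sq_tail % QUEUE_DEPTH];
    memset(e, 0, sizeof(*e));
    e->opcode = 0x02;
    e->command_id = cid;
    e->lba = lba;
    e->nblocks = nblocks;
    qp->sq_tail++;
}

int main(void)
{
    struct queue_pair qp = {0};
    submit_read(&qp, 1, 0, 8);   /* read 8 blocks starting at LBA 0 */
    printf("submitted command %u, sq_tail now %u\n", 1u, qp.sq_tail);
    return 0;
}
```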

Unlike expensive vendor research tools, the implementation uses a low-cost FPGA design that enables engineers and researchers to quickly and inexpensively play with multiple parameters, such as request sizes and queue depths, to see how controller architecture and firmware changes affect real-world performance.
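
To picture the host-side half of such an experiment, here is a minimal sketch that sweeps queue depth for random 4K reads against a Linux NVMe block device using liburing. The device path, region size, and iteration count are placeholders, and it must be linked against liburing:

```c
/* Queue-depth sweep for random 4K reads on an NVMe block device.
   Placeholders throughout; build with: gcc sweep.c -o sweep -luring */
#define _GNU_SOURCE
#include <liburing.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

#define BLOCK_SIZE    4096
#define IOS_PER_RUN   8192
#define REGION_BLOCKS 25600ULL   /* confine random offsets to the first 100 MB */

static double run_at_depth(int fd, unsigned depth)
{
    struct io_uring ring;
    if (io_uring_queue_init(depth, &ring, 0) < 0) { perror("io_uring_queue_init"); exit(1); }

    void *buf;
    if (posix_memalign(&buf, 4096, BLOCK_SIZE)) exit(1);   /* O_DIRECT needs aligned buffers */

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);

    unsigned submitted = 0, completed = 0;
    while (completed < IOS_PER_RUN) {
        /* keep the queue full up to the target depth */
        while (submitted - completed < depth && submitted < IOS_PER_RUN) {
            struct io_uring_sqe *sqe = io_uring_get_sqe(&ring);
            if (!sqe) break;
            off_t off = (off_t)(rand() % REGION_BLOCKS) * BLOCK_SIZE;
            io_uring_prep_read(sqe, fd, buf, BLOCK_SIZE, off);   /* data is discarded */
            submitted++;
        }
        io_uring_submit(&ring);

        struct io_uring_cqe *cqe;
        if (io_uring_wait_cqe(&ring, &cqe) == 0) {
            io_uring_cqe_seen(&ring, cqe);
            completed++;
        }
    }

    clock_gettime(CLOCK_MONOTONIC, &t1);
    double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;

    free(buf);
    io_uring_queue_exit(&ring);
    return IOS_PER_RUN / secs;
}

int main(int argc, char **argv)
{
    const char *dev = argc > 1 ? argv[1] : "/dev/nvme0n1";   /* placeholder device */
    int fd = open(dev, O_RDONLY | O_DIRECT);
    if (fd < 0) { perror("open"); return 1; }

    for (unsigned depth = 1; depth <= 256; depth *= 2)
        printf("queue depth %3u -> %8.0f IOPS\n", depth, run_at_depth(fd, depth));

    close(fd);
    return 0;
}
```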

This means that for any given engineering budget, designers will be able to run many more experiments across different design tradeoffs, to bring you the most effective storage controllers possible.

The take

This is how the future is invented. Imagine a storage controller and media that’s almost as fast as the fastest DRAM, as some of the NVRAM startups are promising, with a PCIe v7 or v8 link capable of handling more than 500 GB/s. That blurs the line between storage and memory.

We could then dump the entire overhead of virtual memory, with its context switches and virtual-to-physical address translation, in favor of a completely flat address space. There would be no logical difference between “memory” and “storage”. All accesses would be direct to the data. And the data, my friend, is why we do all this.
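
The closest thing we have today is persistent-memory DAX mapping: map a file on persistent media into the address space and touch it with ordinary loads and stores, with no read()/write() calls or page-cache copies in the data path. A minimal sketch, assuming a DAX-capable mount and a pre-existing file of at least 1 MiB at the (made-up) path below:

```c
/* Storage addressed like memory: map persistent media and use plain loads/stores.
   The path is an assumption; the file must already exist and be >= 1 MiB. */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    const char *path = "/mnt/pmem/flat.dat";   /* example DAX-backed file */
    size_t len = 1 << 20;                      /* 1 MiB region */

    int fd = open(path, O_RDWR);
    if (fd < 0) { perror("open"); return 1; }

    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) { perror("mmap"); return 1; }

    /* An ordinary store: on a DAX mapping this reaches the media without a block I/O request. */
    strcpy(p, "storage addressed like memory");

    /* Make it durable; msync() is the portable call (libpmem offers finer-grained flushes). */
    if (msync(p, len, MS_SYNC) < 0) perror("msync");

    printf("read back: %s\n", p);
    munmap(p, len);
    close(fd);
    return 0;
}
```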

IBM pioneered the concept of a flat address space decades ago in the System/38. If you were an Apple engineer looking at future system architectures, you would definitely be researching this. Especially since Intel would be slow to support it.

Comments welcome. Have you ever programmed a System/38? Thoughts?