PCIe SSD vs all-flash for enterprise storage

by Jack @ UNIXPlus October 09, 2015

We run the rule over PCIe SSD and assess its potential as a server-side alternative to all-flash storage, as cache in conjunction with shared storage, and in hyperscale environments.


Solid-state storage, like rotating media, can slot into an enterprise computing architecture in a variety of ways.

And while much recent interest has focused on all-flash arrays, there are other key locations where flash-based storage can play a big role in accelerating application performance – and especially hyperscale computing performance.

Key among these is server-side flash, installed in a PCI Express (PCIe) slot where it can work as local disk storage or as cache for a server. The former replaces at least some of the requirement for all-flash array-style shared storage, while the latter can augment network storage or the server's main memory.

PCIe SSD has several advantages when used as storage local to the server. Not only is it rather cheaper than the same volume of all-flash array capacity, but PCIe eliminates the host bus adapter (HBA) or drive controller and its latency, plus the overhead from the network connection.

This can bring latency down from milliseconds to microseconds. At the same time, all the other benefits solid-state storage has over spinning disk are there, such as lower power consumption, heat generation and vibration, greater robustness and higher density.

Most SSDs today are based on flash but, to be accurate, we should speak of non-volatile memory (NVM) in general. That is because there are other memory types besides flash in use and in development that promise even better performance.

The problem with putting SSD in the server is the same problem you have with any direct-attached storage (DAS).

Software-defined storage

Unless you run specialist disk-sharing software – for example, software-defined storage tools that abstract local storage and pool it network-wide – local storage is available only to its host server. But, as we will see, there are emerging use-cases such as hyperscale computing, where local SSD can actually be more useful and cost-effective than shared storage in the network.

So, the big advantage of all-flash arrays and other forms of shared storage is that they are available to any connected system. Unallocated space can be shared, as can data on shared volumes. The former is a win because DAS is typically over-provisioned to allow for future growth and, of course, we typically over-specify by a considerable margin to avoid the pain of having to migrate or rebuild an overloaded server in the future.

Combined with technologies such as thin provisioning, where a logical volume takes up only the physical space on disk that its data requires and not its full provisioned space, shared storage has allowed considerable economies to be achieved. Also, once you add appropriate locking mechanisms, servers and applications can share access to the same data, yielding even more advantage. 
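As a rough illustration of the thin-provisioning arithmetic, the sketch below compares the physical space a shared pool would need under thick and thin provisioning. The volume sizes are made-up figures and the function names are hypothetical, chosen only for this example.

```python
# Hypothetical volumes: (provisioned GB, GB actually written).
# The numbers are illustrative only, not measurements.
volumes = [(500, 120), (1000, 300), (250, 40)]

def thick_usage_gb(vols):
    """Thick provisioning reserves the full provisioned size of every volume."""
    return sum(provisioned for provisioned, _ in vols)

def thin_usage_gb(vols):
    """Thin provisioning consumes only the space the data actually occupies."""
    return sum(written for _, written in vols)

thick = thick_usage_gb(volumes)
thin = thin_usage_gb(volumes)
print(f"thick: {thick} GB, thin: {thin} GB, saved: {thick - thin} GB")
```

With these assumed figures, the pool holds 460GB of data against 1,750GB of provisioned capacity, and the difference stays available to other servers.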

Server-side SSD on a PCIe board can still be useful in this environment as cache, of course. Caching pre-loads the most frequently accessed data, with the master copy still stored on, and synchronised back to, shared storage in the network. This can be a good way to boost the performance of specific applications. 

However, the possibility of a cache miss – where the required data is not cached and must be retrieved normally over the network – means that the resulting latency improvement is not consistent or guaranteed. 
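To see why cache misses matter so much, consider a simple two-level model in which every read is served either from the local cache or from shared storage over the network. The sketch below is a minimal illustration under that assumption; the latency figures (100 microseconds for a cache hit, 2,000 microseconds for a network read) are placeholders, not measurements.

```python
def expected_latency_us(hit_rate, cache_us, remote_us):
    """Average read latency in microseconds for a given cache hit rate.

    Simple two-level model: a hit costs cache_us, a miss costs remote_us.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * cache_us + (1.0 - hit_rate) * remote_us

# Assumed figures for illustration: 100 us local hit, 2,000 us network read.
for hit_rate in (0.99, 0.90, 0.50):
    avg = expected_latency_us(hit_rate, 100, 2000)
    print(f"hit rate {hit_rate:.0%}: about {avg:.0f} us on average")
```

Under these assumptions, a fall in hit rate from 99% to 90% more than doubles the average read latency, which is why an application with hard-to-predict access patterns may see little benefit.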

This is made worse by the fact that some applications have data access patterns that are difficult to identify and cache accurately, making cache misses more likely. There are also issues involved in maintaining coherency between local cache and the remote master data, and questions over the suitability for caching of the large files that store entire virtual machines. 

More and more interest is therefore turning towards the idea of using server-side PCIe SSD as local working storage, helped both by the falling cost of flash in particular and by two key developments in IT. 

The first is the creation of specifications to implement enterprise-grade server-side SSD, and the second is the evolution of new computing paradigms that make shared working storage more of a liability than an advantage.

Relevant standard

One relevant standard is obviously PCIe itself, which is currently at generation 3.0, with generation 3.1 coming and 4.0 under development. Architecturally, PCIe is serial, with the ability to consolidate or bond multiple lanes into a single high-speed point-to-point connection.

As well as PC expansion boards, there are 2.5-inch SSDs that use four-lane PCIe over a U.2 (formerly SFF-8639) connector, while two-lane PCIe is also part of the SATA Express (SATAe) specification that is intended to succeed SATA 3.0.
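The effect of lane count on bandwidth can be worked out from the PCIe 3.0 signalling rate (8 gigatransfers per second per lane) and its 128b/130b line encoding. The sketch below computes the theoretical one-direction figure only; real devices deliver less once protocol overhead is included, and the function name is a hypothetical one used for illustration.

```python
GT_PER_S = 8e9          # PCIe 3.0 signalling rate: 8 gigatransfers/s per lane
ENCODING = 128 / 130    # 128b/130b line encoding used by PCIe 3.0

def pcie3_bandwidth_mb_s(lanes):
    """Theoretical one-direction bandwidth in MB/s, before protocol overhead."""
    bits_per_s = GT_PER_S * ENCODING * lanes
    return bits_per_s / 8 / 1e6

# Two lanes (SATAe), four lanes (U.2) and an eight-lane expansion board.
for lanes in (2, 4, 8):
    print(f"x{lanes}: about {pcie3_bandwidth_mb_s(lanes):.0f} MB/s")
```

That works out at roughly 985MB/s per lane, so a four-lane U.2 device has a ceiling of a little under 4GB/s in each direction.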

In addition, the Intel/Apple Thunderbolt interface incorporates four-lane PCIe, while there are specifications to extend PCIe outside the system box, turning it into a kind of short-range storage network but without the latency of a SAN.

Perhaps more important, in some ways, is NVM Express (NVMe), which is a specification for accessing solid-state storage attached over PCIe (whether via an expansion slot, U.2 or SATAe). It allows the host to take full advantage of the parallelism and low latency of a PCIe-connected SSD, where formerly these devices required proprietary software drivers.

In effect, where PCIe cuts out the storage controller latency, NVMe removes most of the remaining software latency. NVMe means the operating system needs only one standard driver to support any NVMe SSD – and SSD developers do not need to create their own drivers, which removes the scope for the compatibility issues proprietary drivers can bring. It is also optimised for solid-state storage, unlike superficially similar technologies developed in simpler and slower days, such as AHCI.

(Not all server chipsets will support PCIe 3.0, though, so you need to check. You also need to ensure your server and your storage product support NVMe and the right level of cabling.)

Hyperscale computing

Then, building on that underlying evolution in the server platform comes the paradigm shift toward hyperscale computing.

Legacy applications were built on the assumption that high reliability would be baked into the infrastructure, but the same is not true of the modern scale-out design models popularised by the likes of Google and Facebook. Here, the hyperscale concept comprises independent yet connected nodes where redundancy is at the level of entire server/storage nodes rather than components in the server and shared storage architecture.

So where legacy applications require shared storage in case they need to fail over, and are therefore better suited to an all-flash array-type deployment, modern scale-out applications can have thousands of clustered nodes. Each node is typically based on off-the-shelf hardware and provides compute, storage and networking resources.

The nodes are then clustered together and managed as a single entity, replicating among themselves for availability via distributed storage platforms such as Apache Cassandra, Ceph and Hadoop – you could even call the result a redundant array of inexpensive servers.
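The idea of node-level redundancy can be sketched in a few lines. The example below is a deliberately simplified placement scheme, not the algorithm used by Cassandra, Ceph or Hadoop: it hashes a key to choose a starting node and keeps copies on the next nodes in the list. The node names, the key and the function name are all hypothetical.

```python
import hashlib

def replica_nodes(key, nodes, copies=3):
    """Pick `copies` distinct nodes for a key: hash to a start node, then take its neighbours."""
    if copies > len(nodes):
        raise ValueError("not enough nodes for the requested number of copies")
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    start = int.from_bytes(digest[:8], "big") % len(nodes)
    return [nodes[(start + i) % len(nodes)] for i in range(copies)]

nodes = [f"node{i:02d}" for i in range(8)]   # hypothetical eight-node cluster
placement = replica_nodes("customer:1234", nodes)
print(placement)

# If one node fails, the key is still readable from the surviving replicas.
failed = placement[0]
survivors = [n for n in placement if n != failed]
print(f"{failed} down, still served by {survivors}")
```

Because each key lives on several whole nodes, losing one server's local SSD costs a replica rather than the data, which is what removes the need for shared storage underneath.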

All of a sudden, the advantage of having local storage with super-low latency becomes greater than the disadvantage of that storage not being shared. 

