Flash storage for servers is getting faster and smarter
on Dec 17, 2019
The following article is based on an interview with Thomas Arenz of Samsung Semiconductor Europe conducted by the German data storage magazine Speicherguide.de. Since 2003, Speicherguide.de has been producing comprehensive market and product overviews and reporting on strategic trends to offer reader-friendly information based on sound technical knowledge.
The coming decade will see major advances in flash-based storage media in server and data center environments. To find out more about the latest trends, and the innovations that are driving these changes, we spoke to Thomas Arenz, Director of Marcom & Strategic Business Development at Samsung Semiconductor Europe.
What kinds of new technology can professional data storage users expect to see in the coming decade?
Arenz: PCIe Gen4 without a doubt. We have 19 different models in the PM1733 and PM1735 SSD series which use this technology and support AMD Epyc 7002 processors. Using the full bandwidth of four PCIe lanes offers sequential read speeds of up to 8 GB/s and up to 1.5 million random IOPS (input/output operations per second). That’s significantly more IOPS than the previous generation, and almost twice the sequential bandwidth.
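As a rough cross-check of the bandwidth figures above, the theoretical ceiling of a PCIe link can be estimated from the per-lane transfer rate and the line encoding. The sketch below assumes the published per-lane rates of 8 GT/s (Gen3) and 16 GT/s (Gen4) with 128b/130b encoding, and ignores packet and protocol overhead, so real drives will land somewhat below these numbers. The function name is illustrative only.

```python
# Theoretical one-direction PCIe link bandwidth, before packet/protocol overhead.
GT_PER_LANE = {3: 8.0, 4: 16.0}  # giga-transfers per second per lane
ENCODING = 128 / 130             # 128b/130b line encoding (Gen3 and Gen4)


def link_bandwidth_gb_s(gen: int, lanes: int) -> float:
    """Return the raw link bandwidth in GB/s (decimal) for one direction."""
    bits_per_second_per_lane = GT_PER_LANE[gen] * ENCODING  # in Gbit/s
    return bits_per_second_per_lane / 8 * lanes             # bits -> bytes


gen3_x4 = link_bandwidth_gb_s(3, 4)
gen4_x4 = link_bandwidth_gb_s(4, 4)

print(f"Gen3 x4: {gen3_x4:.2f} GB/s")
print(f"Gen4 x4: {gen4_x4:.2f} GB/s")
print(f"Gen4/Gen3 link ratio: {gen4_x4 / gen3_x4:.1f}x")
```

Under these assumptions a four-lane Gen4 link tops out just below 8 GB/s, which is why the quoted sequential read speed is described as using the full bandwidth of the link.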
In addition to the hardware - which is available in U.2 and HHHL form factors and in capacities from just under 1 TB all the way up to 30 TB - we have also optimized the software. These series thus include three next-level innovations: fail-in-place, virtualization and machine learning technology.
Flash Storage is Getting More Intelligent
What exactly does “fail-in-place” mean?
Arenz: Fail-in-place is a really important milestone because it means that even if an error occurs at chip level, the SSD as a whole will still be accessible. When you consider that a 30 TB SSD has 512 NAND chips, the advantage of fail-in-place becomes pretty obvious - instead of having to hot swap the SSD and deal with all the knock-on effects, you can rely on internal error correction systems which ensure that the SSD remains stable and continues to perform at a high level.
Our SSD virtualization technology makes it possible to subdivide an individual SSD into up to 64 units, providing multiple independent virtual workspaces. This will allow cloud storage providers, for example, to extend their services to a greater number of users without the need to utilize additional physical resources. The technology also enables SSDs to take over the task of single-root I/O virtualization (SR-IOV), which is typically carried out by server CPUs. This makes it possible to provide the same level of service while using fewer server CPUs and SSDs, thus reducing the overall server footprint.
Our V-NAND machine learning technology helps in accurately predicting cell statuses and detecting anomalies at this level through big data analytics. This technology is increasingly important in assuring data reliability, as increasing SSD speeds and rapid voltage pulses are creating new challenges when it comes to reading and verifying data. With it, an SSD with 4-bit NAND, which requires much more precise cell control than 3-bit NAND, can deliver the reliability, speeds and capacities needed for use in server and data center storage systems.
You mentioned that flash storage is not only getting faster but also smarter. What exactly does that mean?
Arenz: Together with our partners, we have spent a lot of time ensuring that our standard SSDs include “smart” components that minimize data traffic between the SSD and the CPU.
This focus is a response to a growing problem facing our current storage architectures: over the last 20 years, the capacity of our storage solutions has grown about eight times faster than the available bandwidth of the interconnect to the CPU. That’s a massive gap, and it’s continuing to grow despite technological innovations like the aforementioned PCIe Gen4. As a result, the available bandwidth per TB is continuously decreasing.
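To make the shrinking bandwidth per TB concrete, here is a small back-of-the-envelope sketch. It takes the 8 GB/s link speed and the 1 TB and 30 TB capacity points mentioned earlier as illustrative inputs, and assumes decimal units (1 TB = 1000 GB). The helper names are hypothetical.

```python
def bandwidth_per_tb(link_gb_s: float, capacity_tb: float) -> float:
    """Interconnect bandwidth available per TB of stored data (GB/s per TB)."""
    return link_gb_s / capacity_tb


def full_scan_seconds(link_gb_s: float, capacity_tb: float) -> float:
    """Time to move the whole drive's contents across the link once."""
    return capacity_tb * 1000 / link_gb_s  # 1 TB = 1000 GB (decimal)


LINK_GB_S = 8.0  # illustrative: the sequential read figure quoted above

for capacity_tb in (1, 30):
    per_tb = bandwidth_per_tb(LINK_GB_S, capacity_tb)
    scan_min = full_scan_seconds(LINK_GB_S, capacity_tb) / 60
    print(f"{capacity_tb:>2} TB: {per_tb:.2f} GB/s per TB, "
          f"full scan in {scan_min:.1f} min")
```

With the link speed held constant, a 30-fold increase in capacity means each TB gets one thirtieth of the bandwidth, and a full scan that took about two minutes now takes over an hour.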
Smart SSDs: Proving the Concept
So what can be done to resolve the dilemma of storage capacity outstripping transfer bandwidth?
Arenz: One of the recent approaches to tackling this issue is called computational storage; we refer to this internally as “Smart SSD”. This solution requires rethinking conventional storage architecture. In this model, instead of providing the CPU with the raw data needed to perform a task, the SSD provides the ready-made results for the original request.
All the innovations attached to buzzwords like AI, deep learning and big data are essentially based around analyzing huge amounts of data from sources like sensors or databases, detecting patterns and applying them to new contexts. If we want to do this in real time, we need to massively accelerate the journey from task to solution. I suppose you could say that Smart SSDs create a modern division of labor: instead of relying on the CPU or the accelerators to perform all the calculations, this technology allows the SSD to take on a significant portion of the operations itself, thus greatly reducing the load on the server CPUs.
Over the last 18 months, we have added a Xilinx FPGA to our mainstream NVMe SSD and provided this prototype to a few select customers for use in I/O-intensive workloads. One of these proofs of concept was for one of the largest hedge funds in the US. They need to be able to analyze share prices multiple times per second on a constant basis, which involves data volumes of half an exabyte. We integrated their algorithm into the Smart SSD, which more than tripled the data throughput. In a high-frequency trading environment, microseconds can be the difference between a million-dollar loss and a million-dollar gain.
In another proof of concept using an industry-standard application based on MariaDB, it was determined that around 80% of the requested data had to be filtered out by the CPU. We offloaded this task to the Smart SSD, resulting in an increase in processing rates roughly equivalent to what we experienced in the hedge fund proof of concept.
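The filtering offload described above can be illustrated with a toy model that counts how many bytes cross the interconnect when a filter runs on the host versus inside the drive. This is only a sketch under stated assumptions: the record layout, the 16-byte row size, and both function names are hypothetical, and the "device" is simulated in ordinary Python rather than on an FPGA.

```python
from typing import Callable, List, Tuple

Row = Tuple[int, int]  # (id, value): hypothetical fixed-size record
ROW_BYTES = 16         # assumed size of one record on the wire


def host_side_filter(rows: List[Row], keep: Callable[[Row], bool]):
    """Conventional path: every row is shipped to the CPU, which then filters."""
    transferred = len(rows) * ROW_BYTES
    result = [r for r in rows if keep(r)]
    return result, transferred


def device_side_filter(rows: List[Row], keep: Callable[[Row], bool]):
    """Computational-storage path: the filter runs next to the data,
    and only matching rows cross the interconnect."""
    result = [r for r in rows if keep(r)]
    transferred = len(result) * ROW_BYTES
    return result, transferred


rows = [(i, i % 10) for i in range(1000)]
keep = lambda r: r[1] < 2  # matches 20% of rows, so 80% are filtered out

host_result, host_bytes = host_side_filter(rows, keep)
dev_result, dev_bytes = device_side_filter(rows, keep)

print(f"host-side filter:   {host_bytes} bytes transferred")
print(f"device-side filter: {dev_bytes} bytes transferred")
```

Both paths return the same rows, but when 80% of the data is discarded, the device-side path moves one fifth of the bytes. The model says nothing about the actual speed-up, which depends on how fast the drive-side logic can evaluate the filter.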
Could you give us another example?
Arenz: Our partnership with an airline (based on Spark) provided another important insight: the improvement in performance brought about by Smart SSDs is scalable to a significant degree. The more Smart SSDs were used in this particular environment, the further the query execution time decreased. IoT-connected planes generate petabytes of data on each flight. When they land, the airline then has to analyze this data in a kind of edge data center and turn it around very quickly before the plane is approved for its next flight. If it isn’t approved and the airline has to cancel the flight at short notice, it will incur significant costs due to rebooking, reimbursing and accommodating passengers. We were able to reduce the time required to analyze the data by 90% by using multiple Smart SSDs. This demonstrates the fantastic potential of Smart SSDs when it comes to dealing with the interconnect bottleneck.
Following these impressive results, we began the process of evolving the Smart SSD prototype into a fully-fledged product. We are currently working with our partners to equip the Smart SSD with the connectors needed for compatibility with commonly used databases and application frameworks, such as those used for video encoding or storage offloading. From a server perspective, a Smart SSD looks like a standard Xilinx FPGA and an NVMe storage device. This means that customers can develop their own solutions for use with Smart SSDs.
Form Factors and Standards
Which form factors do you use?
Arenz: Our prototype was designed for use as an add-in card with a discrete FPGA, PCIe switch and SSD controller. We are currently working on integrating the FPGA into the SSD controller and will offer the finished product in the U.2 form factor. Initial samples should be available over the next few months.
SNIA recently approved standard specifications for key value storage. What is the significance of that?
Arenz: The main purpose of a KV SSD is to transfer the workload involved in accessing data from the CPU directly to the SSD, which can massively speed up the process. However, unlike with the Smart SSD, operations are not performed inside the SSD; instead, the data itself is handled in a completely different manner.
A standard SSD splits incoming data into blocks and has to erase blocks before they can be written again. This means that every write operation involves two addressing steps: logical block addressing (LBA) and physical block addressing (PBA). This process is also responsible for the aging of SSDs. Read operations do not cause any issues, but write operations cause material fatigue.
Key value technology, on the other hand, facilitates the object-based storage of data without the need to subdivide it into blocks first. Instead, a key is assigned to the data object (value) that is going to be stored, independent of its size. This allows data blocks to be addressed independently of storage locations and offers immense scaling potential.
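The difference between the two addressing models can be sketched with two toy in-memory devices. This is purely illustrative: the class and method names are hypothetical and are not the SNIA Key Value Storage API or any Samsung library. The point is only to show the bookkeeping the host has to do when objects are layered on fixed-size blocks, compared with handing a key and a value straight to the device.

```python
BLOCK_SIZE = 4096  # assumed logical block size


class ToyBlockDevice:
    """Block interface: the host addresses fixed-size blocks by LBA."""

    def __init__(self):
        self.blocks = {}

    def write_block(self, lba: int, data: bytes) -> None:
        assert len(data) == BLOCK_SIZE
        self.blocks[lba] = data

    def read_block(self, lba: int) -> bytes:
        return self.blocks[lba]


class HostObjectsOverBlocks:
    """Host-side layer that splits objects into blocks and tracks their LBAs."""

    def __init__(self, dev: ToyBlockDevice):
        self.dev = dev
        self.index = {}  # key -> (object length, list of LBAs)
        self.next_lba = 0

    def put(self, key: bytes, value: bytes) -> None:
        lbas = []
        for off in range(0, len(value), BLOCK_SIZE):
            # Pad the last chunk so every write is exactly one block.
            chunk = value[off:off + BLOCK_SIZE].ljust(BLOCK_SIZE, b"\0")
            self.dev.write_block(self.next_lba, chunk)
            lbas.append(self.next_lba)
            self.next_lba += 1
        self.index[key] = (len(value), lbas)

    def get(self, key: bytes) -> bytes:
        length, lbas = self.index[key]
        return b"".join(self.dev.read_block(l) for l in lbas)[:length]


class ToyKVDevice:
    """Key-value interface: the device stores the object under its key."""

    def __init__(self):
        self._store = {}

    def store(self, key: bytes, value: bytes) -> None:
        self._store[key] = value

    def retrieve(self, key: bytes) -> bytes:
        return self._store[key]


value = b"x" * 10_000
layered = HostObjectsOverBlocks(ToyBlockDevice())
layered.put(b"sensor-42", value)

kv = ToyKVDevice()
kv.store(b"sensor-42", value)
```

In the layered version, a 10,000-byte object occupies three padded 4 KiB blocks and the host must maintain the key-to-LBA index itself. In the key-value version that mapping lives inside the device, and the host issues a single store call regardless of the object's size.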
In addition to speeding up access to data, skipping the LBA and PBA steps also increases the service life of the SSD. By not splitting data into blocks or writing to lots of little blocks, the technology minimizes the so-called Write Amplification Factor (WAF). In practice, it’s common to see different amounts of Total Bytes Written (TBW) because the WAF depends on the flash technology and on the intelligence of the controller. Generally speaking, the lower the WAF, the longer the service life of the SSD will be.
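The relationship between WAF and service life can be written down in a few lines. The sketch below uses the common definition of WAF as bytes written to the NAND divided by bytes written by the host, and a simplified endurance model in which the flash tolerates a fixed number of program/erase cycles. The capacity, cycle count and WAF values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def write_amplification(nand_bytes_written: float,
                        host_bytes_written: float) -> float:
    """WAF: how many bytes hit the flash for each byte the host writes."""
    return nand_bytes_written / host_bytes_written


def host_writes_until_wearout(capacity_tb: float,
                              pe_cycles: int,
                              waf: float) -> float:
    """Approximate host-writable volume (in TB) before the flash wears out.

    Simplified model: total NAND write budget = capacity * P/E cycles,
    and each host byte consumes `waf` bytes of that budget.
    """
    return capacity_tb * pe_cycles / waf


# Hypothetical drive: 1 TB of flash rated for 3000 P/E cycles.
high_waf = host_writes_until_wearout(1.0, 3000, waf=3.0)
low_waf = host_writes_until_wearout(1.0, 3000, waf=1.5)

print(f"WAF 3.0: about {high_waf:.0f} TB of host writes")
print(f"WAF 1.5: about {low_waf:.0f} TB of host writes")
```

In this model, halving the WAF doubles the amount of data the host can write over the drive's life, which is the effect the paragraph above attributes to avoiding many small block writes.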
What are the prerequisites for using key value technology?
Arenz: In addition to a key value SSD with appropriate firmware, you also need an API to connect to the application. Samsung has been working on this for quite some time and provides open-source software, libraries and drivers, in addition to a Ceph backend for experimentation. We also support the SNIA’s standard API for key value drives (“Key Value Storage API 1.0”).
Status of Mission Peak and NVMe-oF
Last year we took an in-depth look at NVMe over Fabrics. Could you give us an update on this?
Arenz: Last year we introduced a complete reference design named Mission Peak and made a system available for proofs of concept at an e-shelter datacenter together with our ecosystem partners. The NVMe-oF technology used by the system is based on RDMA, which provides high performance but is also technically demanding to implement.
You can now also use TCP as the fabric, since the most recent version of the specification (NVMe-oF 1.1) includes support for TCP as a transport.
By way of a reminder: NVMe is a modern storage protocol that provides all the advantages of modern SSDs, particularly in terms of data throughput and I/O speeds. NVMe-oF helps to close the performance gap between DAS and SAN by making it possible to use SSDs in a remote host system with similar performance levels to those experienced in the local host system.
What makes NVMe/TCP particularly attractive is that it is possible to use the existing routers, switches, adapters and other standard equipment. This usually simplifies the installation process, minimizes downtime and reduces implementation costs significantly in comparison to other technologies like RDMA or Fibre Channel, because there is no need to acquire new equipment. That’s why we think this will become an increasingly popular approach in the storage market.
In summary, the purpose of all of these technologies is to massively increase the speed at which data can be accessed and processed?
Arenz: That’s exactly right. We are always hard at work developing innovations to ensure that established professional flash storage offerings can keep up with the ever-expanding demands of our times.