In Oct 2021, Dell announced that it was “ALL IN” on NVMe/TCP (NVMe over TCP/IP) for Dell storage products. Not by chance, VMware had a similar announcement stating support for […]
End-users continue to demand high application performance to perform their tasks in today’s rapidly changing business environment. Organizations initially addressed this issue by implementing NVMe SSD-based storage to mitigate storage performance as a potential bottleneck. However, scaling became an issue, as these performance gains required NVMe to be implemented as direct-attached storage (DAS) for every server host. Because NVMe acted as DAS, organizations potentially had unused storage that could not be easily shared by multiple hosts.
To achieve scalability of NVMe storage while maintaining high levels of performance, organizations began implementing NVMe over Fabrics (NVMe-oF) technology. NVMe-oF extends the NVMe protocol to transport protocols such as Fibre Channel (FC) and remote direct memory access (RDMA), facilitating faster connectivity between storage and servers. However, implementing NVMe-oF using such protocols can increase both network configuration costs and complexity.
PowerStore was built from the ground up for modern storage media. In the initial release, NVMe was used within the appliance, while over the network, the transport mechanism was standard SCSI protocol.
PowerStoreOS 2.0 extended NVMe benefits across the network with NVMe-over-Fibre Channel support. PowerStoreOS 2.1 now adds NVMe-over-TCP support.
NVMe over TCP is much more efficient, parallel and scalable than SCSI. It’s designed to make an external networked array feel like direct attached storage to hosts.
This support allows almost all the benefits of NVMe/FC, while radically simplifying the networking requirements.
The best thing about this release is how easy it is to activate the new capability.
- It’s a simple software update; you won’t need any additional PowerStore hardware.
- Any hosts that support NVMe can now use that protocol to talk to PowerStore, while SCSI hosts continue as if nothing changed. PowerStore and the network itself can handle both protocols.
In Oct 2021, VMware announced support for the NVMe/TCP storage protocol with the release of VMware vSphere 7 Update 3.
NVMe/TCP allows vSphere customers a fast, simple and cost-effective way to get the most out of their existing storage investments. With this release, VMware continues to enable support for the latest technology innovations in hardware, with VMware vSphere
So, first of all, what is NVMe?
NVMe is a non-uniform memory access (NUMA) optimized and highly scalable storage protocol that connects a host to a solid-state memory subsystem. Because NVMe is NUMA-optimized, multiple CPU cores can share ownership of queues and of individual read and write commands. As a result, NVMe SSDs can scatter and gather commands and process them in an optimized order to lower command completion latency.
The NVMe controller is the target manager on the array. A controller is associated with one or several NVMe namespaces and provides an access path between the ESXi host and the namespaces in the storage array. To access the controller, the host can use two mechanisms: controller discovery and controller connection.
In the NVMe storage array, a namespace is a storage volume backed by some quantity of non-volatile memory. In the context of ESXi, the namespace is analogous to a storage device, or LUN. After your ESXi host discovers the NVMe namespace, a flash device that represents the namespace appears on the list of storage devices in the vSphere Client. You can use the device to create a VMFS datastore and store virtual machines.
Namespace ID (NSID)
The namespace ID is used as an identifier for a namespace from any given controller. Once again, this equates to a Logical Unit Number (LUN) with SCSI-based storage.
Asymmetric Namespace Access (ANA)
Asymmetric Namespace Access (ANA) is an NVMe standard that lets the target inform an initiator of the optimal way to access a given namespace.
Multipathing is the establishment of multiple physical routes between a server and the storage device that supports it. This is done to avoid a single point of failure and achieve continuous operations.
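To make the relationship between ANA and multipathing concrete, here is a minimal Python sketch of how an initiator might narrow down its candidate paths using the ANA state reported for each one. The path names and the `usable_paths` helper are hypothetical, and only three of the ANA states are modeled; this illustrates the idea and is not how ESXi implements it.

```python
from dataclasses import dataclass
from enum import Enum


class AnaState(Enum):
    """Subset of ANA states, as reported by the target per path."""
    OPTIMIZED = "optimized"
    NON_OPTIMIZED = "non-optimized"
    INACCESSIBLE = "inaccessible"


@dataclass(frozen=True)
class Path:
    name: str            # hypothetical label, e.g. array node + port
    ana_state: AnaState


def usable_paths(paths):
    """Prefer optimized paths; fall back to non-optimized ones if none exist."""
    optimized = [p for p in paths if p.ana_state is AnaState.OPTIMIZED]
    if optimized:
        return optimized
    return [p for p in paths if p.ana_state is AnaState.NON_OPTIMIZED]


paths = [
    Path("nodeA-port0", AnaState.OPTIMIZED),
    Path("nodeA-port1", AnaState.OPTIMIZED),
    Path("nodeB-port0", AnaState.NON_OPTIMIZED),
    Path("nodeB-port1", AnaState.INACCESSIBLE),
]

print([p.name for p in usable_paths(paths)])
```

With all four paths present, only the two optimized paths are returned. If the optimized paths disappear, the sketch falls back to the remaining non-optimized path, which is the failover behavior multipathing is meant to provide.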
NVMe Qualified Name (NQN)
The NVMe Qualified Name (NQN) uniquely identifies the remote target or initiator. It is similar to an iSCSI Qualified Name (IQN).
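As a rough illustration of what an NQN looks like, the following Python sketch checks a string against the general `nqn.yyyy-mm.reverse-domain:identifier` shape and the 223-byte length limit. The regular expression is a simplified assumption and not a full implementation of the NVMe naming rules; the example names are made up.

```python
import re

# Simplified shape: "nqn." + year-month + "." + reverse domain + ":" + identifier
_NQN_PATTERN = re.compile(r"nqn\.\d{4}-(0[1-9]|1[0-2])\.[A-Za-z0-9.-]+:.+")
MAX_NQN_BYTES = 223


def looks_like_nqn(value: str) -> bool:
    """Return True if the value has the general shape of an NQN."""
    if len(value.encode("utf-8")) > MAX_NQN_BYTES:
        return False
    return _NQN_PATTERN.fullmatch(value) is not None


print(looks_like_nqn("nqn.2014-08.com.example:esxi-host-01"))  # NQN-shaped
print(looks_like_nqn("iqn.1998-01.com.example:esxi-host-01"))  # iSCSI IQN, not an NQN
```

A check like this can catch the common mistake of pasting an iSCSI IQN where the array expects an NQN when adding initiators.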
Users can set up an NVMe host through:
- PowerStore Manager
- PowerStore CLI
Set up Fibre Channel Front End ports (zoned)
Create Host or Host Groups and select NVMe as protocol
- Add initiator(s)
- The NQN is the NVMe identifier, similar to the IQN for iSCSI
Create Volume/Thin Clone or Volume Groups
- Not supported with vVol
Map the NVMe Host to the Volume(s)
So now let’s see how to configure it with PowerStore
In the vSphere Client, navigate to the ESXi host and click the Configure tab. Under Storage, click Storage Adapters, and then click the Add Software Adapter icon.
Select the adapter type as required, NVMe over TCP adapter, and select the TCP network adapter (vmnic) from the drop-down menu.
To create a new NVMe host in PowerStore, click the Add Host button and select the NVMe initiator type.
Next, you need to create a volume and map it to the host you’ve just created.
Upon volume creation, NVMe unique IDs are allocated (in addition to the SCSI WWN):
- NSID – Volume ID from the host perspective
- NGUID – NVMe Global Unique Identifier (equivalent to the SCSI WWN)
- Both IDs assigned internally by the system
With NVMe there’s no need for rescans. NVMe has asynchronous events, which allow the array to instantly inform a host of new storage, a resize, and so on.
In vSphere, you will automatically see the new volume appear as a namespace. The figure below shows the available details on devices, paths, namespaces, and controllers.
You can also see more information about the volume under the standard storage device pane.
Now, the volume can be used to create a VMFS datastore.
Pluggable Storage Architecture (PSA)
To manage storage multipathing, ESX/ESXi uses a special VMkernel layer, the Pluggable Storage Architecture (PSA). The PSA is an open modular framework that coordinates the simultaneous operation of multiple multipathing plugins (MPPs). PSA is a collection of VMkernel APIs that allow third-party hardware vendors to insert code directly into the ESX storage I/O path. This allows third-party software developers to design their own load balancing techniques and failover mechanisms for a particular storage array. The PSA coordinates the operation of the NMP and any additional third-party MPP.
Native Multipathing Plugin (NMP)
The VMkernel multipathing plugin that ESX/ESXi provides, by default, is the VMware Native Multipathing Plugin (NMP). The NMP is an extensible module that manages subplugins. There are two types of NMP subplugins: Storage Array Type Plugins (SATPs), and Path Selection Plugins (PSPs). SATPs and PSPs can be built-in and provided by VMware, or can be provided by a third party.
By default, the native multipathing plug-in (NMP) supplied by VMware is used to manage I/O for non-NVMe devices. NMP is not supported for NVMe. VMware uses a different plug-in called the High-Performance Plug-in, or HPP.
High-Performance Plug-in (HPP)
VMware provides the High-Performance Plug-in (HPP) to improve the performance of storage devices on your ESXi host.
The HPP replaces the NMP for high-speed devices, such as NVMe. The HPP is the default plug-in that claims NVMe targets. Within ESXi, the NVMe targets are emulated and presented to users as SCSI targets. The HPP supports only active/active and implicit ALUA targets.
Path Selection Schemes (PSS)
The High-Performance Plug-in uses Path Selection Schemes (PSS) to manage multipathing just as NMP uses PSP. HPP offers the following PSS options:
Fixed – Use a specific preferred path
LB-RR (Load Balance – Round Robin) – This is the default PSS. After 1000 I/Os or 10485760 bytes (whichever comes first), the path is switched in a round-robin fashion. This is the equivalent of NMP PSP RR.
LB-IOPS (Load Balance – IOPs) – When 1000 I/Os (or a set number) are reached, VMware switches to the path that has the fewest outstanding I/Os.
LB-BYTES (Load Balance – Bytes) – When 10 MB (or a set number) are reached, VMware switches to the path that has the fewest outstanding bytes.
LB-Latency (Load Balance – Latency) – This is the same mechanism available with NMP: VMware evaluates the paths and selects the one with the lowest latency.
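To illustrate how the LB-RR thresholds shape I/O distribution, here is a small Python simulation. It is a simplified model of the behavior described above (rotate to the next path once an I/O count or byte count is reached), not VMware's implementation; the class, the path names, and the 8 KiB I/O size are assumptions made for the example.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class RoundRobinSelector:
    """Toy model of LB-RR: rotate paths after iops_limit I/Os or bytes_limit bytes."""
    paths: list
    iops_limit: int = 1000
    bytes_limit: int = 10_485_760
    _index: int = 0
    _ios: int = 0
    _bytes: int = 0

    def select(self, io_size: int) -> str:
        """Return the path used for this I/O, then rotate if a limit was reached."""
        path = self.paths[self._index]
        self._ios += 1
        self._bytes += io_size
        if self._ios >= self.iops_limit or self._bytes >= self.bytes_limit:
            self._index = (self._index + 1) % len(self.paths)
            self._ios = 0
            self._bytes = 0
        return path


def distribution(selector, n_ios, io_size=8192):
    """Count how many of n_ios equally sized I/Os land on each path."""
    return dict(Counter(selector.select(io_size) for _ in range(n_ios)))


two_paths = ["path-A", "path-B"]
print(distribution(RoundRobinSelector(two_paths), 1500))
# {'path-A': 1000, 'path-B': 500}
print(distribution(RoundRobinSelector(two_paths, iops_limit=1), 1500))
# {'path-A': 750, 'path-B': 750}
```

In this model, a short burst of 1500 I/Os stays mostly on one path with the default limit of 1000, while a limit of 1 alternates paths on every I/O and spreads the burst evenly.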
HPP can be managed in the vSphere Client as well as with esxcli commands. We recommend using the default LB-RR policy and changing the IOPS per path from 1000 to 1.
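As a sketch of what that change might look like from the command line, the Python helper below assembles an `esxcli storage hpp device set` command string for a given device. The option names (`--device`, `--pss`, `--iops`) reflect my understanding of the esxcli HPP namespace and should be verified against the esxcli reference for your ESXi build; the device identifier is a made-up placeholder, and the helper only builds the string without running anything.

```python
import shlex


def hpp_set_command(device: str, iops: int = 1, pss: str = "LB-RR") -> str:
    """Build (but do not run) an esxcli command that sets the HPP path selection scheme.

    Assumption: the option names below match the esxcli version on the host.
    """
    args = [
        "esxcli", "storage", "hpp", "device", "set",
        "--device", device,
        "--pss", pss,
        "--iops", str(iops),
    ]
    return shlex.join(args)


# Hypothetical device identifier, for illustration only.
print(hpp_set_command("eui.0123456789abcdef"))
```

Generating the command per device makes it easier to review the exact settings before applying them across several hosts.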
Below you can see a demo showing how it all works:
Whether or not you have immediate plans for NVMe-over-TCP, this capability shows the flexibility and investment protection PowerStore provides. As Dell continues to lead the adoption of new technologies, your roadmap is in good hands with the adaptable PowerStore platform.
A guest post by Tomer Nahumi