Wide Distribution of Data for Massive Performance

Flex widely distributes data across all storage resources in the cluster, which eliminates the architectural bottlenecks of other IP-based storage systems. With VxFlex OS, ALL of the IOPS and bandwidth of the underlying infrastructure are realized by a perfectly balanced system with NO hot spots.

Massive Availability and Resiliency

Flex has a self-healing architecture that employs many-to-many, fine-grained rebuilds, which is very different from the serial rebuilds seen in most storage products. When hardware fails, data is automatically rebuilt using all of the other resources in the cluster. This enables a six-nines (99.9999%) availability profile while using commodity x86 hardware. Flex can rebuild an entire node with 24 drives in mere minutes – a fraction of the time it takes to rebuild a single drive on a traditional array.
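
To get a feel for why many-to-many rebuilds finish in minutes rather than hours, here is a rough back-of-the-envelope sketch. The node count, drive sizes, and per-node rebuild throughput below are illustrative assumptions, not measured figures:

```python
# Rough, illustrative comparison of serial vs. many-to-many rebuild times.
# Every number below is an assumption made for the sake of the example.

failed_node_capacity_tb = 24 * 2     # 24 drives x 2 TB each (assumed)
per_node_rebuild_gbps = 1.0          # usable rebuild bandwidth per node, in GB/s (assumed)
surviving_nodes = 63                 # cluster size minus the failed node (assumed)

capacity_gb = failed_node_capacity_tb * 1000

# Traditional serial rebuild: a single spare target absorbs the whole rebuild.
serial_hours = capacity_gb / per_node_rebuild_gbps / 3600

# Many-to-many rebuild: every surviving node rebuilds a small slice in parallel.
parallel_minutes = capacity_gb / (per_node_rebuild_gbps * surviving_nodes) / 60

print(f"serial rebuild:       ~{serial_hours:.1f} hours")
print(f"many-to-many rebuild: ~{parallel_minutes:.1f} minutes")
```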

Built-In Multipathing

Flex automatically distributes traffic across all available resources; every server can be a target as well as an initiator. This means that as you add or remove nodes in the cluster, multipathing is dynamically updated on the fly. Inherent, dynamic, built-in multipathing.

Dell EMC VxFlex Ready Nodes converge storage and compute resources into a single-layer architecture, aggregating capacity and performance with simplified management, and scaling to over a thousand nodes. VxFlex OS provides maximum flexibility and choice: it supports high-performance databases and applications at extreme scale (from as few as 3 nodes to over 1,000 per cluster) and supports multiple operating systems, hypervisors, and media types. You can build the infrastructure that best supports your applications – choose your hardware vendors or use what you already have in house.

Two-Layer

  • Similar structure to a traditional SAN
  • Supports organizations that prefer separation between storage and application teams
  • Allows storage to be scaled separately from the application servers
  • New 100Gb switch for the aggregation layer

Or

HCI

  • Provides maximum flexibility and is easier to administer
  • Servers host both applications and storage
  • A modern approach to managing the IT data center
  • Maintenance of servers impacts both storage and compute

In a Storage-only Architecture:

The SDC exposes VxFlex OS shared block volumes to the application.

  • Access to the OS partition may still be done “regularly”
  • The VxFlex OS data client (SDC) is a block device driver

The SDS owns local storage that contributes to the VxFlex OS storage pool

  • The VxFlex OS data server (SDS) is a daemon/service

In a two-layer deployment, the SDC and SDS run on different nodes and can grow independently of each other.

We have just released VxFlex OS 3.0, which includes many frequently requested features. Here’s what’s new.

 

Fine Granularity (FG) Layout

Fine Granularity (FG) layout – a new, additional storage pool layout that uses a much finer storage allocation unit of 4KB. This is in addition to the existing Medium Granularity (MG) storage pools, which use 1MB allocation units.

The 4KB allocation unit allows better efficiency for thin-provisioned volumes and snapshots. For customers who use snapshots frequently, this layout delivers significant capacity savings.

Note: FG storage pools require nodes with NVDIMMs and SSD/NVMe media.
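
To see why the 4KB allocation unit matters so much for snapshots, consider a quick worked example. The scenario (a single small overwrite against a snapshotted volume) is mine, but the allocation-unit sizes come straight from the two layouts:

```python
# Illustrative capacity math for one 4KB overwrite of a snapshotted block.

write_size_kb = 4
mg_alloc_kb = 1024    # Medium Granularity allocation unit (1MB)
fg_alloc_kb = 4       # Fine Granularity allocation unit (4KB)

# New space that must be allocated to preserve the snapshot for this one write:
mg_cost_kb = mg_alloc_kb     # the whole 1MB unit is reallocated
fg_cost_kb = fg_alloc_kb     # only the 4KB unit is reallocated

print(f"MG: a {write_size_kb}KB overwrite consumes {mg_cost_kb}KB of new capacity")
print(f"FG: the same overwrite consumes {fg_cost_kb}KB "
      f"({mg_cost_kb // fg_cost_kb}x less to manage)")
```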

Inline compression – the Fine Granularity layout enables a data compression capability that can reduce the total amount of physical data that needs to be written to SSD media. Compression saves storage capacity by storing data blocks in the most efficient manner; combined with the VxFlex OS snapshot capabilities, it can easily support petabytes of functional application data.

Persistent checksum – in addition to the in-flight checksum already available, persistent checksums add another layer of data integrity for the data and metadata of FG storage pools. Background scanners monitor the integrity of the data and metadata over time.


VxFlex OS 3.0 introduces an ADDITIONAL, more space-efficient storage layout.

Existing – Medium Granularity (MG) Layout

  • Supports either thick or thin-provisioned volumes
  • Space allocation occurs in 1MB units
  • No attempt is made to reduce the size of user data written to disk (except for all-zero data)

Newly Added – Fine Granularity (FG) Layout

  • Supports only thin-provisioned, “zero-padded” volumes
  • Space allocation occurs in finer 4KB units
  • When possible, reduces the actual size of user data stored on disk
  • Includes persistent checksumming for data integrity

A Storage Pool (SP) is either of the FG or the MG type. FG storage pools can live alongside MG pools on a given SDS, and volumes can be migrated across the two layouts (MG volumes must be zero-padded). FG pools require SSD/NVMe media and NVDIMMs for acceleration.

Inline Compression

A few words on compression algorithms in general:

  • What’s desirable is something off-the-shelf, standard, and field-proven
  • Lempel-Ziv (LZ) based compression (recurring patterns within preset windows), with or without Huffman coding
  • Very good compression ratios for text (>80%) and databases (~70%, ranging from 60% to 80%)

The algorithm used in VxFlex OS 3.0 is C-EDRS, a Dell EMC proprietary algorithm (similar to LZ4) – the same algorithm that XtremIO uses. It offers a good balance of compression ratio and performance (light on CPU).

Compressibility is tested inline (on the fly), because some data is not a good candidate for compression (e.g. videos, images, compressed DB rows). CPU cycles are invested up front: if a block cannot be reduced by more than 20%, it is considered incompressible and stored uncompressed, so no CPU cycles are wasted later decompressing read IOs.
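
C-EDRS itself is proprietary, but the “compress only when it is worth it” logic described above is easy to sketch with a standard-library compressor. This is purely illustrative – it uses Python’s zlib as a stand-in for C-EDRS, and the 20% threshold is the one quoted above:

```python
import os
import zlib

MIN_SAVINGS = 0.20   # store compressed only if we save more than 20% (the rule above)

def store_block(block):
    """Return (payload, is_compressed) for a data block.

    Mirrors the inline compressibility test described above: invest CPU up
    front, and if the block does not shrink by more than 20%, treat it as
    incompressible and store it as-is so reads never pay a pointless
    decompression cost.
    """
    compressed = zlib.compress(block, 1)       # fast LZ-style stand-in for C-EDRS
    savings = 1 - len(compressed) / len(block)
    if savings > MIN_SAVINGS:
        return compressed, True
    return block, False

# Text-like data compresses well; random data (a stand-in for already-compressed
# content such as video or images) does not.
samples = {
    "text-like": (b"some repetitive application data " * 128)[:4096],
    "random":    os.urandom(4096),
}
for name, block in samples.items():
    payload, is_compressed = store_block(block)
    print(f"{name:10s} -> stored {'compressed' if is_compressed else 'uncompressed'} "
          f"({len(payload)} of {len(block)} bytes)")
```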


Persistent Checksum
Logical checksum (protects the uncompressed data)

  • All data written to FG pools, with or without compression, always has a logical checksum calculated by default (this cannot be changed)
  • If the data is compressed, the checksum of the original (uncompressed) data is calculated before compression and stored on disk together with the data
  • If the data is not compressed (by user selection or because it is incompressible), the checksum is calculated and stored elsewhere

Physical checksum (protects the compressed data)

  • Protects the integrity of the log itself
  • Computed for the log, after entries are placed into it
  • Computed over the compressed data and the embedded metadata, and therefore protects the integrity of the compressed data

Metadata checksum

  • Maintaining the integrity of the metadata itself is crucial, because metadata cannot be reconstructed from the (compressed) user data
  • Disk-level metadata: there is a checksum for each physical row of the metadata that lives on each disk
  • If an error is detected in the metadata, nothing on that disk is trusted and a rebuild is triggered
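
Here is a minimal sketch of the layered checksum idea, using CRC32 as a stand-in for whatever checksum function the product actually uses internally; the log-entry layout (a length-prefixed payload) is also an assumption made just for the illustration:

```python
import zlib

def crc(data):
    return zlib.crc32(data) & 0xFFFFFFFF

# --- Logical checksum: computed over the ORIGINAL (uncompressed) user block ---
user_block = b"application data" * 256           # 4096 bytes of user data
logical_checksum = crc(user_block)               # always calculated, compressed or not

# --- Physical checksum: computed over the log entry as written to disk, i.e.
#     the compressed payload plus its embedded metadata ---
compressed_payload = zlib.compress(user_block)
entry_metadata = len(compressed_payload).to_bytes(4, "big")   # toy metadata: payload length
physical_checksum = crc(entry_metadata + compressed_payload)

# --- Verification on read: check the log first, then the user data ---
assert crc(entry_metadata + compressed_payload) == physical_checksum   # log intact?
restored = zlib.decompress(compressed_payload)
assert crc(restored) == logical_checksum                               # user data intact?
print("both checksum layers verified")
```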

Background Device Scanner


  • Scans devices in the system for errors
  • You can enable/disable the Background Device Scanner and reset its counters
  • MG storage pools: disabled by default (no changes, same as 2.x)
  • FG storage pools: enabled by default
  • Mode: device_only – report and rebuild on error
  • Cycles through each SSD and compares the physical checksums against the data in the logs and metadata
  • The GUI controls/limits the disk IO it consumes – the default is 1024 KB/s per device
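
Conceptually the scanner is a throttled read loop that recomputes checksums and compares them with the stored ones. The sketch below shows that idea against an ordinary file with a hypothetical stored-checksum map; the real scanner works against the FG logs and metadata, and triggers a rebuild on mismatch:

```python
import time
import zlib

SCAN_RATE_KBPS = 1024      # default per-device limit exposed in the GUI (1024 KB/s)
CHUNK_KB = 64              # read size per iteration (an assumption for this sketch)

def background_scan(device_image, stored_checksums):
    """Throttled scan of a device image, returning offsets whose checksum mismatches.

    `stored_checksums` maps chunk offsets to expected CRC32 values -- a stand-in
    for the persistent physical checksums kept in the FG logs and metadata.
    """
    bad_offsets = []
    with open(device_image, "rb") as dev:
        offset = 0
        while True:
            chunk = dev.read(CHUNK_KB * 1024)
            if not chunk:
                break
            if zlib.crc32(chunk) & 0xFFFFFFFF != stored_checksums.get(offset):
                bad_offsets.append(offset)          # the real scanner reports and rebuilds
            offset += len(chunk)
            time.sleep(CHUNK_KB / SCAN_RATE_KBPS)   # stay under the per-device rate cap
    return bad_offsets
```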

Choose the Best Layout for Each Workload

You can choose, for each workload, the layout that works best for you (a simple selection sketch follows this list):

  • MG – workloads with high performance requirements and sensitivity; all of our usual use cases still apply
  • FG compressed – a great choice for most cases where the data is compressible and space efficiency is more valuable than raw IO, especially when there is snapshot usage or snapshot requirements; DevOps and test/dev environments
  • FG non-compressed – data that isn’t compressible (e.g. OS- or application-level encryption) but where you use lots of snapshots and want the space savings; read-intensive workloads with >4K IOs; workloads that need persistent checksums
  • And if you change your mind… you can migrate “live” to another layout
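
The guidance above boils down to a few questions per workload. Here is a toy helper that encodes it – the attribute names are mine and this is not a product API, just a compact restatement of the guidance:

```python
def choose_layout(latency_sensitive, compressible, heavy_snapshot_use,
                  needs_persistent_checksum):
    """Toy decision helper mirroring the guidance above (not a product API)."""
    if latency_sensitive and not (heavy_snapshot_use or needs_persistent_checksum):
        return "MG"                  # raw IO matters most; all the usual use cases
    if compressible:
        return "FG compressed"       # space efficiency, snapshot-heavy, dev/test
    return "FG non-compressed"       # encrypted/incompressible data, snapshots, checksums

print(choose_layout(latency_sensitive=False, compressible=True,
                    heavy_snapshot_use=True, needs_persistent_checksum=False))
# -> FG compressed
```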

Non-Disruptive Volume Migration

Prior to v3.0, a volume was bound to a Storage Pool at creation, and this binding could not be changed later. Volume migration addresses various use cases:

  • Migrating volumes between different performance tiers
  • Migrating volumes to a different Storage Pool or Protection Domain, driven by multi-tenancy needs
  • Extracting volumes from a deprecated Storage Pool or Protection Domain in order to shrink a system
  • Changing a volume’s personality (thin -> thick, FG -> MG)

Migration from one Storage Pool to another is done at V-Tree granularity – the volume and all of its snapshots are migrated together. It is non-disruptive to ongoing IO (hiccups are minimized), it is supported across Storage Pools within the same Protection Domain as well as across Protection Domains, and it supports older v2.x SDCs.
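
Migration is typically driven from the GUI or CLI, but if you script it against the REST Gateway the flow looks roughly like the sketch below. The host name, IDs, and especially the action and parameter names are placeholders for illustration – consult the v3.0 REST API reference for the exact names:

```python
import requests

GATEWAY = "https://gateway.example.local"   # hypothetical Gateway address
VOLUME_ID = "abc123"                        # hypothetical volume ID
DEST_POOL_ID = "def456"                     # hypothetical destination Storage Pool ID

# 1. Authenticate against the Gateway (the standard VxFlex OS REST login flow:
#    GET /api/login with basic auth returns a session token).
token = requests.get(f"{GATEWAY}/api/login",
                     auth=("admin", "password"), verify=False).json()

# 2. Request migration of the volume's V-Tree to the destination pool. The action
#    and parameter names below are placeholders -- check the v3.0 REST API reference.
resp = requests.post(
    f"{GATEWAY}/api/instances/Volume::{VOLUME_ID}/action/migrateVTree",
    json={"destSPId": DEST_POOL_ID},
    auth=("admin", token),
    verify=False,
)
resp.raise_for_status()
print("V-Tree migration started: the volume and all of its snapshots will move together")
```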

Snapshots and Snapshot Policy Management

Volume Snapshots

  • Prior to v3.0, there was a limit of 32 items in a volume tree (V-Tree): 31 snapshots plus the root volume
  • In v3.0, this is increased to 128 (for both FG and MG layouts): 127 snapshots plus the root volume
  • Snapshots in FG are more space efficient and perform better than MG snapshots – with 4KB block management there is 256x less data to manage with each subsequent write
  • Remove Ancestor Snapshot: the ability to remove the parent of a snapshot while keeping the snapshot in the system, in essence merging the parent into a child snapshot
  • Policy-managed snapshots: up to 60 policy-managed snapshots per root volume (taken from the 128 total available)

Snapshot Policy

The policy is hierarchical. For example, we might want to keep an hourly backup for the most recent day, a daily backup for a week, and a weekly backup for 4 weeks.

  • Implementation is simplified – set the basic cadence and the number of snapshots to keep at each level
  • The number of snapshots to keep at a level is the same as the rate at which snapshots are elevated to the next level
  • Max retention levels = 6
  • Max snapshots retained in a policy = 60
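
The retention rule is easiest to see with numbers. Using the example above (hourly for a day, daily for a week, weekly for four weeks), here is my reading of how many snapshots such a policy retains:

```python
# Snapshot policy math: a base cadence plus the number of snapshots kept per level.
# Per the rule above, the count kept at a level is also the rate at which snapshots
# are elevated to the next level.
base_cadence_minutes = 60          # take a snapshot every hour
kept_per_level = [24, 7, 4]        # keep 24 hourly, 7 daily, 4 weekly snapshots

assert len(kept_per_level) <= 6    # max retention levels
total_retained = sum(kept_per_level)
assert total_retained <= 60        # max snapshots retained per policy

# Every 24th hourly snapshot becomes a daily, and every 7th daily becomes a weekly.
print(f"snapshots retained by this policy: {total_retained}")   # -> 35
```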



Auto Snapshot Group

The snapshots of an auto snapshot group:

  • Are consistent (unless mapped)
  • Share the same expiration and should be deleted at the same time (unless locked)

An Auto Snapshot Group is NOT a Snapshot Consistency Group:

  • A single snapshot CG may contain several auto snapshot groups
  • Snapshot CGs are not aware of locked snapshots; therefore, deleting snapshot CGs that contain auto snapshots is blocked

An auto snapshot is a snapshot that was created by a policy. The auto snapshot group is an internal object that is not exposed to the user; it is hinted at when snapshots are grouped by date/time in several views.

Updated System Limits

  • Maximum SDS capacity has increased from 96TB to 128TB
  • Maximum number of SDSs per Protection Domain has increased from 128 to 256
  • Maximum snapshot count per source volume is now 128 (FG and MG)
  • Fine Granularity (FG): maximum allowed compression ratio of 10x, and maximum allowed overprovisioning of 10x (compared to 5x for MG thin-provisioned)
  • SDC limits in vSphere: 6.5 & 6.7 – up to 512 mapped volumes; 6.0 – up to 256 mapped volumes

Updates and Changes

Added in VxFlex OS 3.0:

  • Support for native 4Kn sector drives (logical sector size and physical sector size fields)

Removed:

  • Windows back-end (SDS, MDM) support – no support for Windows HCI, only compute nodes (SDC)
  • AMS Compute feature support

Security updates:

  • Java: newer versions of Java 8 builds are enabled
  • The CentOS 7.5 SVM passed NESSUS security scanning and STIG

Other changes:

  • A new mapping is required to define the disk type in a storage pool (SSD/HDD); attempts to add disks that are not of the correct type will be blocked, and existing SPs will need to be assigned a media type post-upgrade
  • Transition to a CentOS 7.5 Storage VM – new 3.0 SVM installations only; replacing an SVM from SLES 11.3/12.2 with CentOS 7.5 will be available after 3.0 (in a 3.0.x release)

OS Patching

There is a new ability in the IM/GW to run a user-provided script on a VxFlex OS system as part of an orchestrated, non-disruptive process (like an NDU), usually intended for OS patching. It is supported on RHEL and SLES. Using this feature involves two main steps:

1. The user manually copies the script files to each VxFlex OS host, following these prerequisites:

  • The main script must be named patch_script (its return code is checked for 0 at the end of execution)
  • The verification script must be named verification_script (its return code is checked for 0 at the end of execution)
  • The scripts must be copied to the ~/lia/bin folder and given execution permissions
  • The return codes are saved in the LIA log, and an error is returned to the GW if needed

2. The user executes the scripts from the IM/GW UI

It is the customer’s responsibility to test patch_script and verification_script before running the process via the GW.
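
As a concrete, hypothetical example of what such a pair of scripts might look like on a RHEL node – the actual patch commands and health checks are entirely up to you; the only contract is the file names, the ~/lia/bin location, execute permissions, and a return code of 0 on success:

```python
#!/usr/bin/env python3
# ~/lia/bin/patch_script -- hypothetical example; any executable that exits 0 on success works.
import subprocess
import sys

# Apply OS updates (RHEL example; a SLES host would use zypper instead).
result = subprocess.run(["yum", "-y", "update"])
sys.exit(result.returncode)   # LIA checks for a return code of 0 and logs the result
```

```python
#!/usr/bin/env python3
# ~/lia/bin/verification_script -- hypothetical example; exit 0 only if the host looks healthy.
import subprocess
import sys

SDS_SERVICE = "sds"   # placeholder unit name -- substitute whatever your SDS service is called

# Trivial health checks: the kernel is reachable and the SDS service is active again.
checks = [["uname", "-r"], ["systemctl", "is-active", SDS_SERVICE]]
sys.exit(0 if all(subprocess.run(c).returncode == 0 for c in checks) else 1)
```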


Execution steps:

1. Log in to the IM/Gateway web view
2. Select the “Maintain” tab
3. Enter the MDM IP and credentials
4. Under “System Logs & Analysis”, select “Run Script On Host”

OS Patching (continued)

1. The “Run Script on Host” window opens.
2. Select the scope for running the script(s):

  • Entire System – all VxFlex OS nodes. By default the script runs on the first host’s PD, then moves to the second, and so on; selecting “In parallel on different Protection Domains” runs the patch_script in parallel on all PDs.
  • Protection Domain – a specific PD
  • Fault Set – a specific fault set
  • SDS – a single node

Note: PDs that don’t have MDMs will go first, and cluster nodes will be last.

3. Define the “Running configuration” parameters:

  • Stop process on script failure
  • Script Timeout – how long to wait for the script to finish
  • Verification Script – whether to run verification_script after patch_script has run
  • Post script action – whether to reboot the host after patch_script has executed. If reboot is selected, the order is: patch_script runs -> reboot -> verification_script runs


Press “Run script on Hosts” and the Validate phase will start. This phase sends a request to each host’s LIA to verify the existence of the patch_script and verification_script (if selected) files under ~/lia/bin.

Press the “Start execution phase” button. The IM performs several verifications: it checks that there is no failed capacity, checks the spare capacity, and checks that the cluster is in a valid state and that no other SDS is in maintenance mode. Then, for each SDS:

  • Enter the SDS into maintenance mode and run the patch_script
  • Reboot the host (if required)
  • Run the verification script (if required)
  • Exit maintenance mode

Once the operation has completed successfully, the patch_script file is deleted and a backup of it is created in the same folder with the name backup_patch_script.
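
Put together, the per-node flow the IM/GW orchestrates looks roughly like the outline below. The function names are placeholder stubs standing in for the Gateway’s internal operations; they are shown only to make the sequence explicit:

```python
# Placeholder stubs standing in for the Gateway's internal operations.
def enter_maintenance_mode(sds): print(f"{sds}: entering maintenance mode")
def run_on_host(sds, script):    print(f"{sds}: running {script}")
def reboot_host(sds):            print(f"{sds}: rebooting")
def exit_maintenance_mode(sds):  print(f"{sds}: exiting maintenance mode")

def patch_one_sds(sds, reboot_required=True, run_verification=True):
    """Per-node flow of the execution phase, as described above (illustrative only)."""
    enter_maintenance_mode(sds)
    run_on_host(sds, "~/lia/bin/patch_script")          # LIA checks for return code 0
    if reboot_required:
        reboot_host(sds)
    if run_verification:
        run_on_host(sds, "~/lia/bin/verification_script")
    exit_maintenance_mode(sds)

# One node at a time (or one PD at a time when running PDs in parallel).
for node in ["sds-01", "sds-02", "sds-03"]:
    patch_one_sds(node)
```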


Multi LDAP Servers Support

  • Deploy the GW as usual; post-deployment, use the FOSGWTool to add LDAP servers to the GW login authority
  • Up to 8 LDAP servers are supported
  • The same method used for configuring a single LDAP server is used to support multiple servers (with a change in command syntax)
  • The new capability to add multiple LDAP servers is in FOSGWTool; details will be available in the LDAP tech note
  • The log files are the same as the Gateway logs (operations.log, scaleio.log and scaleio-trace.log)
  • The usual errors are related to command syntax or networking misconfiguration

GW Support for LDAP in LIA

  • New ability to deploy a system with the LIA user already configured to use LDAP
  • During deployment, you can configure the first LDAP server
  • In the query phase, communication with the LDAP server is performed to validate the information, so the installation will not proceed until the LDAP check has passed
  • Post-upgrade ability to switch LIA from a local user to an LDAP user
  • Ability to add/remove up to 8 LDAP servers; a check is done during add or remove, and any error will fail the operation

You can download VxFlex OS v3.0 from the link below (click the screenshot)


And you can download the documentation from here


You can also watch a video below, showing the new compression and snapshot functionality.