
Dell Technologies PowerScale OneFS S3 Overview

We have just released PowerScale, which you can read about here.

One of the new features of OneFS 9.0 is native S3 support.

Dell EMC™ PowerScale™ OneFS™ supports the native S3 protocol. With this capability, clients can access OneFS cluster file-based data as objects. PowerScale combines the benefits of traditional NAS storage and emerging object storage to provide an enhanced data-lake capability and a cost-effective unstructured data storage solution. OneFS S3 is designed as a first-class protocol, including features for bucket and object operations, security implementation, and management interfaces.
This document introduces how the S3 API is implemented in OneFS to provide high-performance data access. It also covers the benefits of OneFS S3, applicable use cases, and the details of bucket and object operations, authentication, and the authorization process.

Introduction
Data is now a new form of capital. It provides the insights that facilitate your organization's digital transformation, and 80% of that information is represented as unstructured data. Organizations in every industry generate exponentially larger volumes of unstructured data than ever before, from edge to core to cloud. The way of storing and managing unstructured data is evolving, with the goal of unlocking the value of your data by using both traditional network-attached storage (NAS) systems and emerging object storage.
Many organizations run their critical applications on traditional NAS storage and develop new, modern applications using object storage. This heterogeneous mix of storage platforms for unstructured data presents several challenges:
• Applications running on different storage platforms may need to access the same set of data. In this case, data migration is required between NAS storage and object storage, and the extra copy of the data consumes additional storage capacity.
• There is an inability to access object storage through NAS protocols such as NFS and SMB. Many object-storage systems provide NAS protocol access through a gateway-like architecture, which does not perform adequately when combined with the object storage stack.
• In contrast to NAS storage, object storage is not intended for transactional data where operations per second and latency are critical.

Designed to address these challenges, PowerScale is ideal for demanding enterprise file workloads and can store, manage, and protect unstructured data with efficiency and massive scalability. It supports multiple NAS protocols natively for applications. Starting with OneFS version 9.0, it also provides the capability of accessing data natively through the Amazon Simple Storage Service (Amazon S3) application programming interface (API). PowerScale implements the S3 API as a first-class protocol along with other NAS protocols on top of its distributed OneFS file system. Whether your application is based on traditional NAS storage or emerging object storage, it can access data in a single scale-out storage platform through NAS protocols or the S3 API as needed. This document introduces how the S3 API is implemented in OneFS to provide high-performance data access.


OneFS S3 overview
The Amazon S3 API was originally developed as the data-access interface of Amazon S3. As applications were developed using the S3 API, it became a common standard for object storage. This document refers to the S3 API for object storage as the S3 protocol, which provides consistent nomenclature alongside the other NAS protocols of the OneFS file service.

Figure 1 shows the traditional scale-up NAS platform and the emerging object-storage architecture. The traditional scale-up NAS platform is only accessible through file protocols and is not easy to scale as the performance requirement increases. The object storage allows both file and object access, but the file access is achieved through a gateway, with either a software daemon or additional dedicated hardware. This limits file-access performance compared to a traditional NAS platform.


This is where PowerScale scale-out storage comes in. With the introduction of the OneFS S3 protocol, PowerScale combines the advantages of both platform types into a single storage system while providing performance for file and object access.
Starting with OneFS version 9.0, PowerScale OneFS supports the native S3 protocol. OneFS implements S3 as a first-class protocol along with other protocols, including NFS, SMB, and HDFS. The S3 protocol is implemented over HTTP and secure HTTP (HTTPS). Through OneFS S3, you can access file-based data stored on your OneFS cluster as objects. Since the S3 API is considered to be a protocol, content and metadata can be ingested using S3 and concurrently accessed through other protocols that are configured on the cluster.
Note: The OneFS S3 service is disabled by default. If the service is enabled, it only listens on port 9021 for HTTPS. Port 9020 for HTTP is disabled by default. These ports are configurable through the OneFS WebUI and CLI.
In OneFS 9.0, each S3 bucket is mapped to a directory under an access zone base path, each S3 object is mapped to a file, and the associated object prefix is mapped to directories. A directory for a bucket is created under the access zone base path by default. See the AWS S3 documentation regarding S3 bucket and object definitions.
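To make the mapping concrete, the following minimal sketch uses the AWS SDK for Python (boto3). The endpoint name, credentials, bucket, and key are all hypothetical placeholders, and it assumes the S3 service has been enabled and an access ID and secret key have already been created for a OneFS user.

import boto3

# Hypothetical cluster name and credentials; OneFS S3 listens on port 9021 for HTTPS by default.
s3 = boto3.client(
    "s3",
    endpoint_url="https://cluster.example.local:9021",
    aws_access_key_id="<access id>",
    aws_secret_access_key="<secret key>",
    verify=False,  # only needed if the cluster still uses a self-signed certificate
)

# Writing an object through S3 creates a regular file under the bucket's directory,
# and the prefix components become directories, so the same data can then be read
# over NFS, SMB, HDFS, or HTTP.
s3.put_object(Bucket="mybucket", Key="projects/report.pdf", Body=b"example content")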

OneFS 9.0 provides the following new S3 features:
• Create, list, update, and delete buckets
• Create, list, read, and delete objects
• Support for both AWS Signature Version 2 and Version 4
• Support for both path-style requests and virtual hosted-style requests
• Multipart upload of large content for better performance
• Access ID and secret key management through WebUI, CLI, and PAPI
• Bucket ACL and Object ACL for access control
• Access zone awareness for multitenancy

Use cases
By implementing the S3 protocol, OneFS enhances its data-lake capability by supporting both traditional NAS protocols and an object storage protocol. You can unify your file and object data access in a single storage namespace. The S3 protocol on a OneFS cluster provides the following benefits:
• Consolidate storage for applications regardless of data-access protocols
• Store data with the S3 protocol and then seamlessly access the data as files with SMB, NFS, HTTPS, FTP, and HDFS
• Store files with SMB, NFS, and other protocols and then access the files as objects through the S3 protocol
• Eliminate data migration when the same set of data is accessed through NAS protocols and the S3 protocol
• Multitenancy support for better storage-as-a-service abilities through S3
• Increased return on investment for the OneFS cluster by supporting object access
As Figure 2 shows, OneFS seamlessly interoperates between file-based and object-based data access in a single NAS platform for various workloads.

OneFS can now extend its use cases with the benefits of OneFS S3. The following list includes general use cases for OneFS S3:
Backup and archive: OneFS is an ideal target for S3-compatible data backup and archive software.
File service: Data access for files and data access for objects are easily consolidated in a single scale-out storage. This provides faster service than cloud and more cost-effective service than traditional NAS platforms.
Enhanced multiprotocol data access: The data in a OneFS cluster can be accessed as files or objects.

OneFS S3 implementation
OneFS implements the S3 protocol on top of the file-service engine like other protocols. Clients that connect to a OneFS cluster with S3 gain access to the single volume of the distributed OneFS file system and take advantage of the entire cluster’s performance. To work with OneFS S3, clients connect to the S3 service over HTTP or HTTPS and use standard REST calls such as PUT, GET, and POST to perform bucket and object operations.

By analogy with an SMB share, which is associated with a path, a OneFS S3 bucket is also created on a specific path within the access zone base path. OneFS S3 maps an object to a file and maps the object prefix to directories correspondingly. For example, assume a file is stored in OneFS with a full path of /ifs/data/docs/finance/sample.pdf. To access the file with S3, create a bucket bkt01 in OneFS and associate the bucket with the path /ifs/data/docs/. The object key /finance/sample.pdf is then used to represent the file.
OneFS supports two types of requests when resolving buckets and objects. See the Amazon S3 documentation on Virtual Hosting of Buckets for more details about the following:
• Virtual hosted-style requests: Specify a bucket in a request by using the HTTP Host header
• Path-style requests: Specify a bucket by using the first slash-delimited component of the Request-URI path.
With OneFS S3, you can access the file as an object by using GET and PUT operations with the following example URLs, which contain the SmartConnect zone name:
• Path-style requests: https://sc.example.local:9021/bkt01/finance/sample.pdf
• Virtual hosted-style requests: https://bkt01.sc.example.local:9021/finance/sample.pdf
The path-style request is available through both the SmartConnect zone name and IP address of a node.
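As a brief, hypothetical illustration of a path-style request, the following boto3 snippet (with the same placeholder credentials as the earlier sketch) retrieves the example object through the SmartConnect zone name; forcing path-style addressing is a botocore Config option.

import boto3
from botocore.client import Config

# Path-style request: the bucket appears in the URL path, i.e.
# https://sc.example.local:9021/bkt01/finance/sample.pdf
s3 = boto3.client(
    "s3",
    endpoint_url="https://sc.example.local:9021",
    aws_access_key_id="<access id>",
    aws_secret_access_key="<secret key>",
    config=Config(s3={"addressing_style": "path"}),
)

obj = s3.get_object(Bucket="bkt01", Key="finance/sample.pdf")
print(obj["ContentLength"])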

To use virtual hosted-style requests, the following configuration is required (a client-side sketch follows the steps):
• A SmartConnect zone name is required.
• The wildcard subdomain option must be enabled for the groupnet. This option is enabled by default.
# isi network groupnets modify <groupnet> --allow-wildcard-subdomains=true
• Configure the S3 base domain to your SmartConnect zone name using the WebUI or CLI.
# isi s3 settings zone modify --base-domain=<smartconnect> --zone=<name>
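Once the base domain is set and wildcard subdomains are allowed, a client can switch to virtual hosted-style addressing. The sketch below reuses the hypothetical credentials from the earlier examples and assumes that DNS resolves bkt01.sc.example.local to the SmartConnect zone.

import boto3
from botocore.client import Config

# Virtual hosted-style request: boto3 places the bucket in the Host header, i.e.
# Host: bkt01.sc.example.local
s3 = boto3.client(
    "s3",
    endpoint_url="https://sc.example.local:9021",
    aws_access_key_id="<access id>",
    aws_secret_access_key="<secret key>",
    config=Config(s3={"addressing_style": "virtual"}),
)

obj = s3.get_object(Bucket="bkt01", Key="finance/sample.pdf")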

Buckets
OneFS requires a bucket to map to a specific directory in an access zone. This directory is called the bucket path. If the bucket path is not specified, a default path is used, which is configurable at the access zone level through the WebUI or CLI. When creating a bucket, OneFS creates a directory with a prefix of .isi_s3_ under the bucket path and creates 16 subdirectories named 0 through 15 under the .isi_s3_ directory. An example of this name is .isi_s3_1_1000000010001. The 16 subdirectories are used to store temporary files for the PUT operation. OneFS automatically balances the temporary files between these directories for better performance. Figure 4 shows the process of putting an object into the OneFS cluster, which uses the temporary directory under the bucket.
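For completeness, here is a small, hypothetical example of creating a bucket over the S3 API itself. Because the CreateBucket call has no path argument, OneFS places the new bucket directory under the access zone's default bucket path; a specific bucket path can instead be assigned when creating the bucket through the WebUI or CLI.

# Reusing the hypothetical "s3" boto3 client from the earlier sketches.
s3.create_bucket(Bucket="bkt02")
# OneFS now has a directory for bkt02 under the zone's default bucket path, along with
# the .isi_s3_ directory that holds the 16 temporary-file subdirectories described above.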

Bucket naming rules
OneFS S3 bucket names comply with DNS naming conventions. The following rules apply when naming S3 buckets in OneFS (an illustrative check follows the list):
• Bucket names must be unique at the OneFS access zone level.
• Bucket names cannot be changed after the bucket is created.
• Bucket names can contain only lowercase letters (a-z), numbers (0-9), and dashes (-).
• Bucket names must start and end with a lowercase letter (a-z) or number (0-9).
• Bucket names must be 3 to 63 characters in length.
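As an unofficial illustration of these rules, a single regular expression can express them, assuming the start-and-end requirement applies to both the first and last characters:

import re

# Illustrative check of the OneFS bucket-naming rules above (not an official validator).
BUCKET_NAME_RE = re.compile(r"^[a-z0-9][a-z0-9-]{1,61}[a-z0-9]$")

def is_valid_onefs_bucket_name(name: str) -> bool:
    return bool(BUCKET_NAME_RE.fullmatch(name))

assert is_valid_onefs_bucket_name("bkt01")
assert not is_valid_onefs_bucket_name("Bkt01")    # uppercase letters are not allowed
assert not is_valid_onefs_bucket_name("ab")       # shorter than 3 characters
assert not is_valid_onefs_bucket_name("-bucket")  # must start with a letter or number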

Bucket operations
Table 1 shows the supported S3 bucket operations in OneFS 9.0. See the document Dell EMC PowerScale: OneFS S3 API Guide on Dell.com/StorageResources for details about each supported API; a short boto3 sketch follows the table.

Table 1. Supported S3 bucket operations
• CreateBucket: PUT operation to create a bucket. Anonymous requests are never allowed to create buckets. By creating the bucket, the authenticated user becomes the bucket owner.
• ListObjects: List objects in a bucket.
• ListObjectsV2: List objects in a bucket.
• GetBucketLocation: Returns the bucket location as an empty string.
• DeleteBucket: Delete the bucket.
• GetBucketAcl: Get the access control list (ACL) of a bucket.
• PutBucketAcl: Set the permissions on an existing bucket using ACLs.
• HeadBucket: Determine whether a bucket exists and whether you have permission to access it. The operation returns 200 OK if the bucket exists and you have permission to access it. Otherwise, the operation might return responses such as 404 Not Found or 403 Forbidden.
• ListBuckets: Get a list of all buckets owned by the authenticated user of the request.
• ListMultipartUploads: List in-progress multipart uploads. An in-progress multipart upload is a multipart upload that has been initiated using the Initiate Multipart Upload request but has not yet been completed or aborted.

Objects
The AWS S3 data model is a flat structure: you create a bucket, and the bucket stores objects. There is no hierarchy of subbuckets or subfolders. However, AWS S3 provides a logical hierarchy using object key name prefixes and delimiters to support a concept of folders. For example, instead of naming an object sample.jpg in the bucket named examplebucket, you can name it photos/samples/sample.jpg. The photos/samples/ portion becomes the object key name prefix, using the slash (/) as the delimiter.
OneFS maps objects to files and implicitly creates directories for the object key name prefixes. The OneFS file system requires the following rules for object naming (a short sketch follows the list):
• The object can only use ASCII or UTF-8 characters.
• Only the slash (/) is supported as an object-key-name delimiter. The slash cannot be part of the object key name; it is automatically treated as a delimiter.
• Objects cannot contain a prefix that conflicts with an existing object key name in its path. For example, creating both /document and /document/sample.docx within the same bucket is not allowed.
• You cannot use a period (.) or double period (..) as an object key name or as part of an object key name prefix. For example, you can create the object /.sample.jpg but not the object /./sample.jpg.
• You cannot use .snapshot as the object key name or part of an object key name prefix; this is reserved for OneFS SnapshotIQ.
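Here is a brief sketch of that prefix behavior, reusing the example key from the start of this section and the hypothetical boto3 client from the earlier examples.

# PutObject with a prefixed key: OneFS implicitly creates the photos/ and photos/samples/
# directories under the bucket path and stores sample.jpg as a regular file.
s3.put_object(Bucket="examplebucket", Key="photos/samples/sample.jpg", Body=b"example image bytes")

# Listing with a prefix and the slash delimiter surfaces the folder-like view.
resp = s3.list_objects_v2(Bucket="examplebucket", Prefix="photos/", Delimiter="/")
print(resp.get("CommonPrefixes"))  # for example: [{'Prefix': 'photos/samples/'}]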

The entire white paper, along with the Dell EMC PowerScale: OneFS S3 API Guide, can be downloaded from Dell.com/StorageResources.
