In the first post of the series, we covered Dell ObjectScale and how to install it; you can read about it here.

Data volumes continue to explode across the enterprise, driven by the proliferation of mobile devices, applications, and the growth of the Internet of Things. As the amount of data businesses generate continues to rise, so does the cost of storing and managing all this information. Traditional storage solutions have become too costly and complex to deploy and manage at the scale organizations require. Public cloud solutions – once heralded as the future of data storage – can introduce unforeseen costs and data latency issues, often making them unpredictable options for long-term storage.

To harness the power of growing volumes of data, ObjectScale provides a next-generation storage solution that can help to reduce costs, simplify management, accelerate access and extract the intelligence from data that organizations need to compete.

ObjectScale is the next evolution of object storage from Dell Technologies. It is a software-defined, scale-out, object storage platform. With ObjectScale, any organization can deliver cloud-scale storage services with the reliability and control of a private cloud infrastructure.

ObjectScale features a Kubernetes-native, containerized architecture built on the principles of microservices to promote efficiency, resiliency and flexibility. Each service is completely abstracted and independently scalable, with high availability and no single point of failure. ObjectScale is built on the proven Dell ECS codebase and has been re-platformed to utilize the native orchestration capabilities of Kubernetes: scheduling, auto-scaling, load-balancing, self-healing, and more.

ObjectScale is optimized for a broad range of use cases, including:

  • Secondary storage. ObjectScale is an excellent option for secondary or tiered storage, enabling organizations to move infrequently accessed data away from more expensive primary storage.
  • Modern applications. Designed for modern application development, management, and analytics, ObjectScale supports next-gen web, mobile and cloud applications.
  • Data Lake. ObjectScale establishes a data lake foundation for organizations of any size, maximizing the value of user data with powerful HDFS services that enable in-place analytics and reduce risk, resources and time-to-results.
  • Storage for Internet of Things. With no limits on the number of objects, the size of objects or metadata, ObjectScale is the ideal platform to store IoT data.
  • Global content repository. ObjectScale enables any organization to consolidate multiple storage systems into a single content repository that is globally accessible.
  • Geo-protected archive. ObjectScale can serve as a secure and affordable on-premises cloud for archival and long-term retention purposes.
  • Video surveillance evidence repository. ObjectScale makes for a low-cost landing area or secondary storage site for video surveillance data, which has a high-capacity footprint per file.

Using object storage for backup

Using object storage for backup has advantages over file and block storage. Objects are for storing and retrieving unstructured blobs of data as a whole, rather than as individual blocks. They may consist of image files, HTML pages, binaries, video, executables and user-generated content — mostly unstructured data unlikely ever to be changed.

In an object storage system, objects can reside on any number of servers, whether on premises or in the cloud. Instead of using a namespace and a directory structure, applications address objects by an ID, using a few simple HTTP API calls such as PUT, POST, GET and DELETE.
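As a rough illustration of that addressing model, the sketch below builds (but does not send) the HTTP requests an application might issue against an S3-compatible endpoint. The endpoint URL, bucket name and object keys are hypothetical placeholders, and the authentication headers that a real S3 request needs are left out.

```python
from urllib.request import Request

# Hypothetical endpoint and bucket; substitute your own object store values.
ENDPOINT = "https://objectscale.example.com"
BUCKET = "demo-bucket"


def object_url(key: str) -> str:
    """Objects are addressed by bucket plus key, not by a directory tree."""
    return f"{ENDPOINT}/{BUCKET}/{key}"


def build_put(key: str, body: bytes) -> Request:
    """Build a request that would store an object under the given key."""
    return Request(object_url(key), data=body, method="PUT")


def build_get(key: str) -> Request:
    """Build a request that would retrieve the whole object."""
    return Request(object_url(key), method="GET")


def build_delete(key: str) -> Request:
    """Build a request that would remove the object."""
    return Request(object_url(key), method="DELETE")


if __name__ == "__main__":
    for req in (build_put("images/logo.png", b"..."),
                build_get("images/logo.png"),
                build_delete("images/logo.png")):
        print(req.get_method(), req.full_url)
```

Note that the "images/" part of the key is just a naming convention: the store treats the whole string as one flat identifier.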

What object storage lacks in versatility it makes up for in simplicity. Without the extensive overhead of file and block storage, applications like backup software can use simple requests to store and retrieve objects across greatly scaled-out storage systems.

To demonstrate the ObjectScale backup and restore use case, I chose YugabyteDB.

YugabyteDB is a high-performance distributed SQL database for powering global, internet-scale applications. It provides flexible query language, low-latency performance, continuous availability, and globally distributed scalability.

The traditional way to take backups in an RDBMS is to dump the data across all (or the desired set of) the tables in the database by performing a scan operation. However, in YugabyteDB, a distributed SQL database that is often considered for its massive scalability, the data set can become quite large, making a scan-based backup practically infeasible. The distributed backup and restore feature is aimed at making backups and restores very efficient even on large data sets.

Unlike traditional single-instance databases, YugabyteDB is designed for fault tolerance. By maintaining at least three copies of your data across multiple data regions or multiple clouds, it makes sure no losses occur if a single node or single data region becomes unavailable. Thus, with YugabyteDB, you would mainly use backups to:

  • Recover from a user or software error, such as accidental table removal.
  • Recover from a disaster scenario, like a full cluster failure or a simultaneous outage of multiple data regions. Even though such scenarios are extremely unlikely, it’s still a best practice to maintain a way to recover from them.
  • Maintain a remote copy of data, as required by data protection regulations.

YugabyteDB can be configured to use an ObjectScale S3-compatible object store as a backup target as follows:

  1. Navigate to Configs > Backup > Amazon S3.
  2. Click Create S3 Backup to access the configuration form.
  3. Use the Configuration Name field to provide a meaningful name for your backup configuration.
  4. Enable IAM Role to use the YugabyteDB Anywhere instance’s Identity and Access Management (IAM) role for the S3 backup.
  5. If IAM Role is disabled, enter values for the Access Key and Access Secret fields.
  6. Enter values for the S3 Bucket and S3 Bucket Host Base fields.
  7. Click Save.
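To make the relationship between those form fields easier to see, here is a minimal sketch that models them as a Python dataclass with a simple completeness check. It is not a YugabyteDB Anywhere API or an official schema: the class name, field names, the assumption that the bucket fields are mandatory, and all example values are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class S3BackupConfig:
    """Mirrors the form fields described above; not an official schema."""
    configuration_name: str
    s3_bucket: str
    s3_bucket_host_base: str
    use_iam_role: bool = False
    access_key: Optional[str] = None
    access_secret: Optional[str] = None

    def validate(self) -> List[str]:
        """Return a list of problems; an empty list means the form is complete."""
        problems = []
        if not self.configuration_name.strip():
            problems.append("Configuration Name is required")
        if not self.s3_bucket.strip():
            problems.append("S3 Bucket is required")
        if not self.s3_bucket_host_base.strip():
            problems.append("S3 Bucket Host Base is required")
        # Keys are only needed when the IAM role is not used.
        if not self.use_iam_role and not (self.access_key and self.access_secret):
            problems.append(
                "Access Key and Access Secret are required when IAM Role is disabled"
            )
        return problems


# Hypothetical values pointing at an ObjectScale S3 endpoint.
example = S3BackupConfig(
    configuration_name="objectscale-backup",
    s3_bucket="yb-backups",
    s3_bucket_host_base="objectscale.example.com",
    access_key="EXAMPLEKEY",
    access_secret="example-secret",
)
print(example.validate())  # prints []
```

For an on-premises object store such as ObjectScale, the access key and secret path is the likely choice, since the IAM role option refers to the role of the YugabyteDB Anywhere instance itself.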

To create a new backup, all we need to do is select the appropriate query language API (YSQL or YCQL), the ObjectScale S3-compatible object store that we’ve configured, and the namespace. We can also enable backup encryption, use tablespaces, and set the number of parallel threads and the retention policy.

Unlike eventually consistent databases, YugabyteDB uses strongly consistent replication that’s based on Google Spanner’s per-shard distributed consensus architecture. This architecture enables very efficient backups and restores. YugabyteDB’s distributed backups read data only from the leaders of various shards and do not involve the replicas at any time in the process.

Additionally, YugabyteDB’s Log Structured Merge (LSM) storage engine design enables a very lightweight checkpoint mechanism. A checkpoint is taken for each shard at the leader without requiring any expensive logical reads of the data. The compressed on-disk storage files from the leader of each shard are then copied to the backup store in a steady, rate-limited manner, with minimal impact on foreground operations.
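The idea of a steady, rate-limited copy can be sketched in a few lines of Python. This is only an illustration of the general technique, not YugabyteDB's actual implementation: the function name, the chunk size and the injectable sleep parameter are assumptions made for the example.

```python
import io
import time
from typing import BinaryIO, Callable


def rate_limited_copy(src: BinaryIO, dst: BinaryIO, bytes_per_second: int,
                      chunk_size: int = 64 * 1024,
                      sleep: Callable[[float], None] = time.sleep) -> int:
    """Copy src to dst in chunks, pausing after each chunk so the average
    throughput stays at or below bytes_per_second. Returns bytes copied."""
    if bytes_per_second <= 0 or chunk_size <= 0:
        raise ValueError("bytes_per_second and chunk_size must be positive")
    copied = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(chunk)
        copied += len(chunk)
        # Pause for as long as this chunk "costs" at the target rate.
        sleep(len(chunk) / bytes_per_second)
    return copied


if __name__ == "__main__":
    source = io.BytesIO(b"x" * 1000)   # stands in for an on-disk storage file
    target = io.BytesIO()              # stands in for the backup store
    print(rate_limited_copy(source, target, bytes_per_second=10_000, chunk_size=250))
```

Because the pause ignores the time spent on the reads and writes themselves, the real throughput ends up somewhat below the configured limit, which errs on the side of protecting foreground traffic.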

YugabyteDB currently supports table-level backups and restores. The backup store itself can be a cloud object storage endpoint.

By connecting to the ObjectScale object store using S3 Browser, we can navigate to the bucket. Here we can see that a new folder has been created for our application-consistent backup, containing both the data and the metadata objects.
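It is worth remembering that a "folder" in an object store is only a shared key prefix that the browser renders as a directory. The following sketch groups a flat list of object keys by their top-level prefix to show the effect; the key names are made up for illustration and do not reflect the actual layout that YugabyteDB writes.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def group_by_top_level_prefix(keys: Iterable[str],
                              delimiter: str = "/") -> Dict[str, List[str]]:
    """Group flat object keys by the part before the first delimiter,
    which is what a browser typically renders as a top-level folder."""
    folders: Dict[str, List[str]] = defaultdict(list)
    for key in keys:
        prefix, sep, rest = key.partition(delimiter)
        if sep:
            folders[prefix].append(rest)
        else:
            # Keys without a delimiter sit at the bucket root.
            folders[""].append(key)
    return dict(folders)


if __name__ == "__main__":
    # Hypothetical keys, as a bucket listing might return them.
    listing = [
        "backup-0001/metadata.json",
        "backup-0001/tablet-a/data.sst",
        "backup-0001/tablet-b/data.sst",
    ]
    for folder, members in group_by_top_level_prefix(listing).items():
        print(folder, members)
```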

Restoring YugabyteDB from ObjectScale is just as simple: we navigate to the backup, select the restore backup option in the drop-down menu, and click OK. Within a few seconds, the data is restored from the consistent backup that is stored in our ObjectScale bucket.

To take backups on a regular basis, we can easily configure a backup scheduler and select the backup frequency as well as the other parameters that have already been described.
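To see how backup frequency and retention interact, consider this small sketch. It is a plain date calculation under assumed values (daily backups, two-day retention), not the scheduler's real logic, and both function names are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Iterable, List


def scheduled_backup_times(start: datetime, frequency: timedelta,
                           count: int) -> List[datetime]:
    """Return the first `count` backup times, spaced `frequency` apart."""
    return [start + i * frequency for i in range(count)]


def expired_backups(backup_times: Iterable[datetime], retention: timedelta,
                    now: datetime) -> List[datetime]:
    """Return the backups that are older than the retention period."""
    return [t for t in backup_times if now - t > retention]


if __name__ == "__main__":
    times = scheduled_backup_times(datetime(2024, 1, 1), timedelta(days=1), 5)
    now = datetime(2024, 1, 5, 12, 0)
    for t in expired_backups(times, timedelta(days=2), now):
        print("expired:", t.date())
```

With these assumed values, the backups from January 1 to 3 fall outside the two-day window at noon on January 5, while the two most recent backups are kept.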

Grafana Dashboards

ObjectScale contains predefined dashboards that visualize the collected metrics. Some of the metrics are shown in the UI, on the main ObjectScale Dashboard and the object store Dashboard pages. Administrators can inspect the reported data in more detail on the Grafana dashboards. Administrators can identify developing storage and memory problems by monitoring the dashboards. The dashboards also help identify inefficiencies and provide a way to diagnose problems.

ObjectScale for Kubernetes allowed Dell to deliver a simplified product in which Kubernetes handles the OS and hardware-level layers, leaving ObjectScale to handle storage and storage management. It gives you segmented control of the storage, compute, and network services and allows for dynamic provisioning of resources.

A post by Tomer Eitan
