Backup and Restore YugabyteDB to Dell ObjectScale S3 Object Store
In the first post of the series, we covered Dell ObjectScale and how to install it; you can read about it here.
Data volumes continue to explode across the enterprise, driven by the proliferation of mobile devices, applications, and the growth of the Internet of Things. As the amount of data businesses generate continues to rise, so does the cost of storing and managing all this information. Traditional storage solutions have become too costly and complex to deploy and manage at the scale organizations require. Public cloud solutions – once heralded as the future of data storage – can introduce unforeseen costs and data latency issues, often making them unpredictable options for long-term storage.
To harness the power of growing volumes of data, ObjectScale provides a next-generation storage solution that can help to reduce costs, simplify management, accelerate access and extract the intelligence from data that organizations need to compete.
ObjectScale is the next evolution of object storage from Dell Technologies. It is a software-defined, scale-out, object storage platform. With ObjectScale, any organization can deliver cloud-scale storage services with the reliability and control of a private cloud infrastructure.
ObjectScale features a Kubernetes-native, containerized architecture built on the principles of microservices to promote efficiency, resiliency and flexibility. Each service is completely abstracted and independently scalable with high availability and no single point of failure. ObjectScale is built on the proven Dell ECS codebase and has been re-platformed to use the native orchestration capabilities of Kubernetes: scheduling, auto-scaling, load-balancing, self-healing, and more.
ObjectScale is optimized for a broad range of use cases, including:
Using object storage for backup has advantages over file and block storage. Objects are for storing and retrieving unstructured blobs of data as a whole, rather than as individual blocks. They may consist of image files, HTML pages, binaries, video, executables and user-generated content — mostly unstructured data unlikely ever to be changed.
In an object storage system, objects can reside on any number of servers, whether on premises or in the cloud. Instead of navigating a hierarchical directory structure, applications address each object by an ID (its key) using a few simple HTTP API calls such as PUT, POST, GET and DELETE.
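To make the flat addressing model concrete, here is a minimal in-memory sketch in Python. It is not ObjectScale or S3 code; the `FlatObjectStore` class and the sample key are hypothetical and only illustrate that objects are written, read, and deleted whole, by key, with no directory tree behind them.

```python
class FlatObjectStore:
    """Toy stand-in for an object store: one flat key -> bytes mapping."""

    def __init__(self):
        self._objects = {}

    def put(self, key: str, data: bytes) -> None:
        # PUT replaces the whole object; there is no partial, in-place update.
        self._objects[key] = bytes(data)

    def get(self, key: str) -> bytes:
        # GET returns the whole object, or raises KeyError if the key is unknown.
        return self._objects[key]

    def delete(self, key: str) -> None:
        # DELETE is idempotent here: removing a missing key is not an error.
        self._objects.pop(key, None)

    def keys(self):
        return sorted(self._objects)


store = FlatObjectStore()
# The slashes are just characters in the key, not real directories.
store.put("backups/2024/data.bin", b"payload")
roundtrip = store.get("backups/2024/data.bin")
store.delete("backups/2024/data.bin")
remaining = store.keys()
```

A real client would send the same three operations as HTTP requests to the object store endpoint; the key-based model stays the same.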
What object storage lacks in versatility it makes up for in simplicity. Without the extensive overhead of file and block storage, applications like backup software can use simple requests to store and retrieve objects across greatly scaled-out storage systems.
To demonstrate the ObjectScale backup and restore use case, I chose YugabyteDB.
YugabyteDB is a high-performance distributed SQL database for powering global, internet-scale applications. It provides flexible query language, low-latency performance, continuous availability, and globally distributed scalability.
The traditional way to take backups in an RDBMS is to dump the data across all (or the desired set of) the tables in the database by performing a scan operation. However, in YugabyteDB, a distributed SQL database that is often chosen for its massive scalability, the data set can become quite large, making a scan-based backup practically infeasible. The distributed backup and restore feature is aimed at making backups and restores efficient even on large data sets.
Unlike traditional single-instance databases, YugabyteDB is designed for fault tolerance. By maintaining at least three copies of your data across multiple data regions or multiple clouds, it makes sure no losses occur if a single node or single data region becomes unavailable. Thus, with YugabyteDB, you would mainly use backups to:
YugabyteDB can be configured to have an ObjectScale S3 compatible Object Store as a backup target as follows:
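As a rough sketch of what such a target needs, the Python snippet below collects the usual S3-style settings into a dictionary and validates them. The field names follow common S3 conventions and are assumptions, not a confirmed YugabyteDB schema; the function name, endpoint, bucket, and credentials are placeholders.

```python
from urllib.parse import urlparse


def build_s3_backup_target(name, endpoint, bucket_uri, access_key, secret_key):
    """Return a dict describing an S3-compatible backup target (hypothetical schema)."""
    parsed = urlparse(bucket_uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError("bucket_uri must look like s3://bucket[/prefix]")
    if not endpoint:
        # A non-AWS store such as ObjectScale needs an explicit endpoint.
        raise ValueError("endpoint is required for an on-premises object store")
    if not access_key or not secret_key:
        raise ValueError("access key and secret key are required")
    return {
        "name": name,
        "endpoint": endpoint,
        "bucket": parsed.netloc,
        "prefix": parsed.path.lstrip("/"),
        "access_key": access_key,
        "secret_key": secret_key,
    }


# Placeholder values only; substitute your own endpoint and credentials.
target = build_s3_backup_target(
    name="objectscale-backups",
    endpoint="objectscale.example.internal",
    bucket_uri="s3://yb-backups/prod",
    access_key="EXAMPLE_ACCESS_KEY",
    secret_key="EXAMPLE_SECRET_KEY",
)
```

The point of the explicit `endpoint` field is that the backup client must be directed at the ObjectScale S3 endpoint rather than a public cloud default.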
To create a new backup, we select the appropriate query API (YSQL or YCQL), the ObjectScale S3-compatible object store that we configured, and the namespace. We can also enable backup encryption, use tablespaces, and set the number of parallel threads and the retention policy.
Unlike eventually consistent databases, YugabyteDB uses strongly consistent replication that’s based on Google Spanner’s per-shard distributed consensus architecture. This architecture enables very efficient backups and restores. YugabyteDB’s distributed backups read data only from the leaders of various shards and do not involve the replicas at any time in the process.
Additionally, YugabyteDB’s Log Structured Merge (LSM) storage engine design enables a very lightweight checkpoint mechanism. A checkpoint is taken for each shard at the leader without requiring any expensive logical reads of the data. The compressed on-disk storage files from the leader of each shard are copied in a steady, rate-limited manner, to the backup store with minimal impact on the foreground operations.
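The "steady, rate-limited" copy can be illustrated with a small Python sketch. This is not YugabyteDB's implementation; `throttled_copy` and its parameters are hypothetical, and the demo uses in-memory streams with a fake clock so that it runs instantly.

```python
import io
import time


def throttled_copy(src, dst, max_bytes_per_sec, chunk_size=64 * 1024,
                   sleep=time.sleep, clock=time.monotonic):
    """Copy src to dst in chunks, pausing so the average rate stays under the cap."""
    start = clock()
    copied = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(chunk)
        copied += len(chunk)
        # Earliest time at which `copied` bytes are allowed to have been sent.
        earliest = start + copied / max_bytes_per_sec
        delay = earliest - clock()
        if delay > 0:
            sleep(delay)
    return copied


# Demo with in-memory streams and a fake clock so nothing actually sleeps.
fake_now = [0.0]
pauses = []


def fake_sleep(seconds):
    pauses.append(seconds)
    fake_now[0] += seconds


source = io.BytesIO(b"x" * 4096)
sink = io.BytesIO()
total = throttled_copy(source, sink, max_bytes_per_sec=1024, chunk_size=1024,
                       sleep=fake_sleep, clock=lambda: fake_now[0])
```

Capping the copy rate trades a longer backup window for less contention with foreground reads and writes on the same disks and network links.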
YugabyteDB currently supports table-level backups and restores. The backup store itself can be a cloud object storage endpoint.
By connecting to the ObjectScale object store with S3 Browser, we can navigate to the bucket. Here we can see that a new folder has been created for our application-consistent backup, containing the data as well as the metadata objects.
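The "folder" shown by an S3 client is really a shared key prefix. The Python sketch below groups a flat key listing by its first path segment, which is roughly what such a browser view does. The key names are invented for illustration; real backup object names will differ.

```python
from collections import defaultdict


def group_by_folder(keys, delimiter="/"):
    """Group flat object keys by their first path segment."""
    folders = defaultdict(list)
    for key in keys:
        head, sep, rest = key.partition(delimiter)
        # Keys without a delimiter sit at the bucket root (folder "").
        folder = head if sep else ""
        folders[folder].append(rest if sep else head)
    return dict(folders)


# Hypothetical listing; real backup key names will differ.
listing = [
    "backup-0001/metadata.json",
    "backup-0001/tablet-a/data.sst",
    "backup-0001/tablet-b/data.sst",
    "README.txt",
]
folders = group_by_folder(listing)
```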
Restoring YugabyteDB from ObjectScale is as simple as a click of a button. We navigate to the backup, select the restore backup option in the drop-down menu, and click OK. Within a few seconds, the data is restored from the consistent backup that is stored in our ObjectScale bucket.
To take backups on a regular basis, we can configure a backup schedule and select the backup frequency, along with the other parameters already described.
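How frequency and retention interact can be shown with a little date arithmetic. The Python sketch below is illustrative only: the function names and the policy values (every 6 hours, keep for 1 day) are assumptions, not YugabyteDB defaults.

```python
from datetime import datetime, timedelta


def next_backup_times(first_run, frequency, count):
    """Return `count` scheduled start times, spaced by `frequency`."""
    return [first_run + i * frequency for i in range(count)]


def expired_backups(backup_times, now, retention):
    """Return the backups whose age exceeds the retention period."""
    return [t for t in backup_times if now - t > retention]


# Hypothetical policy: back up every 6 hours, keep backups for 1 day.
first = datetime(2024, 1, 1, 0, 0)
schedule = next_backup_times(first, timedelta(hours=6), count=6)
now = datetime(2024, 1, 2, 7, 0)
to_delete = expired_backups(schedule, now, retention=timedelta(days=1))
```

With these values, only the two oldest backups (taken at 00:00 and 06:00 on the first day) are older than the retention period at the chosen `now`, so the bucket holds four backups at that point.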
ObjectScale contains predefined dashboards that visualize the collected metrics. Some of the metrics are shown in the UI, on the main ObjectScale Dashboard and the object store Dashboard pages. Administrators can inspect the reported data in more detail on the Grafana dashboards. Administrators can identify developing storage and memory problems by monitoring the dashboards. The dashboards also help identify inefficiencies and provide a way to diagnose problems.
ObjectScale for Kubernetes allowed Dell to deliver a simplified product: Kubernetes handles the OS and hardware-level layers, leaving ObjectScale to handle storage and storage management. This gives you segmented control of the storage, compute, and network services and allows for dynamic provisioning of resources.
A post by Tomer Eitan