Isilon – The Challenge of Files at Scale
I get a lot of requests to post about Isilon, so I hooked up with Ron Steinke, a Technical Staff member of Isilon Software Engineering, to write some guest posts. I would really appreciate your feedback, and please let us know whether you would like us to write more about Isilon.
Scalability is a harder problem for stateful, feature-rich solutions. Distributed filesystems are a prime example of this, as coordinating namespace and metadata updates between multiple head nodes presents a challenge not found in block or object storage.
The key is to remember that this challenge must be viewed in the context of the simplification it brings to application development. Application developers choose files to simplify the development process, with less concern for what this means when the application is ultimately deployed at scale. For a project limited by the application development lifecycle, or dependent on third-party applications, the tradeoff of a more complex storage solution is often the right one.
Part of the challenge of file scalability is fully replicating the typical file environment. Any scalability solution that imposes restrictions not present in the development environment is likely to run up against assumptions built into applications. This leads to major headaches, and the burden of solving them usually lands on the storage administrator. A few common workarounds for a scalable flat file namespace illustrate these kinds of limitations.
One approach is to have a single node in the storage cluster manage the namespace, with scalability only for file data storage. While this approach may provide enough scalability for other kinds of storage, it's fairly easy to saturate the namespace node with a file workload.
A good example of this approach is the default Apache HDFS implementation. While the data is distributed across many nodes, all namespace work (file creation, deletion, rename) is done by a single name node. This is great if you want to read through the contents of a large subset of your data, perform analytics, and aggregate the statistics. It's less great if your workload creates a lot of files and moves them around.
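To make the bottleneck concrete, here's a minimal sketch using the standard Hadoop FileSystem API. The cluster address and paths are hypothetical; the point is that every create and rename below is an RPC to the one NameNode, no matter how many DataNodes hold the file contents.

```java
// Sketch: every namespace operation below goes through the single HDFS
// NameNode, regardless of how many DataNodes store the actual bytes.
// The host "namenode" and the /ingest paths are hypothetical.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class NameNodeHotspot {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://namenode:8020");
        FileSystem fs = FileSystem.get(conf);

        // A metadata-heavy workload: each create and rename serializes
        // through the NameNode, which saturates long before the
        // DataNodes run out of I/O capacity.
        for (int i = 0; i < 1_000_000; i++) {
            Path tmp = new Path("/ingest/part-" + i + ".tmp");
            fs.create(tmp).close();                          // NameNode RPC
            fs.rename(tmp, new Path("/ingest/part-" + i));   // NameNode RPC
        }
        fs.close();
    }
}
```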
Another approach is namespace aggregation, where different parts of the storage array service different parts of the filesystem. This is effectively taking UNIX mount points to their logical conclusion. While this is mostly transparent to applications, it requires administrators to predict how much storage space each individual mount point will require. With dozens or hundreds of individual mount points, this quickly becomes a massive administration headache.
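As an illustration, namespace aggregation boils down to a longest-prefix lookup from directory trees to backend volumes. The volume names and mount layout below are invented, but the sketch shows where the pain comes from: the mapping, and therefore each volume's capacity, has to be decided by the administrator up front.

```java
// Toy namespace aggregator: routes each path to the backend volume
// whose mount prefix matches it. Volume names are hypothetical.
import java.util.Map;
import java.util.TreeMap;

public class NamespaceRouter {
    // Sorted so we can search backwards for the longest matching prefix.
    private final TreeMap<String, String> mounts = new TreeMap<>();

    public void mount(String prefix, String volume) {
        mounts.put(prefix, volume);
    }

    // Resolve a path to the volume that services it.
    public String volumeFor(String path) {
        Map.Entry<String, String> e = mounts.floorEntry(path);
        while (e != null && !path.startsWith(e.getKey())) {
            e = mounts.lowerEntry(e.getKey());
        }
        if (e == null) throw new IllegalArgumentException("no mount for " + path);
        return e.getValue();
    }

    public static void main(String[] args) {
        NamespaceRouter r = new NamespaceRouter();
        // Each volume's size was fixed when it was purchased -- the
        // administrator had to predict these needs in advance.
        r.mount("/projects/genomics/", "volume-01");
        r.mount("/projects/imaging/", "volume-02");
        System.out.println(r.volumeFor("/projects/genomics/run42/reads.fq"));
    }
}
```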
Worse is what happens when you want to reorganize your storage. The storage allocations that were originally made typically reflect the team structure of the organization at the time the storage was purchased. Organizations being what they are, that human structure is going to change. Moving data within a single mount point involves renaming a few directories. Moves across mount points, or the creation of new mount points, involve data rewrites that take longer and longer as your data grows.
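A quick way to see the asymmetry: on an ordinary POSIX system, a rename within one filesystem is a metadata update, while a move across filesystems degenerates into a full copy. The paths below are hypothetical, but the same logic plays out at array scale.

```java
// Sketch of why cross-mount-point reorganization hurts: a rename
// inside one filesystem updates a directory entry, while a move
// between filesystems must rewrite all the data.
import java.nio.file.*;

public class MoveCost {
    public static void main(String[] args) throws Exception {
        Path src     = Paths.get("/mnt/team-a/dataset.tar");
        Path sameFs  = Paths.get("/mnt/team-a/archive/dataset.tar");
        Path otherFs = Paths.get("/mnt/team-b/dataset.tar");

        // Within a mount point: a constant-time directory entry update.
        Files.move(src, sameFs, StandardCopyOption.ATOMIC_MOVE);

        try {
            // Across mount points, an atomic rename is impossible...
            Files.move(sameFs, otherFs, StandardCopyOption.ATOMIC_MOVE);
        } catch (AtomicMoveNotSupportedException e) {
            // ...so the move falls back to copy-then-delete, which
            // scales with the size of the data being reorganized.
            Files.move(sameFs, otherFs);
        }
    }
}
```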
Clearly these approaches will work for certain kinds of workflows. Sadly, most storage administrators don’t have control of their users’ workflows, or even good documentation of what those workflows will be. The combination of arbitrary workflows and future scaling requirements ultimately pushes many organizations away from limited solutions.
The alternative is a scale-out filesystem, which looks like a single machine from both the users' and the administrators' perspectives. A scale-out system isolates the logical layout of the filesystem namespace from the physical layout of where the data is stored. All nodes in a scale-out system are peers, avoiding special roles that could make a particular node a choke point. This parallel architecture also allows each scale-out cluster to grow to meet its users' needs, allowing storage sizes far larger than any other filesystem platform.
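This is not how OneFS is implemented internally, but a toy consistent-hashing ring captures the peer idea: ownership of any part of the namespace is computed the same way by every node, so no node plays a special role, and adding a node shifts only a fraction of the keys rather than forcing an administrator to re-partition the namespace.

```java
// Illustrative sketch only (not OneFS internals): peers agree on who
// owns a given piece of the namespace by hashing it onto a ring of
// nodes, with no central coordinator.
import java.util.SortedMap;
import java.util.TreeMap;

public class PeerRing {
    private final TreeMap<Integer, String> ring = new TreeMap<>();

    public void addNode(String node) {
        // A few virtual points per node smooth out the distribution.
        for (int v = 0; v < 3; v++) {
            ring.put((node + "#" + v).hashCode(), node);
        }
    }

    // Every peer runs this same computation and reaches the same answer.
    public String ownerOf(String path) {
        SortedMap<Integer, String> tail = ring.tailMap(path.hashCode());
        return tail.isEmpty() ? ring.get(ring.firstKey())
                              : tail.get(tail.firstKey());
    }

    public static void main(String[] args) {
        PeerRing ring = new PeerRing();
        ring.addNode("node-1");
        ring.addNode("node-2");
        ring.addNode("node-3");
        System.out.println(ring.ownerOf("/projects/genomics/run42"));
    }
}
```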
There are four main requirements to provide the transparency of scale-out: