I need to scale out our gluster disperse volume to increase storage capacity; however, our IT department does not have the budget and infrastructure to provide more than one server.
If you are a gluster user who has landed in this situation, read on. This post describes how to scale your disperse (erasure coded) volume by adding just one node/server.
When we set up a disperse volume of a given configuration, say 4+2, we have to make sure that at most 2 bricks (the redundancy count) are placed on any one server, so that redundancy is preserved at the node level. The best setup would be to host one brick per server. However, for a reasonably fault-tolerant disperse volume we need at least 3 servers. For this blog, let's assume we have a 3-node setup.
Let's say we created a disperse volume of 3 x (4+2), where each server hosts 6 bricks and we have 3 disperse sub volumes. Each node holds 2 bricks of each sub volume, so no sub volume depends on more than 2 bricks from any single node.
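As an illustration, such a volume could be created with a brick ordering along the following lines. The hostnames and brick paths are made up for this example; gluster groups every 6 consecutive bricks in the list into one disperse sub volume, and it may warn and ask for "force" because more than one brick of a sub volume sits on the same server.

    # 3 servers, 6 bricks each; every group of 6 bricks below is one 4+2 sub volume
    gluster volume create ecvol disperse-data 4 redundancy 2 \
        server1:/bricks/b1 server1:/bricks/b2 server2:/bricks/b1 server2:/bricks/b2 server3:/bricks/b1 server3:/bricks/b2 \
        server1:/bricks/b3 server1:/bricks/b4 server2:/bricks/b3 server2:/bricks/b4 server3:/bricks/b3 server3:/bricks/b4 \
        server1:/bricks/b5 server1:/bricks/b6 server2:/bricks/b5 server2:/bricks/b6 server3:/bricks/b5 server3:/bricks/b6
    gluster volume start ecvol

Listing the bricks sub volume by sub volume like this is what keeps any single node down to 2 bricks per 4+2 set.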

After using this volume for some time, we reach its maximum capacity and now want to increase the storage capacity of the volume. We could simply add another 4+2 = 6 bricks to the volume using the "gluster add-brick" command, but all the disk slots on the old servers are already occupied, so we cannot attach any additional disks to the existing nodes. We asked our IT team for 3 new servers with 2 new disks attached to each, but all they provided is one server with 6 disks attached to it.

If we add all these 6 bricks to the existing EC volume, the new sub volume will have all of its bricks on one node, which could be very dangerous: if we lose connection to that node, or the node crashes, we end up losing the whole sub volume and the data stored on it.
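Concretely, the tempting but risky command would look something like this (the hostname server4 and the brick paths are placeholders; gluster will usually warn about this layout and only accept it with "force"):

    # All 6 bricks of the new 4+2 set on the single new server - avoid this
    gluster volume add-brick ecvol \
        server4:/bricks/d1 server4:/bricks/d2 server4:/bricks/d3 \
        server4:/bricks/d4 server4:/bricks/d5 server4:/bricks/d6

With this layout, losing server4 takes out more bricks of the new sub volume than the 2 the erasure code can tolerate.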
So we have to move some of the disks placed on the new node to the existing old nodes, so that the bricks of the new sub volume are properly distributed across nodes.

There are two approaches to perform this migration of bricks –
1 – Software-driven movement of data using replace-brick
2 – Physical movement of disks supported by an external tool
1 – Software-driven movement of data using replace-brick
In this approach, a module (yet to be implemented) in gluster will identify the disks that need to be replaced and trigger the data movement. This data migration might happen using gluster's heal feature or some other data copying method such as xfs_copy or rsync. Although this approach does not require any human interaction after adding the new node, it is considerably slower when there is a huge amount of data on the volume, since all of the data has to move over the wire, which is a time-consuming process.
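As a rough sketch of what such a module would drive, here is the manual equivalent of the software-driven shuffle, with made-up host and brick names: relocate a few existing bricks onto the new node's disks with replace-brick, let self-heal rebuild their data there over the network, and then reuse the freed disks on the old nodes for the new 4+2 set.

    # Move existing bricks from an old server onto the new node's disks;
    # the self-heal daemon rebuilds each brick's data at its new location.
    gluster volume replace-brick ecvol server1:/bricks/b1 server4:/bricks/d1 commit force
    gluster volume replace-brick ecvol server1:/bricks/b3 server4:/bricks/d2 commit force
    # (repeat for the remaining bricks that need to move, keeping at most
    #  2 bricks of any sub volume on one server)

    # Once healing finishes, reformat the freed disks on the old servers and
    # use them, together with the still-free disks on server4, for the new
    # 4+2 sub volume.
    gluster volume add-brick ecvol \
        server1:/bricks/new1 server1:/bricks/new2 \
        server2:/bricks/new1 server3:/bricks/new1 \
        server4:/bricks/d5 server4:/bricks/d6

The end result is the same rule we started with: no node, old or new, holds more than 2 bricks of any 4+2 set, but every byte of the relocated bricks has to travel over the network.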
2 – Physical movement of disks supported by an external tool
In this approach, we physically swap disks between the old servers and the new server. The swapping is done manually and is supported by a tool that identifies the disks to be swapped and informs the user. The detailed steps and functionality of this approach will be explained in Part 2 of this blog…