Somebody today asked if GlusterFS could be made as fast as a local filesystem. My answer just came out without a ton of thought behind it, but I found it rather profound.

Comparing (any) clustered filesystem to a local filesystem is like

comparing apples and orchards

You can reach up and quickly grab an apple, eat it, and its purpose is served. But then you look at the other apples in the orchard and they’re not nearly as easy to use. If you wanted an apple from that tree over there, it might require considerable walking (increased latency). The aggregate performance of picking all the apples in the orchard will most certainly not be the same as reaching up and picking the apple on your local branch.

However…

If your goal is not to feed just yourself but a thousand people, what matters is how quickly you can complete the whole job. Feeding them all from your local tree would take a very long time, since every apple has to be picked and handed out from that one tree.

In the orchard, though, you could have them dispersed among a multitude of trees. Each person could reach up and pick an apple. The scaled performance would far exceed the performance of just one local tree.

Consider your total workload

Performance of one thread reading and writing to one file is not going to be as fast on a clustered filesystem as it is on a local one. But what about thousands of simultaneous file accesses? Millions? Scale must be thought of all at once. Don’t get stuck micro-engineering.
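
To put rough numbers on the orchard, here is a back-of-the-envelope sketch in Python. Everything in it is an assumption made up for illustration: the `total_time` helper, the latencies (1 unit per local access, 5 units per clustered access), and the 50 servers. It is a toy model that spreads requests evenly and ignores contention and network limits. It is not a GlusterFS benchmark.

```python
import math

def total_time(requests, servers, latency):
    """Toy model: requests are spread evenly across servers, and each
    server handles its share one at a time at a fixed latency per request.
    All parameters are hypothetical, in arbitrary time units."""
    per_server = math.ceil(requests / servers)
    return per_server * latency

# One apple: the local tree wins on latency.
one_local = total_time(requests=1, servers=1, latency=1.0)
one_orchard = total_time(requests=1, servers=50, latency=5.0)

# A thousand apples: the orchard wins on total time to finish the job.
many_local = total_time(requests=1000, servers=1, latency=1.0)
many_orchard = total_time(requests=1000, servers=50, latency=5.0)

print("single access:  local", one_local, "vs orchard", one_orchard)
print("1000 accesses:  local", many_local, "vs orchard", many_orchard)
```

With these made-up numbers, a single access is five times slower in the orchard, while a thousand accesses finish ten times sooner. Real numbers will differ, but the shape of the trade-off is the point.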