Some days I wish I had spare hardware of decent size that I could just do testing on.
Four machines, at least two of them identical in every respect.
Of these, two would be set up to use Gluster, one would be the load generator, and one would be the load tester.
Some numbers that I want, just as a comparison:
glusterfs fuse mount, replica 1, stripe 1, 2 nodes, no mdadm raid.
Raw disk performance with 4, 8, 16, 32, 48 load threads, single client.
2 tests: random reads/writes (majority reads, but lots of them, spread over a big space) and a majority big-block write.
IO/s (IOPS) and throughput are the most interesting numbers.
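To make the two numbers concrete, here is a rough sketch of what a threaded random read/write test could look like. It is not a replacement for a real benchmark tool: the function name `run_load`, its parameters, and the 70% read mix in the usage note are all made up for illustration, and it goes through the page cache, so on a local disk it will mostly measure RAM. It needs a Unix-like system for `os.pread` and `os.pwrite`.

```python
import os
import random
import threading
import time

# Thread counts from the list above.
THREAD_COUNTS = (4, 8, 16, 32, 48)


def run_load(path, file_size, block_size, threads, ops_per_thread,
             read_fraction, seed=0):
    """Run a mixed random read/write load; return (iops, bytes_per_second).

    All parameters are illustrative assumptions, not values from any tool.
    """
    # Pre-size the file so random reads have something to hit.
    with open(path, "wb") as f:
        f.truncate(file_size)

    blocks = file_size // block_size
    payload = b"\0" * block_size

    def worker(worker_id):
        # One RNG and one file descriptor per thread.
        rng = random.Random(seed + worker_id)
        fd = os.open(path, os.O_RDWR)
        try:
            for _ in range(ops_per_thread):
                offset = rng.randrange(blocks) * block_size
                if rng.random() < read_fraction:
                    os.pread(fd, block_size, offset)
                else:
                    os.pwrite(fd, payload, offset)
        finally:
            os.close(fd)

    pool = [threading.Thread(target=worker, args=(i,)) for i in range(threads)]
    start = time.perf_counter()
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    elapsed = time.perf_counter() - start

    total_ops = threads * ops_per_thread
    return total_ops / elapsed, total_ops * block_size / elapsed
```

Looping `for n in THREAD_COUNTS: run_load(path, size, 4096, n, 10000, 0.7)` against a file on the mount under test would give one IOPS/throughput pair per thread count; a larger block size and a low `read_fraction` would approximate the big-block write case.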
nfs mount, same test, for comparison.
The same test but with each glusterfs brick being on raid 1 (mdadm).
Add a fs-cache on SSD for the client, and compare the same numbers.
Compare raid 1 and raid 10 (2 disks vs. 4 disks) on the storage nodes.
Next up, the "interesting" tests.
Create 4 virtual machines: three on the "load generator" machine, one on the "measure" machine. On the load generator, have one VM cycle through "sleep: read-heavy: write-heavy: sleep" at random intervals, and have the two others sit mostly idle, but with scattered random reads and small writes.
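The cycling VM could be driven by something as small as the sketch below. The names `PHASES` and `phase_schedule`, and the idea of drawing each phase length uniformly between a minimum and a maximum, are my own assumptions; the actual read-heavy and write-heavy work is left out.

```python
import itertools
import random

# Phase order taken from the description above.
PHASES = ("sleep", "read-heavy", "write-heavy", "sleep")


def phase_schedule(rng, min_seconds, max_seconds):
    """Yield (phase, duration_seconds) forever, cycling through PHASES.

    Durations are drawn uniformly from [min_seconds, max_seconds];
    that distribution is an assumption, not a requirement.
    """
    for phase in itertools.cycle(PHASES):
        yield phase, rng.uniform(min_seconds, max_seconds)
```

A driver would iterate over `phase_schedule(random.Random(), 30, 300)` and either sleep or run the matching load for each duration.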
With this as "background load", wait until the system stabilizes, and
once again measure the threaded abilities on the other machine. Watch
With this background load, which is faster, raid 1 or raid 10? How do threading options in the cluster affect this? Read-ahead vs. cache?
How can you tune it to avoid high latency bursts?
How can you _warn_, and know that the latency is about to hit the ceiling? How much load can run passively in the background before latency hits 30 sec or more for an operation?
(In a shared host scenario like this, 30 sec disk latency spikes would be considered unbearable.)
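One naive way to get a warning is to keep a sliding window of per-operation latencies and watch a high percentile. The sketch below does only that: it reacts to latency that has already risen, it does not predict anything. The class name `LatencyWatch`, the window size, and the "warn at half the ceiling" rule are assumptions picked for illustration; only the 30 sec ceiling comes from the text above.

```python
import math
from collections import deque


class LatencyWatch:
    """Track recent latencies and classify the 99th percentile."""

    def __init__(self, ceiling_seconds=30.0, warn_fraction=0.5, window=1000):
        self.ceiling = ceiling_seconds
        # Assumed rule: warn once p99 reaches a fraction of the ceiling.
        self.warn_at = ceiling_seconds * warn_fraction
        # Only the most recent `window` samples are kept.
        self.samples = deque(maxlen=window)

    def record(self, latency_seconds):
        self.samples.append(latency_seconds)

    def p99(self):
        """Nearest-rank 99th percentile of the current window."""
        if not self.samples:
            return 0.0
        ordered = sorted(self.samples)
        index = max(0, math.ceil(0.99 * len(ordered)) - 1)
        return ordered[index]

    def status(self):
        p = self.p99()
        if p >= self.ceiling:
            return "broken"
        if p >= self.warn_at:
            return "warn"
        return "ok"
```

Feeding this from the measuring VM while stepping up the background load would at least show at which load level the status flips from "ok" to "warn".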
Basically, I want to know when the thing breaks, and how to avoid it breaking too early. If you need to sacrifice throughput to get this, I'm fine with that. But how does it break, and when? And how do you avoid it?