In recent tests I benchmarked different VM disk backends using fio to see real-world performance on our infrastructure. All setups use the same VM and fio profiles for fair comparison.
I used two standard fio profiles to capture sequential throughput and random 4k performance, reflecting both large transfers and typical virtualization IO patterns. Each run was ~60 s with direct IO and depth tuned for each profile.
Starting with the worst performer:
NFS: Synology NAS SSD Pool over 1 Gbit/s
For this test I used a DS1815+ with an SSD pool of 2x KingSpec 2 TB SATA SSDs in RAID 0.
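For reproducibility: the export was attached to Proxmox as a regular NFS storage. A minimal way to do that could look like the line below; the storage ID, server IP and export path are placeholders, not the values from my setup.

```
# Attach the Synology NFS export as a Proxmox storage for VM disks
# (storage ID, IP and export path are example values)
pvesm add nfs synology-nfs --server 192.168.1.50 --export /volume1/proxmox --content images
```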
Key results:
- Sequential (1 MiB, iodepth 16, rw):
- ~38.9 MiB/s read, ~40.8 MiB/s write (≈ 0.038–0.040 GiB/s)
- Latency (clat avg): ~71 ms read, ~324 ms write
- read p95 ~169 ms, p99 ~234 ms
- write p95 ~651 ms, p99 ~802 ms
- 4k Random (70/30 randrw, iodepth 64):
- ~21.5 MiB/s read (~5.5k IOPS), ~9.2 MiB/s write (~2.36k IOPS)
- Latency (clat avg): ~8.7 ms read, ~6.7 ms write
- read p95 ~14.9 ms, p99 ~23.5 ms
- write p95 ~10.7 ms, p99 ~15.1 ms
Over 1 Gbit/s NFS the NAS is clearly network-limited in the sequential tests and shows high latencies under random load. This matches storage benchmarking expectations: a 1 Gbit/s link caps raw throughput at roughly 119 MiB/s, so the SSDs behind it cannot compensate for the slow network, and NFS and TCP overhead push the real-world numbers lower still.
Takeaway: Good for light workloads or archival VM disks, not ideal for performance-sensitive virtual workloads.
Ceph Cluster (Proxmox Member)
For this test I built a 3-node Ceph cluster with dedicated 2.5 Gbit/s links for the Ceph network. In this test the VM runs on one of the nodes that is also part of the Ceph cluster. Two of the nodes contain one WD Blue SA510 SATA SSD 2 TB each, and one node contains one Acer Predator GM7 2 TB M.2 NVMe SSD. All of these drives are fully dedicated to Ceph as OSDs. Ceph runs with osd_pool_default_size and osd_pool_default_min_size set to 2, so every object is stored as two copies on different OSDs.
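For reference, the replication settings mentioned above can be set and verified roughly like this; the pool name vm-pool is just an example, not the pool used in my cluster.

```
# Cluster-wide defaults that apply to pools created afterwards
ceph config set global osd_pool_default_size 2
ceph config set global osd_pool_default_min_size 2

# Verify what an existing pool actually uses (pool name is an example)
ceph osd pool get vm-pool size
ceph osd pool get vm-pool min_size
```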
- Sequential (1 MiB, iodepth 16, rw):
- ~97.2 MiB/s read, ~98.0 MiB/s write (≈ 0.095 GiB/s each way)
- Latency (clat avg): ~26 ms read, ~137 ms write
- read p95 ~69 ms, p99 ~103 ms
- write p95 ~234 ms, p99 ~279 ms
- 4k Random (70/30 randrw, iodepth 64):
- ~26.0 MiB/s read (~6.66k IOPS), ~11.2 MiB/s write (~2.86k IOPS)
- Latency (avg): ~2.27 ms read, ~17.1 ms write
Insights: Ceph inherently distributes IO and benefits from parallelism across OSDs, but cluster networking and replication overhead still limit peak performance on modest networks. Community guidance likewise emphasizes that network bandwidth quickly becomes the bottleneck in clustered storage; even 10 Gbit/s links are regularly saturated in faster configurations.
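If you want to rule out the storage layer when chasing such bottlenecks, a raw throughput test of the Ceph network is a quick sanity check. The sketch below assumes iperf3 is installed on the nodes and uses an example IP.

```
# On one Ceph node: start an iperf3 server
iperf3 -s

# On another node: 30-second TCP test against it (example IP)
# A healthy 2.5 GbE link should land somewhere around 2.3-2.4 Gbit/s
iperf3 -c 10.10.10.2 -t 30
```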
Ceph from Non-Ceph Proxmox Node
This test is similar to the previous one. The only difference is that the VM now runs on a fourth Proxmox node that is not part of the Ceph cluster; it simply accesses the pool as a Ceph RBD client over the same dedicated 2.5 Gbit/s network.
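On the external node this boils down to an RBD storage entry pointing at the cluster's monitors. A sketch of what that can look like in /etc/pve/storage.cfg follows; monitor IPs, pool name and storage ID are examples, and Proxmox expects the client keyring for an external cluster under /etc/pve/priv/ceph/<storage-id>.keyring.

```
rbd: ceph-external
    monhost 10.10.10.1 10.10.10.2 10.10.10.3
    pool vm-pool
    content images
    username admin
```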
- Sequential (1 MiB, iodepth 16, rw):
- ~94.9 MiB/s read, ~95.6 MiB/s write (≈ 0.093 GiB/s each way)
- Latency (clat avg): ~29 ms read, ~138 ms write
- read p95 ~70 ms, p99 ~97 ms
- write p95 ~249 ms, p99 ~296 ms
- 4k Random (70/30 randrw, iodepth 64):
- ~22.7 MiB/s read (~5.81k IOPS), ~9.8 MiB/s write (~2.50k IOPS)
- Latency (avg): ~3.17 ms read, ~18.1 ms write
The results are similar to the integrated Ceph node; the difference from cluster membership is small here, indicating that the network path to the OSDs and the cluster topology matter more than membership alone.
Local SATA SSD
For this test the VM runs on a locally installed Intenso SATA SSD that is fully mapped as a local LVM datastore.
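Setting this up is quick; a minimal sketch, assuming the SSD appears as /dev/sdb and using example names for the volume group and storage ID:

```
# Prepare the spare SATA SSD and expose it to Proxmox as an LVM datastore
pvcreate /dev/sdb
vgcreate vg_sata /dev/sdb
pvesm add lvm sata-lvm --vgname vg_sata --content images
```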
- Sequential (1 MiB, iodepth 16, rw):
- ~200 MiB/s read, ~199 MiB/s write (≈ 0.195 GiB/s each way)
- Latency (clat avg): ~38.5 ms read, ~41.3 ms write
- read p95 ~52 ms, p99 ~264 ms
- write p95 ~54 ms, p99 ~262 ms
- 4k Random (70/30 randrw, iodepth 64):
- ~35.3 MiB/s read (~9.0k IOPS), ~15.2 MiB/s write (~3.9k IOPS)
- Latency (clat avg): ~4.8 ms read, ~5.2 ms write
As expected, the local SATA SSD beats every network-dependent backend in both sequential throughput and random IOPS.
Local NVMe
For this test the VM runs on a locally installed Kingston M.2 2280 NVMe SSD that also serves as the Proxmox boot drive.
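Since the NVMe drive doubles as the boot disk, the VM disk presumably lives on the default local-lvm thin pool created by the Proxmox installer. A quick way to confirm which physical device backs a storage:

```
# List configured storages and their usage
pvesm status

# Map block devices to models to see which disk holds the LVM volumes
lsblk -o NAME,MODEL,ROTA,MOUNTPOINT
```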
- Sequential (1 MiB, iodepth 16, rw):
- ~1.14 GiB/s read, ~1.13 GiB/s write
- Latency (clat avg): ~4.6 ms read, ~9.4 ms write
- read p95 ~13.7 ms, p99 ~19.8 ms
- write p95 ~21.4 ms, p99 ~26.1 ms
- 4k Random (70/30 randrw, iodepth 64):
- ~610 MiB/s read (~156k IOPS), ~261 MiB/s write (~67k IOPS)
- Latency (clat avg): ~0.30 ms read, ~0.24 ms write
This is the best-case scenario for a single VM disk: direct NVMe with no network or cluster overhead. Latency and throughput both exceed the shared-storage backends by a large margin.
SAN LVM
For this test I connected a datastore from an IBM V5100 over an 8 Gbit/s FC SAN. This infrastructure is quite a few years old; current hardware performs considerably better. It is important to know that Fibre Channel is a dedicated protocol optimized for IO-intensive traffic. It uses its own switches, SFPs and HBAs (common speeds: 8 Gbit/s, 16 Gbit/s, 32 Gbit/s).
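The LUN itself is consumed through multipath and then exposed as shared LVM, roughly along the lines of the sketch below; device names, volume group and storage ID are examples.

```
# Check that the V5100 LUN is visible with all FC paths
multipath -ll

# Initialise the multipath device for LVM and add it as a shared datastore
pvcreate /dev/mapper/mpatha
vgcreate vg_san /dev/mapper/mpatha
pvesm add lvm san-lvm --vgname vg_san --content images --shared 1
```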
- Sequential (1 MiB, iodepth 16, rw):
- ~850 MiB/s read/write
- Latency (clat avg): ~3.1 ms read, ~15.6 ms write
- read p95 ~5.2 ms, p99 ~6.8 ms
- write p95 ~22.7 ms, p99 ~25.3 ms
- 4k Random (70/30 randrw, iodepth 64):
- ~208 MiB/s read (~53 k IOPS), ~89 MiB/s write (~23 k IOPS)
- Latency (clat avg): ~0.70 ms read, ~1.14 ms write
- read p99 ~1.4 ms
- write p99 ~2.0 ms
Modern enterprise external storage analyzes IO behaviour and intelligently prefetches blocks into cache. That's why these generic benchmarks say little about its real-world performance.
Table overview
| Storage Backend | Network | Seq Read | Seq Write | Rand 4k Read | Rand 4k Write | Typical Use Case |
|---|---|---|---|---|---|---|
| Synology NAS (SSD, NFS) | 1 Gbit/s | ~39 MiB/s | ~41 MiB/s | ~5.5 k IOPS | ~2.3 k IOPS | Light VMs, backups |
| Ceph (node in cluster) | 2.5 Gbit/s | ~97 MiB/s | ~98 MiB/s | ~6.6 k IOPS | ~2.9 k IOPS | Shared VM storage |
| Ceph (external node) | 2.5 Gbit/s | ~95 MiB/s | ~96 MiB/s | ~5.8 k IOPS | ~2.5 k IOPS | Flexible placement |
| Local SATA SSD | – | ~200 MiB/s | ~199 MiB/s | ~9.0 k IOPS | ~3.9 k IOPS | General-purpose VMs |
| Local NVMe SSD | – | ~1140 MiB/s | ~1135 MiB/s | ~156 k IOPS | ~67 k IOPS | IO-intensive workloads |
| FC SAN V5100 | 8 Gbit/s (FC) | ~850 MiB/s | ~850 MiB/s | ~53 k IOPS | ~23 k IOPS | Enterprise workloads |
Test configs
Seq-RW Test:
[global]
ioengine=libaio
rw=readwrite
bs=1m
size=10G
iodepth=16
direct=1
runtime=60
time_based
group_reporting
filename=/path/to/rbd/device/or/filesystem/file
[test]
Rand4k Test:
[global]
ioengine=libaio
rw=randrw
rwmixread=70
bs=4k
size=8G
iodepth=64
runtime=60
direct=1
time_based
group_reporting
filename=/path/to/rbd/device/or/filesystem/file
[test]
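These job files can be run directly with fio; a minimal invocation could look like the sketch below. The file names (seq-rw.fio, rand4k.fio) are placeholders for the two configs above, and the exact JSON field paths depend on the fio version; recent releases report completion latency under clat_ns with percentile keys like "99.000000".

```
# Run both profiles and keep the raw JSON output for later analysis
fio --output-format=json --output=seq-rw.json seq-rw.fio
fio --output-format=json --output=rand4k.json rand4k.fio

# Example: pull read bandwidth (KiB/s), IOPS and the p99 completion latency (ns) with jq
jq '.jobs[0].read | {bw_kib_s: .bw, iops: .iops, clat_p99_ns: .clat_ns.percentile["99.000000"]}' rand4k.json
```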