Re: Performance issues

As you rightly point out, at 110 MB/s it sounds very much like the traffic is going through the wrong interface or being limited.
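
For reference, 1 Gbit/s works out to about 125 MB/s raw, so roughly 110-118 MB/s after protocol overhead, which lines up with the numbers you are seeing.

If you want to rule out the wrong-interface theory, a quick sketch of checks (assuming the cluster network should be 10.0.0.0/24 as you describe; 10.0.0.11 below is just a placeholder address for one of the OSD hosts):

    # what Ceph thinks the networks are
    ceph config get osd public_network
    ceph config get osd cluster_network

    # which interface the kernel would actually route that traffic over
    ip route get 10.0.0.11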

So am I correct in my reading that this is a virtual Ceph environment running on Proxmox?

What do you mean by this statement? "All Ceph drives are exposed and an NFS mounted NVME drive."

Do I take this to mean that your 4 servers are all mounting the same NVME device over NFS? Just a bit confused as to the exact hardware setup here.

What is the performance you can get from a single Ceph OSD? Just do a simple dd read (not write) from an OSD drive.
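
For example, a rough sketch (the device path is just a placeholder; substitute the actual NVMe device backing one of the OSDs):

    # sequential read from the raw device, bypassing the page cache
    dd if=/dev/nvme0n1 of=/dev/null bs=4M count=1024 iflag=direct status=progress

A single local NVMe read should come in well above the ~110 MB/s you are seeing from rados bench.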

Darren
 

> On 2 Aug 2025, at 15:25, Ron Gage <ron@xxxxxxxxxxx> wrote:
> 
> Hello from Detroit MI:
> 
> I have been doing some limited benchmarking of a Squid cluster. The arrangement of the cluster:
> Server        Function
> c01             MGR, MON
> c02             MGR, MON
> o01            OSD
> o02            OSD
> o03            OSD
> o04            OSD
> 
> Each OSD has 2 x NVME disks for Ceph, each at 370 Gig
> 
> The backing network is as follows:
> ens18        Gigabit, mon-ip (192.168.0.0/23) regular MTU (1500)
> ens19        2.5 Gigabit, Cluster Network (10.0.0.0/24) Jumbo MTU (9000)
> 
> Behind all this is a small ProxMox cluster.  All Ceph machines are running on a single node.  All Ceph drives are exposed and an NFS mounted NVME drive.  All Ceph OSD drives are mounted with no cache and single controller per drive.  Networking bridges are all set to either MTU 9000 or MTU 1500 as appropriate.
> 
> iPerf3 is showing 2.46 Gbit/sec between servers c01 and o01 on the ens19 network.  Firewall is off all the way around.  OS is CentOS 10.  SELinux is disabled.  No network enhancements have been performed (increasing send/rcv buffer sizes, queue lengths, etc.).
> 
> The concern given all this: rados bench can't exceed 110 MB/s in any test.  In fact if I didn't know better I would swear that the traffic is being either throttled or is somehow routing through a 1Gbit network.  The numbers coming back from rados bench look like saturation at Gigabit and show no evidence of being on a 2.5 Gbit network.  Monitoring at both the Ceph and Proxmox consoles confirms the same.  Cluster traffic is confirmed to be going out ens19 - tested via tcpdump.
> 
> Typical command line used for rados bench: rados bench -p s3block 20 write
> 
> What the heck am I doing wrong here?
> 
> Ron Gage
> 
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx



