SMB & NFS speed

I have an Archer C7 v2 running the OpenWrt build from https://github.com/gwlim/openwrt-sfe-flowoffload/tree/master/MAR-2020 (because in the past I had very unreliable WiFi with stock OpenWrt).

I have now attached a USB hard drive (ext4) to the router. Read and write speed is probably limited by USB 2.0 (about 25 MiB/s for both).
To set up the drive for network access I initially used NFS (server version 2.3.4-3), because I read that it is lighter on the CPU and recommended when no Windows system is involved.
Speed is 13 MiB/s read and 15 MiB/s write, and it seems to be limited by the CPU (about 30% usage by ksoftirqd, and most of the remainder up to 100% by the 8 nfsd processes). Options such as version, async, wsize and rsize don't change this.
I then tried Samba (server version 3.6.25-14) and got 14 MiB/s read and 21 MiB/s write when mounting with vers=1.0. The CPU does not seem to be the limiting factor here (read speed is limited by WiFi speed; in iperf3 I get 150-170 Mbit/s from router to client).
Using SMB version 2 gives 11 MiB/s read and 16 MiB/s write (definitely CPU limited, but still about as fast as NFS).
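
For comparison, the iperf3 figure is in Mbit/s while the transfer speeds are in MiB/s. A small sketch of the conversion (the function name is just for illustration, and it assumes iperf3's "Mbit" means 10^6 bits):

```python
def mbit_to_mib_per_s(mbit_per_s: float) -> float:
    """Convert megabits per second (10^6 bits) to mebibytes per second (2^20 bytes)."""
    bytes_per_s = mbit_per_s * 1_000_000 / 8
    return bytes_per_s / (1024 * 1024)


# The 150-170 Mbit/s range measured with iperf3 (router to client)
for rate in (150, 170):
    print(f"{rate} Mbit/s = {mbit_to_mib_per_s(rate):.1f} MiB/s")
# prints:
# 150 Mbit/s = 17.9 MiB/s
# 170 Mbit/s = 20.3 MiB/s
```

So the WiFi link tops out at roughly 18-20 MiB/s of raw TCP throughput in that direction, before any file-sharing protocol overhead.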

So my question is: Why is NFS slower/heavier on CPU? Is this normal, or can I improve it somehow?
I am OK with the speed of SMB v1, but I would like to avoid it because I read that it is rather insecure (although this might not matter on a private network).

Update:
vsftpd read speed is the same as SMB v1; write is 20 MiB/s, with vsftpd using somewhat more CPU than smbd when writing (62% smbd, 68% vsftpd, tested several times).
So I will not get more than that, but I still don't understand why NFS is considerably slower...

NFS speed increases to SMB levels when I reduce the number of NFS threads.
Why are more threads so much slower, and can it cause any problems if I have the number set to 2 or 3 instead of 8?
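
For anyone who wants to check the thread count while testing: on a Linux NFS server it is normally exposed in /proc/fs/nfsd/threads. A minimal sketch, assuming the nfsd filesystem is mounted at that path (which may not hold on this OpenWrt build); the helper names are made up for illustration, and writing needs root:

```python
from pathlib import Path

# Assumed location of the kernel NFS server's thread-count file.
NFSD_THREADS = Path("/proc/fs/nfsd/threads")


def read_nfsd_threads(path=NFSD_THREADS):
    """Return the number of nfsd threads reported in the given file."""
    return int(Path(path).read_text().split()[0])


def write_nfsd_threads(count, path=NFSD_THREADS):
    """Request a new nfsd thread count by writing it to the given file (needs root)."""
    if count < 1:
        raise ValueError("thread count must be at least 1")
    Path(path).write_text(f"{count}\n")


try:
    print("nfsd threads:", read_nfsd_threads())
except (OSError, ValueError, IndexError):
    # File missing or unreadable: no kernel NFS server on this machine.
    print(f"{NFSD_THREADS} not available here")
```

A change made this way does not survive a restart of the NFS server, so it is only useful for quick comparisons between benchmark runs.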

perf is your friend (not easy, but it has all the answers you need)...

All hardware is different (and likely so are the software, filesystem, protocol, option and OS scheduler setups)... Rather than trying to understand each setting, it is best to:

  • start with the most common configuration
  • benchmark it, changing one parameter a few steps at a time
  • settle on something quasi-optimal
  • rinse and repeat for another config parameter
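
The benchmarking step could be sketched roughly like this in Python, run on the client against the mounted share. This is only an illustration: the function names and file name are invented, the demo uses a temporary local directory in place of a real mount point, and the read figure can be inflated by the client's page cache unless the cache is dropped or the share is remounted between runs.

```python
import os
import tempfile
import time

CHUNK = 1024 * 1024  # 1 MiB per write/read call


def write_speed(path, size_mib):
    """Write size_mib MiB to path and return the throughput in MiB/s."""
    buf = os.urandom(CHUNK)
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(size_mib):
            f.write(buf)
        f.flush()
        os.fsync(f.fileno())  # make sure the data has really left the client
    return size_mib / (time.perf_counter() - start)


def read_speed(path):
    """Read path sequentially and return the throughput in MiB/s."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while chunk := f.read(CHUNK):
            total += len(chunk)
    return total / CHUNK / (time.perf_counter() - start)


def benchmark(directory, size_mib=64, runs=3):
    """Return a list of (write MiB/s, read MiB/s) tuples, one per run."""
    path = os.path.join(directory, "bench.tmp")
    results = []
    try:
        for _ in range(runs):
            results.append((write_speed(path, size_mib), read_speed(path)))
    finally:
        if os.path.exists(path):
            os.remove(path)
    return results


# Demo on a temporary directory; point this at the NFS/SMB mount instead.
with tempfile.TemporaryDirectory() as demo_dir:
    for w, r in benchmark(demo_dir, size_mib=8, runs=2):
        print(f"write {w:.1f} MiB/s, read {r:.1f} MiB/s")
```

Running it a few times per setting (thread count, rsize/wsize, protocol version) and changing only one thing between runs keeps the results comparable.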

The higher the software level, the easier it is to resolve issues and find simple answers...