Samba SMBD limited to 25% CPU DLINK 860l

Hey geeks!
I've got a DLink 860l B1 running OpenWRT 18.06.5.
I've got a 1.5TB USB 3.0 HDD, formatted as ext4, attached to it. The router runs Samba, Transmission, block-mount and everything else needed to recognize and mount ext4 partitions.
Everything is fine ... mostly
Whenever I copy a file from my PC (Windows 10) to the shared folder (or vice versa) through Samba, the transfer tops out at about 20-25MB/s, and the smbd -F process sits at 25% CPU with a VSZ of ~3%. It won't go any higher, even with nothing else running.
I've tried some socket options tricks and the 'use sendfile = no' trick, but nothing makes a difference.
I've checked the Samba wiki page and GitHub, but nothing seems to get the process anywhere over 25%, and memory availability is not a problem.
I should mention that other processes like Transmission can easily reach 40% and more. So there must be a bottleneck in OpenWRT, or in the Samba configuration, that I cannot figure out.

Please advise. ThxOxO

Maybe check out prlimit...

opkg install prlimit
prlimit -p $(pidof smbd)
cat /proc/sys/vm/min_free_kbytes
cat /proc/sys/fs/file-max

( note: click the edit pencil and paste the output below in between code tags ( < / > ) )

Am I doing it right?

root@OpenWrt:~# prlimit -p 3190
AS         address space limit                unlimited unlimited bytes
CORE       max core file size                 16777216  unlimited bytes
CPU        CPU time                           unlimited unlimited seconds
DATA       max data size                      unlimited unlimited bytes
FSIZE      max file size                      unlimited unlimited bytes
LOCKS      max number of file locks held      unlimited unlimited locks
MEMLOCK    max locked-in-memory address space 65536     65536     bytes
MSGQUEUE   max bytes in POSIX mqueues         819200    819200    bytes
NICE       max nice prio allowed to raise     0         0
NOFILE     max number of open files           16404     16404     files
NPROC      max number of processes            966       966       processes
RSS        max resident set size              unlimited unlimited bytes
RTPRIO     max real-time priority             0         0
RTTIME     timeout for real-time tasks        unlimited unlimited microsecs
SIGPENDING max number of pending signals      966       966       signals
STACK      max stack size                     8388608   unlimited bytes

As you can see, samba has unlimited resources.

About the bottleneck: if that were true, Samba would be using 100% of the resources with 75% of it wasted on "wait", which is clearly not your case.

Yea... seems like it...
So what other troubleshooting can I do? Other people's samba is running over 25% so it can't be normal running like it does for me.

To me it sounds like your network connection is starving way before Samba can max out any CPU. Also: Samba not running at 100% does not mean that it's limited, it means it's doing its job efficiently.

I am wired to the router and (not relevant, but) also have a Gigabit network connection and can up/down well over 900Mbps. I can copy/paste between computers on the network at speeds greater than 50MB/s. How can it be efficient if it's not using anywhere near the available resources? CPU is capped at 25% while transferring at 20-25MB/s, yet the USB3 disk can reach above 70MB/s no sweat...
What do you guys suggest to try next?
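To put those figures in the same unit, here is a minimal sketch of the conversion (the helper name `mbyte_to_mbit` is made up for illustration, it is not an existing tool). 1 MB/s is 8 Mbit/s, so the Samba transfer only uses roughly 160-200 Mbit/s of the gigabit link.

```shell
# Convert a throughput in MB/s to Mbit/s (1 byte = 8 bits).
# mbyte_to_mbit is a hypothetical helper name, not a standard command.
mbyte_to_mbit() {
    echo $(( $1 * 8 ))
}

mbyte_to_mbit 25   # Samba transfer speed reported above, prints 200
mbyte_to_mbit 70   # USB3 disk speed reported above, prints 560
```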

The MT7621 is a dual-core, dual-thread processor (what Intel calls Hyper-Threading), so Linux treats it as four "CPUs". A single-threaded process will max out at 25%.
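A rough sketch of that arithmetic, assuming the usage figure is reported as a share of all logical CPUs (the helper name `single_thread_cap` is made up for this example):

```shell
# Upper bound, in percent of total CPU, that one thread can reach
# when usage is reported across all logical CPUs.
# single_thread_cap is a hypothetical helper, not a standard command.
single_thread_cap() {
    echo $(( 100 / $1 ))
}

single_thread_cap 4   # four logical CPUs, prints 25

# On the router itself the logical CPU count can be read with:
#   grep -c '^processor' /proc/cpuinfo
```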


I forgot that detail...
Maybe htop is clearer about the resource usage. If you're hitting 100% on one CPU, then there is nothing to do, because a file transfer is not multithreaded and cannot easily be made so.
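If htop is not installed, the same per-CPU figure can be worked out from two samples of /proc/stat. This is only a sketch: `core_busy_pct` is a made-up helper, and it counts idle and iowait time as "not busy".

```shell
# Print the busy percentage of one CPU between two /proc/stat samples.
# Arguments: two "cpuN user nice system idle iowait irq softirq steal" lines
# taken some time apart. core_busy_pct is a hypothetical helper.
core_busy_pct() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        {
            total = 0
            # Fields 2..9 are user..steal; guest fields are skipped because
            # they are already included in user and nice.
            for (i = 2; i <= 9 && i <= NF; i++) total += $i
            t[NR] = total
            d[NR] = $5 + $6          # idle + iowait
        }
        END {
            dt = t[2] - t[1]
            di = d[2] - d[1]
            if (dt > 0) printf "%d\n", 100 * (dt - di) / dt
            else print 0
        }'
}

# Synthetic samples: 100 ticks elapsed, none of them idle.
core_busy_pct 'cpu3 100 0 100 700 100 0 0 0' 'cpu3 190 0 110 700 100 0 0 0'

# Live use on the router, sampling CPU 3 one second apart:
#   a=$(grep '^cpu3 ' /proc/stat); sleep 1; b=$(grep '^cpu3 ' /proc/stat)
#   core_busy_pct "$a" "$b"
```

A value near 100 for a single CPU while the overall figure stays around 25% matches the single-thread explanation above.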

Above is the htop output while transferring a single big file through Samba. Here the process seems to be reaching 100% on core 3... So the limitation is in the architecture of Samba in this case? Are there multicore / multithreaded sharing options?

File transfers are limited to a single process, so what you're seeing is expected. There's an experimental kernel module for SMB transfers in master; however, I do not know how well it works or scales.

18.06.x only has samba3 in packages, so you have these options:

  1. Try 19.07-rc1, which has samba4 in it, if it fits on your device.
  2. Wait a bit and try the new cifsd server with 19.07.x. (I need to backport it and enable the luci gui)
  3. Use a snapshot build, which has samba4 + cifsd (no gui atm) in it.

PS: Btw, in samba4 'use sendfile = no' only works if set together with

aio read size = 0
aio write size = 0

Will add a luci option for this in some of the next updates for samba4.

The SMB protocol itself does not easily allow multithreaded transfers. CIFS doesn't implement them either.
There are more advanced and truly multithreaded alternatives such as aria, but who uses Aria for local transfers? Also, it doesn't play as nicely with Windows.

I have been using cifsd for a few months now without any problems whatsoever, on my My Book Live (apm821xx) it delivers around 50% more throughput (~35 MB/s) than samba36 (~23 MB/s).

Care to explain? SMB is more or less standard these days, even on macOS; you can do NFS if that's your thing, however. No idea why you'd want to use BitTorrent on a local network for single clients, as it would be highly inefficient.

SMB/CIFS being single-threaded on a CPU that barely touches 1.00 GHz is inefficient. If you really want to push your CPU to the top, use aria. It's that easy, but clearly you are not good at understanding.

The only way of providing "multithread" to SMB/CIFS is multichannel:

I know I should not post two consecutive things, but this might be worth it. This is what Microsoft says about multichannel SMB. This is clearly the solution to the given problem, irrespective of the server/client configuration. I made a screenshot to keep the information handy, hence the new post.

To clarify, it's experimental at best on Linux and should be avoided.

See: (you can find a lot more)

To clarify, the article clearly says Windows server and client.

Thanks for the useless detail.

That escalated quickly...

I don't understand the focus on CPU and "multithread" here, since there is not much we can do about it?
Samba is capable of delivering 50-200 MB/s on low-end CPUs, but Samba3 is quite old and smbv1/2 is more CPU hungry compared to samba4/smbv3.
As an example, here is what I get on my wrt-1200ac:

I'm pretty happy, since 80-95 MB/s is pretty much what I also get sharing directly between two Win10 PCs with core-i7 CPUs in the same network.

PS: From my experience, one of the biggest bottlenecks is the actual USB/esata port on the device itself and how it's implemented. I got much better speeds switching from USB3 to the esata port. Experimenting with Linux filesystems and sector sizes can also help.
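To check the port and disk on their own, one option is a plain local read on the router that bypasses Samba and the network entirely. The sketch below is only a rough measurement under assumptions: `local_read_mbs` is a made-up helper, the example path is hypothetical, and timing uses whole seconds, so it needs a large file (ideally bigger than the router's RAM, so the page cache does not flatter the result).

```shell
# Rough local read test: read FILE from disk to /dev/null and print
# the throughput in whole MB/s.
# local_read_mbs is a hypothetical helper, not a standard command.
local_read_mbs() {
    size=$(wc -c < "$1")                 # file size in bytes
    start=$(date +%s)
    dd if="$1" of=/dev/null bs=1024k 2>/dev/null
    end=$(date +%s)
    elapsed=$(( end - start ))
    [ "$elapsed" -gt 0 ] || elapsed=1    # avoid dividing by zero on tiny files
    echo $(( size / 1048576 / elapsed ))
}

# Example with an assumed mount point and file name, adjust to your setup:
#   local_read_mbs /mnt/sda1/bigfile.bin
```

If the local figure is already close to what Samba delivers, the port or disk is the likely limit rather than the CPU.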
