I have a question about how OpenWrt and the Linux kernel manage UDP sockets. I'm using OpenWrt 18.06.1 with kernel version 4.14.63.
I have a C application, cross-compiled to work with OpenWrt on the target hardware, that is using sockets to send UDP packets.
When sending data over the socket I can use sendto() or write(). I noticed that if I try to write to a full UDP buffer (for instance, if I try to write more data than the wireless network can deliver), these calls actually block my application.
But the question is: how long does the blocking last? As far as I was able to observe, these calls seem to block for much longer than is required to free the space for a single packet, letting the buffer drain much further. This seems to influence the packet drop in the kernel, before the WNIC, when trying to offer a large amount of data.
Does this depend on the Linux kernel scheduler? Does it depend on internal operations that copy packets from the UDP buffer towards the WNIC, maybe in blocks only?
I just tried this on a Linux laptop: the behaviour is actually the same.
I was also able to test this behaviour in a better way with the network measurement program iPerf, in UDP mode, by opening a client on one Linux laptop and a server on another Linux laptop connected to the same network, trying to offer a large amount of traffic (with -b 100M) in order to saturate the buffer.
The result is reported in the following plot, obtained using a modified iPerf version, patched to output two values at each iteration: the internal iPerf delay and the number of bytes in the UDP socket buffer. If I understood it correctly, the delay is the time that should be kept between packets to respect the user-specified bandwidth (for instance, 100 Mbit/s), and it depends on the previous loop time: a negative value means that the program should run x ms faster in the next iteration because the previous iteration was slow.
Due to the algorithm inside iPerf, if I understood it properly, the delay value is reset when it reaches a certain threshold, which in my case was set to > 50 ms of loop time (< "-50 ms" of delay).
The write() call seems to be blocked for enough time to free about 159.64 kB, which is much more than a single packet (1470 B of payload, as set by iPerf).
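The post does not say how the patched iPerf reads the buffer occupancy. One way to do it on Linux, as an assumption, is the SIOCOUTQ ioctl described in udp(7), compared against SO_SNDBUF. The helper names below are hypothetical, and as far as I know the reported value includes the kernel's per-packet bookkeeping overhead, so it is not purely payload bytes.

```c
#include <linux/sockios.h>   /* SIOCOUTQ */
#include <sys/ioctl.h>
#include <sys/socket.h>

/* Hypothetical helper: returns the size of the socket's local send queue
 * in bytes as reported by the kernel, or -1 on error. */
static int send_queue_bytes(int fd)
{
    int queued = 0;

    if (ioctl(fd, SIOCOUTQ, &queued) < 0)
        return -1;
    return queued;
}

/* Hypothetical helper: returns the send buffer limit (SO_SNDBUF) in bytes,
 * or -1 on error. */
static int send_buffer_size(int fd)
{
    int size = 0;
    socklen_t len = sizeof size;

    if (getsockopt(fd, SOL_SOCKET, SO_SNDBUF, &size, &len) < 0)
        return -1;
    return size;
}
```

Calling send_queue_bytes() right before and right after a blocking write() gives an estimate of how much the queue drained while the application was asleep.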
Moreover, if a socket timeout is set, I observed that it seems to expire properly (for instance, with a 20 ms timeout, the loop time never seems to take more than 20 ms plus a small amount), but without returning any error, since the buffer now has room to accommodate much more than a single packet (thanks to write() blocking the application for a good amount of time).
The behaviour seems to be exactly the same on OpenWrt.
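For reference, a send timeout like the 20 ms one mentioned here can be set with the SO_SNDTIMEO socket option. This is a sketch with hypothetical helper names, and the actual application may set it differently. According to socket(7), a send that times out without transferring any data returns -1 with errno set to EAGAIN or EWOULDBLOCK.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>

/* Hypothetical helper: sets the send timeout of a socket in milliseconds.
 * Returns 0 on success, -1 on error (as setsockopt() does). */
static int set_send_timeout_ms(int fd, long ms)
{
    struct timeval tv;

    tv.tv_sec = ms / 1000;
    tv.tv_usec = (ms % 1000) * 1000;
    return setsockopt(fd, SOL_SOCKET, SO_SNDTIMEO, &tv, sizeof tv);
}

/* Hypothetical helper: given the return value of a send call and the errno
 * saved right after it, returns 1 if the call failed because the timeout
 * expired, 0 otherwise. */
static int send_timed_out(ssize_t rc, int err)
{
    return rc < 0 && (err == EAGAIN || err == EWOULDBLOCK);
}
```

With this in place, a loop can call write() or sendto(), save errno immediately, and count how often send_timed_out() is true.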
Your expectation that buffer space should get freed up one packet at a time is probably incorrect, since packets are transmitted in larger groups due to MPDU aggregation etc. Also, it would be less efficient to wake up the application so frequently just to refill the socket buffer when it still holds a significant amount of data to send.
Thank you for your reply! This completely makes sense! So, as far as I understood, the blocking time, together with how much the buffer is freed up in the meantime, depends on both the OS and the driver/WNIC (?), and it is typically more than the time needed to free up the space for a single packet.
To get more detailed data, though, the kernel code probably has to be analyzed in more detail...
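On the second point, as far as I can tell from net/core/sock.c in 4.14-era kernels, the default write-space callback (sock_def_write_space) only wakes a sleeping writer once the memory charged to the send queue has dropped to half of the send buffer limit or less. The function below is a simplified user-space model of that check under this assumption, not kernel code, and its name and parameters are made up for illustration.

```c
#include <stdbool.h>

/* Simplified model (assumption, see above): a writer blocked on a full send
 * buffer is woken only when twice the memory currently charged to the send
 * queue no longer exceeds the buffer limit, i.e. the queue is at most half
 * full.
 *   wmem_alloc: bytes currently charged to the socket's send queue
 *   sndbuf:     send buffer limit in bytes */
static bool blocked_writer_is_woken(unsigned int wmem_alloc,
                                    unsigned int sndbuf)
{
    return (wmem_alloc << 1) <= sndbuf;
}
```

Under this model, freeing the space of a single packet from a full buffer is not enough to wake the sender, which would be consistent with the buffer draining by far more than one packet per blocking call.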