OpenWrt 21.02.0 third release candidate

Right, but take a look at the LuCI commits; there have been a ton of fixes since rc3. If there is a fix you need, you can always install a 21.02 snapshot.

The LuCI commits you refer to are also two days old now; in the beginning it was only hours…
The master branch seems to be where the work is happening now.

But what I meant was more that this forum thread is at 160 posts now, and we are actually discussing how a network router's CPU and switch work more than actual faults. We probably need a big DSA thread where everyone can discuss how DSA works, because it seems to be a hot topic.

So from my viewpoint it feels like 21.02 needs to try its wings or it will probably never really fly, because it will never be fault-free. In the best of worlds, that version will be named 21.02.7 or maybe 8.
Alternatively what exactly are we waiting for now?

I would be very happy to see the bug regarding DHCPv6 fixed soon (no leases and no IPv6-PD if leasetime <12h)

It appears to me that the processing overhead for DSA's switch tags is a little bit larger than for VLAN tags. If I understand DSA correctly the VLAN tags are replaced with the DSA tags, at least for the top VLAN tag. Do you happen to know how VLAN tagged ports forwarding several VLANs are handled by DSA? Does that also work with just a switch tag and the switch chipset inserts the corresponding VLAN tag for egress frames? Or does the CPU need to push the VLAN tag and the switch tag?

No worries! Ethernet traffic between switch ports is handled by the switch chipset, so you also get full throughput with DSA. And I'd like to add that WiFi bridged traffic has to pass through the CPU too, since the WiFi interfaces are connected to the CPU.

The 5.10 kernel is getting some speed improvements:

https://git.openwrt.org/?p=openwrt/openwrt.git;a=commit;h=64ed3d80567280e5cccb4c4642464223862dabc6

Sadly, 21.02 comes with 5.4.

This page shows what a DSA tag looks like for Marvell-based switch devices:

https://www.tcpdump.org/linktypes/LINKTYPE_DSA_TAG_DSA.html

Good post, didn't notice that on the rc2 thread. Can't we just install kernel 5.10 on most devices though and get these improvements? I know many people are running 5.10 on mvebu with little to no issues.

I see. Marvell's switch tag (DSA and EDSA type) can carry the internal data of the top VLAN tag and has a flag to indicate the tagged/untagged state. Qualcomm's tag is just two bytes and also has a flag for tagged frames, but doesn't include any VLAN details. Broadcom's tag is four bytes and does not seem to carry any VLAN-related details. And the tag may be placed before the DA/SA MACs or directly after them.

If you need VLAN-tagged frames (some ISPs require PPPoE in a specific VLAN), then it depends on the switch chipset whether the CPU has to push/pop just the switch tag or the VLAN tag as well. At first glance I'd prefer a Marvell switch chipset for that.

This sounds quite like the issue that I reported.

Is there more detail anywhere?

Hi @CharlesJC
It is not the same issue.
The problem I reported (second reply here, and also in the bug report system) affects the LAN side only. If you set leasetime (an 'IPv4 only' variable, if I can put it that way!) to any value less than 12h, you lose the DHCPv6 service: no DHCPv6 lease is given to any host (Windows, Linux, etc.), so there is also no IPv6-PD to downstream routers. RA still works, though, so hosts can still generate their SLAAC addresses.

This bug seems to have been there for a long time (rc1, rc2, rc3, snapshot ...).
It is not arch dependent (tested on x86_64, ramips, ath79, IPQ4x ...).
It can be reproduced even without an upstream connection (a fresh install plus a ULA prefix is enough to test).
It is still there with yesterday's snapshot (tested on an Archer C7 v2), and there has been no reply from anyone.
Perhaps I am the only guy on Earth who needs a lease time of less than 12h???

Does anyone know how to solve this irqbalance error? irqbalance reports that it is enabled but stopped. The syslog states: daemon.warn /usr/sbin/irqbalance: Daemon couldn't be bound to the file-based socket.

I tried mkdir -p /run and service irqbalance restart, but it is still stopped. Is irqbalance really worth it? I was testing my WireGuard connection (300 Mbps) to my Raspberry Pi 4 and saw a "hrtimer took XXXns" message, so I thought irqbalance might help.

Thanks for any advice

It looks like the issue was reported here: https://github.com/openwrt/packages/issues/15903

Does anyone know of any workarounds in the meantime?

Make sure the option enable is set to 1 in /etc/config/irqbalance. It might be a bug, but on my mvebu it seems to be working, as mwlwifi and some others are moved from CPU0 to CPU1 when I check /proc/interrupts.
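If you want to check this without eyeballing the table, here is a rough Python sketch that parses /proc/interrupts-style text and reports which CPU handles most of each IRQ. The sample text and the device names in it are made up for illustration (they are not from a real router), and the parser assumes the usual layout of a CPU header line followed by one "label: counts... description" line per IRQ.

```python
# Sketch only: SAMPLE and its device names are hypothetical, not real output.
SAMPLE = """\
           CPU0       CPU1
 24:       1200          3   GIC-0  40 Level  hypothetical-eth0
 87:          5      98000   GIC-0  61 Level  hypothetical-wifi
ERR:          0
"""


def parse_interrupts(text):
    """Return {irq_label: {"counts": [per-CPU counts], "desc": str}}."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    ncpu = len(lines[0].split())  # header line: CPU0 CPU1 ...
    table = {}
    for ln in lines[1:]:
        label, _, rest = ln.partition(":")
        fields = rest.split()
        counts = []
        # Leading numeric fields are the per-CPU counters; some rows
        # (e.g. ERR) have fewer counters than there are CPUs.
        for field in fields[:ncpu]:
            if not field.isdigit():
                break
            counts.append(int(field))
        desc = " ".join(fields[len(counts):])
        table[label.strip()] = {"counts": counts, "desc": desc}
    return table


def busiest_cpu(entry):
    """Index of the CPU with the highest count, or None if there are none."""
    counts = entry["counts"]
    if not counts:
        return None
    return counts.index(max(counts))


if __name__ == "__main__":
    for irq, entry in parse_interrupts(SAMPLE).items():
        print(irq, entry["desc"], "-> CPU", busiest_cpu(entry))
```

On a real device you would pass the contents of /proc/interrupts instead of SAMPLE, and compare the result before and after starting irqbalance.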

It depends a lot on your hardware and what you want to achieve with it.

First of all, modern kernels seem to do IRQ balancing fairly well on their own, at least on x86-based hardware. This is, afaik, why Debian stopped recommending irqbalance as part of their default kernel image since Debian 10 (buster) while it was still recommended in the previous release.

I also have ARM-based devices on which IRQ balancing does not seem to work well by means of the kernel itself. But here I found that irqbalance didn't seem to help either. If I install and enable it, IRQs are still almost exclusively handled by one CPU core. I have to manually assign CPU affinities to distribute IRQs of different hardware blocks across CPU cores. So, when you install it, you should actually verify whether it has any effect.
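For the manual assignment mentioned above: the kernel takes a hexadecimal CPU bitmask in /proc/irq/<N>/smp_affinity (bit 0 is CPU0, bit 1 is CPU1, and so on). A small Python sketch for building that mask follows. The IRQ number 87 is just a placeholder, the helper names are my own, and the sketch ignores the comma-separated mask format used on systems with more than 32 CPUs, which should not matter on typical router hardware.

```python
def affinity_mask(cpus):
    """Build the hex bitmask string for a list of CPU indices."""
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("CPU index must be non-negative")
        mask |= 1 << cpu  # bit n selects CPUn
    return format(mask, "x")


def affinity_command(irq, cpus):
    """Return the shell command that would pin the IRQ (nothing is executed)."""
    return f"echo {affinity_mask(cpus)} > /proc/irq/{irq}/smp_affinity"


if __name__ == "__main__":
    # Hypothetical IRQ number; look up the real one in /proc/interrupts.
    print(affinity_command(87, [1]))
```

The sketch only prints the command rather than writing to /proc, so you can review it first. Keep in mind that some IRQs cannot be moved at all on some hardware, in which case the write fails or is ignored.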

But what's even more important is that when it comes to networking, IRQ balancing is not always recommended because it may even increase latency or decrease throughput. This is because for certain tasks it is beneficial to have one CPU core handle the same packet stream because tracking of packets can be more efficient compared to when state information has to be shared or transferred to another core. So, all in all, irqbalance may help certain workloads, but you should really check whether it does in your case rather than install and forget about it.

For the reasons explained above, I would not recommend shipping it by default in general. It might help on certain targets and with certain applications, but that's not universally true.

Does anyone have an estimate of when a stable version of 21.02 will be released?

Also, on some hardware, some IRQs can only be serviced by a specific CPU.

I/O traffic should be bound to one CPU to decrease latency (CPU affinity). However, you could put the Wi-Fi on one CPU and the switch on the other. If you have lots of USB traffic, you could also move its interrupt to a less busy CPU. I am referring to mvebu, specifically the WRT3200ACM, here.

Yes, such manual optimization (pinning certain IRQ activity to specific cores) can be very useful. But this is not what irqbalance does (it tries to distribute IRQs over all cores, at least by default). And that's why I'm saying it depends on the use case and hardware whether irqbalance is actually beneficial.

I noticed something with Scheduled Tasks. We have a GUI to change the settings, but we still need to do extra work over SSH: run "/etc/init.d/cron restart" whenever we change something.
Is it possible to do this automatically every time the Save button is pressed in Scheduled Tasks?