Adding OpenWrt support for Xiaomi AX3600 (Part 1)

Without the smp_affinity settings, the transfer load landed only on core0, which was more or less maxed out.

I was just doing some tests as some users asked.

Sure, I was referring to the case where no core was maxed out.

You are right, but as a quick test I usually do it this way.

BTW the exact same wireless transfer maxes out gigabit on stock firmware.

Stock interrupts, just for those interested:

root@XiaoQiang:~# cat /proc/interrupts
           CPU0       CPU1       CPU2       CPU3
  3:     566786      98541      90428    3341065       GIC  20 Edge      arch_timer
 40:          1          0          0          0   msmgpio  34 Edge      soc:gpio_keys
 76:          0          0          0          0       GIC 270 Level     bam_dma
 77:        396          0          0          0       GIC 340 Level     msm_serial0
 79:          2          0          0          0       GIC 127 Level     78b5000.spi
 80:          0          0          0          0       GIC 357 Edge      q6_wdog_interrupt
 81:          0          0          0          0       GIC 344 Edge      7803000.sdcc1ice
 82:          3          0          0          0       GIC 354 Edge      smp2p
 83:          0          0          0          0       GIC 276 Edge      tzerror
 88:     134610          0          0          0       GIC 409 Edge      nss_empty_buf_sos
 89:      49247          0          0          0       GIC 410 Edge      nss_empty_buf_queue
 90:          0          0          0          0       GIC 411 Edge      nss-tx-unblock
 91:    7046628          0          0          0       GIC 412 Edge      nss_queue0
 92:          0       4603          0          0       GIC 413 Edge      nss_queue1
 93:          0          0       4503          0       GIC 414 Edge      nss_queue2
 94:          0          0          0      61867       GIC 415 Edge      nss_queue3
 95:          0          0          0          0       GIC 416 Edge      nss_coredump_complete
 96:          0          0          0          0       GIC 417 Edge      nss_paged_empty_buf_sos
 97:      62821          0          0          0       GIC 422 Edge      nss_empty_buf_sos
 98:          0          0          0          0       GIC 423 Edge      nss_empty_buf_queue
 99:          0          0          0          0       GIC 424 Edge      nss-tx-unblock
100:    6544252          0          0          0       GIC 425 Edge      nss_queue0
101:          0          0          0          0       GIC 426 Edge      nss_queue1
102:          0          0          0          0       GIC 427 Edge      nss_queue2
103:          0          0          0          0       GIC 428 Edge      nss_queue3
104:          0          0          0          0       GIC 429 Edge      nss_coredump_complete
105:          0          0          0          0       GIC 430 Edge      nss_paged_empty_buf_sos
106:          0          0          0          0       GIC  35 Level     watchdog bark
107:         88          0          0          0       GIC 353 Edge      qcom,glink-smem-native-xprt-modem
108:          0          0          0          0       GIC 239 Level     bam_dma
109:     100429          0          0          0       GIC 178 Level     bam_dma
111:          0          0          0          0       GIC  84 Edge      qcom-pcie-msi
132:          4          0          0          0       GIC 348 Edge      ce0
133:     622897          0          0          0       GIC 347 Edge      ce1
134:     135587          0          0          0       GIC 346 Edge      ce2
135:       7790          0          0          0       GIC 343 Edge      ce3
137:         60          0          0          0       GIC 443 Edge      ce5
139:       8323          0          0          0       GIC  72 Edge      ce7
140:   20467709          0          0          0       GIC  71 Edge      ce8
141:          0          0          0          0       GIC 334 Edge      ce9
142:          0          0          0          0       GIC 333 Edge      ce10
143:          0          0          0          0       GIC  69 Edge      ce11
147:          8          0          0          0       GIC 326 Edge      host2rxdma-monitor-ring3
148:          0          0          0          0       GIC 325 Edge      host2rxdma-monitor-ring2
149:          8          0          0          0       GIC 324 Edge      host2rxdma-monitor-ring1
150:          0          0          0          0       GIC 323 Edge      reo2ost-exception
151:          0          0          0          0       GIC 322 Edge      wbm2host-rx-release
152:        160          0          0          0       GIC 321 Edge      reo2host-status
153:          0          0          0          0       GIC 320 Edge      reo2host-destination-ring4
154:          0          0          0          0       GIC 271 Edge      reo2host-destination-ring3
160:          1      87577          0          0       GIC 263 Edge      ppdu-end-interrupts-mac3
161:          0          0          0          0       GIC 262 Edge      ppdu-end-interrupts-mac2
162:          1          0          0      73383       GIC 261 Edge      ppdu-end-interrupts-mac1
163:          3          0          0          0       GIC 260 Edge      rxdma2host-monitor-status-ring-mac3
164:          0          0          0          0       GIC 256 Edge      rxdma2host-monitor-status-ring-mac2
165:          3          0          0          0       GIC 255 Edge      rxdma2host-monitor-status-ring-mac1
167:          0          0          0          0       GIC 215 Edge      host2rxdma-host-buf-ring-mac2
169:          0          0          0          0       GIC 211 Edge      rxdma2host-destination-ring-mac3
170:          0          0          0          0       GIC 210 Edge      rxdma2host-destination-ring-mac2
171:          0          0          0          0       GIC 209 Edge      rxdma2host-destination-ring-mac1
176:          0          0          0          0       GIC 191 Edge      wbm2host-tx-completions-ring3
182:          0          0          0          0       GIC 216 Edge      tsens_interrupt
183:          4          0          0          0       GIC  47 Edge      cpr3
185:        995          0      37160          0       GIC 107 Level     wlan_pci
186:          3          0          0          0  pmic_arb 3211277 Edge      spmi-vadc
187:          0          0          1          0     smp2p   1 Edge      error_ready_interrupt
188:          0          0          0          0     smp2p   0 Edge      err_fatal_interrupt
189:          0          0          0          0     smp2p   3 Edge      stop_ack_interrupt
190:          0          0          0          0       GIC 172 Edge      xhci-hcd:usb1
IPI0:     23349      61300      76753      59157       Rescheduling interrupts
IPI1:         6         11         11          9       Function call interrupts
IPI2:         0          0          0          0       CPU stop interrupts
IPI3:         0          0          0          0       Timer broadcast interrupts
IPI4:        62          0          0          2       IRQ work interrupts
IPI5:         0          0          0          0       CPU wakeup interrupts
Err:          0
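
To compare dumps like the one above more quickly, here is a minimal sketch that sums the counters per CPU. It assumes the standard /proc/interrupts layout (a header row of CPU names, then one counter per CPU on each row); the function name and the shortened sample are only for illustration.

```python
def per_cpu_totals(text):
    """Sum interrupt counts per CPU from /proc/interrupts-style text."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    cpus = lines[0].split()              # header row: CPU0 CPU1 ...
    totals = [0] * len(cpus)
    for line in lines[1:]:
        _label, _, rest = line.partition(":")
        counts = rest.split()[:len(cpus)]
        # Skip rows such as "Err:" that do not carry one counter per CPU.
        if len(counts) < len(cpus) or not all(c.isdigit() for c in counts):
            continue
        for i, c in enumerate(counts):
            totals[i] += int(c)
    return dict(zip(cpus, totals))


# Four rows copied from the stock dump, plus the "Err:" row.
SAMPLE = """\
           CPU0       CPU1       CPU2       CPU3
 91:    7046628          0          0          0       GIC 412 Edge      nss_queue0
 92:          0       4603          0          0       GIC 413 Edge      nss_queue1
 93:          0          0       4503          0       GIC 414 Edge      nss_queue2
 94:          0          0          0      61867       GIC 415 Edge      nss_queue3
Err:          0
"""

if __name__ == "__main__":
    totals = per_cpu_totals(SAMPLE)
    grand = sum(totals.values())
    for cpu, n in totals.items():
        print(f"{cpu}: {n} ({100 * n / grand:.1f}%)")
```

Feeding it the full text of /proc/interrupts instead of SAMPLE gives the share of interrupts each core handled since boot.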

Ahh... We need to wait a bit more then.

With the affinity patches, the loads are spread more evenly. Hopefully I can get my AX-capable card within a week and we can do more proper tests to see where the limits are.
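
For anyone who wants to experiment by hand: /proc/irq/<n>/smp_affinity takes a hexadecimal CPU bitmask. The sketch below only builds the masks and prints the matching shell commands. The IRQ-to-core plan in it is purely hypothetical (IRQ numbers differ between stock and OpenWrt), and it is not what the affinity patches actually do.

```python
def cpu_mask(*cpus):
    """Return the hex bitmask that /proc/irq/<n>/smp_affinity expects."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu                 # bit N selects CPU N
    return format(mask, "x")


def affinity_commands(plan):
    """Turn {irq: (cpu, ...)} into shell commands; nothing is written here."""
    return [
        f"echo {cpu_mask(*cpus)} > /proc/irq/{irq}/smp_affinity"
        for irq, cpus in sorted(plan.items())
    ]


# Hypothetical plan: IRQ numbers are placeholders, check /proc/interrupts first.
PLAN = {100: (1,), 133: (2,), 140: (3,)}

if __name__ == "__main__":
    for cmd in affinity_commands(PLAN):
        print(cmd)
```

Printing the commands instead of writing to /proc keeps the sketch safe to run on any machine.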

On the RX decap offload: are we sure that it actually works? Are the mac80211 hooks present? (Referring to the discussion from last night.)

To actually see whether a router running OpenWrt is CPU-limited, we would need someone to run iperf3 between a wired computer and a WiFi client using a full-fledged (at least) 2x2 802.11ax card. If you can push 1 Gbit/s (the limit of gigabit Ethernet) without maxing out the CPU, then you are fine.

I finally received a QCA6391 card, so I can test PCI outside of the AX9000 as well.
Thanks to @sumo for the donation.

I think that everything needed for RX decap is here as the mac80211 support for it was merged in early 2021, so it was in 5.12 as well.
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/net/mac80211?h=v5.15.2&id=80a915ec4427f0083829f7e6518ee9f21521ee1e

Just noticed that these two lines get appended to /etc/sysctl.d/qca-nss-ecm.conf on every reboot, even if they already exist:

net.bridge.bridge-nf-call-ip6tables=1
net.bridge.bridge-nf-call-iptables=1

I had maybe 10 duplicate entries.

I can confirm that. I have way more than 10, and based on the last write timestamp, it happens at every sysupgrade or reboot (likely the former).

MOD: No, the two extra lines are added on every reboot...
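
The usual way to avoid this kind of growth is to append a line only when it is not already there. Below is a minimal sketch of that idea, which also collapses duplicates that have already piled up. It works on the file contents as a string; the function name is made up, and this is not the actual init script from the package.

```python
WANTED = [
    "net.bridge.bridge-nf-call-ip6tables=1",
    "net.bridge.bridge-nf-call-iptables=1",
]


def ensure_lines(existing, wanted=WANTED):
    """Return config text with each wanted line present exactly once."""
    seen = set()
    out = []
    for line in existing.splitlines():
        key = line.strip()
        if key in wanted and key in seen:
            continue                     # drop a repeated managed line
        seen.add(key)
        out.append(line)
    for line in wanted:
        if line not in seen:
            out.append(line)             # append only what is missing
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    bloated = "\n".join(WANTED * 10) + "\n"
    print(ensure_lines(bloated), end="")
```

Running it repeatedly on its own output changes nothing, which is the property the file is missing right now.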

Very welcome. Keep up the good work and let me know if I can be of any further help. Thanks!

@robimarko

A couple of hours ago more patches arrived in Kalle's repo, and look what I found:
ath11k: Fix ETSI regd with weather radar overlap

Looks exactly like our issue. Other patches were also added that might be worth a look.

My little bird already pointed me to that one a week ago. I applied it, but I honestly didn't see any difference in the reported rules.
Feel free to give it a go, I might have messed something up.

I repeated the same upload tests with the affinity settings: the speed is a lot higher and the CPU is now using all cores.


I could try testing the "RX decap" patch, but what exactly should it improve? Lower CPU usage in the LAN -> WiFi case?

I was happy a bit too soon about that patch... It only fixes a driver warning, not the wrong logic itself, which is likely not even on the driver's side. Nonetheless, this line gives some info away:

If the firmware (or the BDF) is shipped with these rules

The task would be to understand the reg.c logic a bit better, to make sure it cannot be fixed there. For example, can someone explain to me why, if a channel overlaps with the radar range, the driver changes its bandwidth instead of just applying DFS and a 600 sec CAC rule to the whole overlapping band? Most ETSI countries are allowed to use 5470 - 5725 @ 160, yet ath11k splits it into two parts and creates a separate band in the middle where the radar range is. My point is, this might be in reg.c, but my C skills are not good enough to see it in the code. Someone with a bit more experience should take a look...

MOD:

Ok, I think I found where the extra split rules are created in reg.c:

	/* Add max additional rules to accommodate weather radar band */
	if (reg_info->dfs_region == ATH11K_DFS_REG_ETSI)
		num_rules += 2;

and there are some extra elements in ath11k_reg_update_weather_radar_band to create the new rule boundaries.

The question is: why? I am not an expert on regulatory matters, but to my knowledge, if a band or channel overlaps the ETSI radar band, the only limitation is to do DFS with a CAC timeout of 600 seconds. As far as I know, the ETSI radar band has no further power limitation beyond what each country sets for the designated band. So there is really no reason for reg.c to split the 5470 - 5725 @ 160 range into "lower", "radar" and "upper" channels... It should just apply the 600 second CAC to the whole band and that should be it.
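
To make the difference concrete, here is a rough sketch of the two approaches. It is not the driver's code. The radar boundaries (5590 - 5650 MHz) and the 60 second default CAC are assumptions made for illustration, and all names are invented.

```python
RADAR_LOW, RADAR_HIGH = 5590, 5650       # assumed weather radar range, MHz
CAC_DEFAULT_MS = 60_000                  # assumed default CAC time
CAC_RADAR_MS = 600_000                   # the 600 second CAC discussed above


def split_rule(start, end):
    """Split [start, end] into up to three (start, end, cac_ms) rules."""
    if end <= RADAR_LOW or start >= RADAR_HIGH:
        return [(start, end, CAC_DEFAULT_MS)]      # no overlap, keep as is
    rules = []
    if start < RADAR_LOW:
        rules.append((start, RADAR_LOW, CAC_DEFAULT_MS))          # "lower"
    rules.append((max(start, RADAR_LOW), min(end, RADAR_HIGH), CAC_RADAR_MS))
    if end > RADAR_HIGH:
        rules.append((RADAR_HIGH, end, CAC_DEFAULT_MS))           # "upper"
    return rules


def whole_band_rule(start, end):
    """Alternative: keep one rule and apply the long CAC to all of it."""
    overlaps = start < RADAR_HIGH and end > RADAR_LOW
    return [(start, end, CAC_RADAR_MS if overlaps else CAC_DEFAULT_MS)]


def widest_mhz(rules):
    """Width of the widest single rule in MHz."""
    return max(end - start for start, end, _cac in rules)


if __name__ == "__main__":
    for name, fn in (("split", split_rule), ("whole", whole_band_rule)):
        rules = fn(5470, 5725)
        print(name, rules, "widest:", widest_mhz(rules), "MHz")
```

With these assumed boundaries, one rule becomes three, which lines up with the `num_rules += 2` above, and no single piece is 160 MHz wide any more. Whether that is really why the reported bandwidth changes is a guess on my part.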

Reduce CPU usage on WLAN -> router

btw, @robimarko, do you build the images on the /releases page with defconfig, or did you add some additional options to .config?

You can see what's inside here:

lol. I intentionally looked in .github and failed to see anything like a workflow YAML there.

(Although it looks like I looked in master instead of -backports :confused:)

Thanks!

I tested "ath11k: backport RX decap offload" but can't measure any difference in CPU usage.
In all firmware versions tested ("restart", "backports" and "castiel652:ath11k-decap"), with and without frame_mode=2, I am getting about 46% sirq usage at 790 Mbits/sec in an iperf3 upload from phone to LAN server.
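
For anyone wanting to reproduce the sirq figure without watching top, it can be derived from two snapshots of the aggregate "cpu" line in /proc/stat taken during the transfer. A minimal sketch follows, assuming the usual field order (user, nice, system, idle, iowait, irq, softirq, steal). The two sample lines are invented numbers, not measurements from this router.

```python
def softirq_percent(before, after):
    """Share of CPU time spent in softirq between two 'cpu' lines of /proc/stat."""
    a = [int(x) for x in before.split()[1:]]
    b = [int(x) for x in after.split()[1:]]
    delta = [y - x for x, y in zip(a, b)]
    # Fields 0..7: user nice system idle iowait irq softirq steal.
    # Guest time is already included in user, so it is left out of the total.
    total = sum(delta[:8])
    return 100.0 * delta[6] / total if total else 0.0


# Invented snapshots, for illustration only.
BEFORE = "cpu  10 0 20 100 0 0 30 0 0 0"
AFTER = "cpu  20 0 30 134 0 0 76 0 0 0"

if __name__ == "__main__":
    print(f"sirq: {softirq_percent(BEFORE, AFTER):.1f}%")
```

Using the per-core lines (cpu0, cpu1, ...) instead of the aggregate line would show whether a single core is the one saturating.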

And I still can't understand how, in the "backports" branch, this speed drops to 0 at about 3 meters from the router.

One of the major changes in the backport branch was the mac80211 update to 5.15 which could have something to do with it.

btw, @robimarko, what do you think: would it be worth reworking the uboot config a bit and merging both "rootfs" partitions together before merging IPQ...-backports back into the OpenWrt repo?

Anyway, I didn't find any way to make it boot from the other partition without access to the booted system or manual intervention through UART.
So this "duplication" seems pretty useless if switching the boot partition is only possible through UART or the booted system, but it "eats" 35M, which could be used to increase the space available for flashing a custom rootfs, leaving the remainder for overlay.

// btw, as far as I calculated, the overlay partition currently has a size of ~30M (0x01ec0000 == 32243712, and dividing that by 1024^2 gives 30.75), but df -h on the image from /releases says it is only 15.1M O_o
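
The conversion above is easy to double check. This tiny sketch turns a hexadecimal partition size into MiB; the helper name is made up.

```python
def hex_to_mib(size_hex):
    """Convert a hexadecimal byte count to MiB."""
    return int(size_hex, 16) / 1024**2


if __name__ == "__main__":
    size = "0x01ec0000"
    print(int(size, 16), "bytes =", hex_to_mib(size), "MiB")
```

It confirms the 30.75 figure, so the arithmetic itself is not where the missing space comes from.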

UPD: Btw, I've also summed all the sizes from /proc/mtd and the result is 112.75M, while the AX3600 spec page on the Wiki says there should be 256M.
So it seems too much space has disappeared to blame it on inaccuracies of any kind :man_shrugging:
(And even if there is a typo and the real flash size is 128M, then 15M (128-113) is still too big an amount (~10%) to blame on inaccuracies :man_shrugging:)
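
If someone wants to repeat the summation, here is a small sketch that parses /proc/mtd style text and adds up the sizes. It assumes the usual layout (device, hex size, hex erase size, quoted name). The sample partitions are placeholders, not the real AX3600 layout.

```python
import re

# One row of /proc/mtd: mtdN: <size hex> <erasesize hex> "<name>"
MTD_LINE = re.compile(r'^(mtd\d+):\s+([0-9a-fA-F]+)\s+([0-9a-fA-F]+)\s+"(.*)"$')


def mtd_sizes(text):
    """Return {device: (name, size_in_bytes)} parsed from /proc/mtd text."""
    sizes = {}
    for line in text.splitlines():
        m = MTD_LINE.match(line.strip())
        if m:                            # the header row does not match
            sizes[m.group(1)] = (m.group(4), int(m.group(2), 16))
    return sizes


def total_mib(text):
    """Sum of all partition sizes in MiB."""
    return sum(size for _name, size in mtd_sizes(text).values()) / 1024**2


# Placeholder partitions, for illustration only.
SAMPLE = '''\
dev:    size   erasesize  name
mtd0: 00100000 00020000 "part_a"
mtd1: 02000000 00020000 "part_b"
mtd2: 01ec0000 00020000 "part_c"
'''

if __name__ == "__main__":
    for dev, (name, size) in mtd_sizes(SAMPLE).items():
        print(f"{dev} {name}: {size / 1024**2:.2f} MiB")
    print(f"total: {total_mib(SAMPLE):.2f} MiB")
```

Pasting the real /proc/mtd output in place of SAMPLE lets others verify the 112.75M total.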