Have you tried with newer NSS firmware?
Yes, I tried with 11.4 originally; it crashes the NSS FW as soon as the test is started.
I tried for a couple of days and then gave up.
@robimarko did you merge the 605-qca-add-add-nss-bridge-mgr-support.patch into the wrong branch, restart instead of backports, or was it intentional?
I merged it there as the PR was made against that branch, I will cherry-pick it to the backport branch soon.
@Gingernut It's been cherry-picked and pushed.
My experience with @robimarko's IPQ807x-5.10-backports branch (512 MB profile enabled):
- IPv6 support is broken. Random reboots every 5-10 minutes. Not usable.
- 5G: 160 MHz does not work.
- 5G: 80 MHz works on channels 36-48 and 149 only. Every other channel does not work.
daemon.warn hostapd: wlan1: IEEE 802.11 Configured channel (52) or frequency (5260) (secondary_channel=1) not found from the channel list of the current mode (2) IEEE 802.11a. IEEE 802.11 Hardware does not support configured channel
- Ch. 149 is unreliable, ch. 36-48 seems to work fine.
- 2.4G: 20 MHz works fine.
- IoT radio not tested.
- Free memory never drops below 100M. If a memory leak is present, it is slow. The router can survive a few days.
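The hostapd warning above pairs channel 52 with 5260 MHz. As a minimal sketch, the standard 5 GHz mapping (frequency = 5000 + 5 × channel) can be checked like this. The `is_dfs_assumed` helper is hypothetical and only reflects the common case where channels 52-144 require DFS; the actual list depends on the regulatory domain, and nothing in this thread confirms that DFS is the cause of the failures.

```python
def channel_to_freq_mhz(channel: int) -> int:
    """Return the center frequency in MHz of a 5 GHz Wi-Fi channel."""
    if not 36 <= channel <= 177:
        raise ValueError(f"not a 5 GHz channel number: {channel}")
    return 5000 + 5 * channel


def is_dfs_assumed(channel: int) -> bool:
    """ASSUMPTION: channels 52-144 need DFS, as in many regulatory domains.

    The real answer comes from the regulatory database for your country.
    """
    return 52 <= channel <= 144


if __name__ == "__main__":
    for ch in (36, 52, 149):
        print(ch, channel_to_freq_mhz(ch), "DFS?" , is_dfs_assumed(ch))
```

Channel 52 maps to 5260 MHz, which matches the frequency in the log line, so the configuration itself is consistent; the driver simply does not list that channel as available.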
IPQ807x-5.10-backports branch: for me, the last two releases (from the past two days) cause a soft brick. The last working version was 23.10 (r0-3266578). Both sysupgrade (from r0-3266578) and ubiformat (from 1.0.17) have the same effect. Before flashing, both partitions had the same working version.
Please forgive my ignorance, but what does the nss-bridge-mgr patch actually do? Are you going to enable these kmods in your automated builds?
It fixes roaming between ports. Due to the QCA driver design, the kernel has no idea of the actual FDB, so they track it manually with the kernel module.
I haven't really decided whether to include it by default yet.
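To illustrate what "tracking the FDB" means in general, here is a toy forwarding database that maps MAC addresses to ports and notices when a client moves from one port to another. This is only a conceptual sketch: `ToyFdb` and its methods are hypothetical names, and it does not reflect how the NSS bridge manager module is actually implemented.

```python
from typing import Dict, Optional


class ToyFdb:
    """Toy forwarding database: MAC address -> port name (illustrative only)."""

    def __init__(self) -> None:
        self._table: Dict[str, str] = {}

    def learn(self, mac: str, port: str) -> bool:
        """Record that `mac` was seen on `port`.

        Returns True if the MAC was already known on a different port,
        i.e. the client roamed and the stale entry was replaced.
        """
        key = mac.lower()
        previous = self._table.get(key)
        self._table[key] = port
        return previous is not None and previous != port

    def lookup(self, mac: str) -> Optional[str]:
        """Return the port the MAC was last seen on, or None if unknown."""
        return self._table.get(mac.lower())


if __name__ == "__main__":
    fdb = ToyFdb()
    fdb.learn("AA:BB:CC:00:00:01", "lan1")
    roamed = fdb.learn("aa:bb:cc:00:00:01", "wlan0")
    print(roamed, fdb.lookup("AA:BB:CC:00:00:01"))
```

If the component that forwards packets never sees the second `learn` call, it keeps sending traffic to the old port, which is the kind of roaming breakage described above.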
@jesdoit That is really weird, do you have a log?
@robimarko I tried to sum up our findings under the GitHub issue, which has been open for a few months now. But I think it is time to write to the linux-wireless list as well. Kalle seems to be active there, as are other devs, so our best shot is probably to write a summary there.
The question is: do we want to try anything else before that? And a second question: do you want to write that mail, or will you leave it up to me?
I already sent an email inquiring about the regulatory problems, of course, no reply yet.
It would be great if you could sum up the ath11k issue and send that, as your comment from today looks perfect for an email there.
OK, then. Let me formalize a mail.
Same with me. The branch before the "512 mb profile enabled" one worked. Then I ran sysupgrade with the new "512 mb profile enabled" branch and it still doesn't work. The router keeps rebooting, and I have no way to access it via SSH or LuCI. What can I do? Reset with OEM 1.0.17 via TFTP?
@Baden Yes, TFTP lets you recover and repeat the procedure of enabling SSH, then flashing another version. I've done this many times with recent releases to try different builds or combinations and to be sure that I haven't missed something.
@robimarko I would gladly provide one, but I don't know if or how I can obtain it. After flashing and rebooting there is no network access (the DHCP server does not assign the client an address, and the router reboots after a while).
Same here, but when I try to revert to Xiaomi's 0.17, the blue LED keeps blinking forever, and after a reboot it gets stuck. Logs attached.

Factory fw upload:

Bytes transferred = 28312508 (1b003bc hex)
ipq807x_eth_halt: done
LoadAddr=44000000 NetBootFileXferSize= 1b003bc
CRC verify success!
RSA signature verify success!
Erasing NAND...
Erasing at 0x6e0000 -- 100% complete.
Writing to NAND... OK
Upgrade xiaoqiang_version...
Upgrade root.ubi...
--- xq_flash_erase
Erasing Nand...0x00a00000~+0x023c0000
Erasing at 0x2da0000 -- 100% complete.
--- xq_flash_erase
Erasing Nand...0x02dc0000~+0x023c0000
Erasing at 0x5160000 -- 100% complete.
common/proc_xqimage.c xqimage_upgrade 541 start:0x440002ac,subh->flash_addr:0xffffffff,len:0x1b00000
Erasing NAND...
Erasing at 0x6e0000 -- 100% complete.
Writing to NAND... OK
========Upgrade success!========
Erasing NAND...
Erasing at 0x6e0000 -- 100% complete.
Writing to NAND... OK

It gets stuck here. After manually unplugging and replugging the power:

Format: Log Type - Time(microsec) - Message - Optional Info
Log Type: B - Since Boot(Power On Reset), D - Delta, S - Statistic
S - QC_IMAGE_VERSION_STRING=BOOT.BF.3.3.1-00147
S - IMAGE_VARIANT_STRING=HAACANAZA
S - OEM_IMAGE_VERSION_STRING=CRM
S - Boot Config, 0x000002e5
B - 201 - PBL, Start
B - 2734 - bootable_media_detect_entry, Start
B - 3442 - bootable_media_detect_success, Start
B - 3446 - elf_loader_entry, Start
B - 6114 - auth_hash_seg_entry, Start
B - 6357 - auth_hash_seg_exit, Start
B - 68574 - elf_segs_hash_verify_entry, Start
B - 131269 - PBL, End
B - 143350 - SBL1, Start
B - 195627 - GCC [RstStat:0x10, RstDbg:0x600000] WDog Stat : 0x4
B - 202154 - pm_device_init, Start
B - 323452 - PM_SET_VAL:Skip
D - 120658 - pm_device_init, Delta
B - 325709 - pm_driver_init, Start
D - 5337 - pm_driver_init, Delta
B - 332236 - clock_init, Start
D - 2165 - clock_init, Delta
B - 336201 - boot_flash_init, Start
D - 11773 - boot_flash_init, Delta
B - 351756 - boot_config_data_table_init, Start
D - 3202 - boot_config_data_table_init, Delta - (575 Bytes)
B - 359259 - Boot Setting : 0x00000600
B - 363194 - CDT version:2,Platform ID:8,Major ID:1,Minor ID:0,Subtype:16
B - 370117 - sbl1_ddr_set_params, Start
B - 373960 - CPR configuration: 0x300
B - 377437 - cpr_init, Start
B - 380213 - Rail:0 Mode: 5 Voltage: 800000
B - 385367 - CL CPR settled at 752000mV
B - 388204 - Rail:1 Mode: 5 Voltage: 880000
B - 392382 - Rail:1 Mode: 7 Voltage: 896000
D - 16470 - cpr_init, Delta
B - 399275 - Pre_DDR_clock_init, Start
B - 403301 - Pre_DDR_clock_init, End
B - 406687 - DDR Type : PCDDR3
B - 412299 - do ddr sanity test, Start
D - 1067 - do ddr sanity test, Delta
B - 417118 - DDR: Start of HAL DDR Boot Training
B - 421876 - DDR: End of HAL DDR Boot Training
B - 427549 - DDR: Checksum to be stored on flash is 20926611
B - 437766 - Image Load, Start
D - 224328 - QSEE Image Loaded, Delta - (1373936 Bytes)
B - 662185 - Image Load, Start
D - 61 - SEC Image Loaded, Delta - (0 Bytes)
B - 669871 - Image Load, Start
D - 10706 - DEVCFG Image Loaded, Delta - (26004 Bytes)
B - 680668 - Image Load, Start
D - 25254 - RPM Image Loaded, Delta - (105964 Bytes)
B - 706014 - Image Load, Start
D - 96685 - APPSBL Image Loaded, Delta - (590068 Bytes)
B - 802851 - QSEE Execution, Start
D - 61 - QSEE Execution, Delta
B - 808646 - USB D+ check, Start
D - 0 - USB D+ check, Delta
B - 815051 - SBL1, End
D - 673989 - SBL1, Delta
S - Flash Throughput, 6749 KB/s (2096547 Bytes, 310637 us)
S - DDR Frequency, 466 MHz
S - Core 0 Frequency, 800 MHz

U-Boot 2016.01 (Feb 19 2020 - 10:39:20 +0000), Build: jenkins-r3600_ota_publish-25

DRAM: smem ram ptable found: ver: 1 len: 4
512 MiB
NAND: ONFI device found
ID = 1590aaef
Vendor = ef
Device = aa
SF: Unsupported flash IDs: manuf ff, jedec ffff, ext_jedec ffff
ipq_spi: SPI Flash not found (bus/cs/speed/mode) = (0/0/48000000/0)
256 MiB
MMC: sdhci: Node Not found, skipping initialization
PCI Link Intialized
PCI1 is not defined in the device tree
In: serial@78B3000
Out: serial@78B3000
Err: serial@78B3000
machid: 8010010
MMC Device 0 not found
eth5 MAC Address from ART is not valid
write phy_id=1, reg(0x8074):0x0670
write phy_id=2, reg(0x8074):0x0670
write phy_id=3, reg(0x8074):0x0670
write phy_id=4, reg(0x8074):0x0670
bootwait is on, bootdelay=5
Hit any key to stop autoboot: 0
Stays in loop...
Any idea how to recover from this state?
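The boot log above declares its own format (Log Type - Time(microsec) - Message - Optional Info, with types B, D and S). For anyone comparing several such captures, a rough sketch of a parser for those entries follows. The function names are hypothetical, and the assumption that S entries carry no timestamp is inferred only from the lines shown here.

```python
import re
from typing import List, Optional, Tuple

# One entry: type letter, optional microsecond timestamp, then the message.
# ASSUMPTION: "S" (statistic) entries have no timestamp, as in the log above.
ENTRY_RE = re.compile(r"^([BDS])\s+-\s+(?:(\d+)\s+-\s+)?(.*)$")

Entry = Tuple[str, Optional[int], str]


def parse_line(line: str) -> Optional[Entry]:
    """Parse one boot-log entry; return None for lines in another format."""
    match = ENTRY_RE.match(line.strip())
    if match is None:
        return None
    kind, time_us, message = match.groups()
    return kind, (int(time_us) if time_us is not None else None), message


def slowest_stages(lines: List[str], top: int = 3) -> List[Entry]:
    """Return the `top` delta (D) entries with the largest duration."""
    entries = [e for e in map(parse_line, lines) if e is not None]
    deltas = [e for e in entries if e[0] == "D" and e[1] is not None]
    return sorted(deltas, key=lambda e: e[1], reverse=True)[:top]


if __name__ == "__main__":
    sample = [
        "B - 201 - PBL, Start",
        "D - 120658 - pm_device_init, Delta",
        "D - 5337 - pm_driver_init, Delta",
        "S - DDR Frequency, 466 MHz",
    ]
    for entry in slowest_stages(sample):
        print(entry)
```

This only helps with timing comparisons between boots; it says nothing about why U-Boot loops after the autoboot countdown.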
I would try to boot an initramfs version of Robi's IPQ807x-5.10-backports
If that succeeds, you can write the prebuilt image as it was discussed many times in this topic.
Yeah, please just use initramfs for recovery, much safer than using U-boot.
I presume the bootloop you had is due to the stupid Xiaomi dual-boot implementation, in which it sometimes boots the old kernel with the new rootfs partition.
FYI, this is the memory utilization for a day on my AX6:
As you can see, it can get pretty low from time to time, and if you look at the times, it gets bad when everybody is sleeping. It can go well below 100MB, more like 40-50MB. At least no OOM (yet).
So it basically follows what we have figured out so far: it leaks really badly when there is no usage.
Yeah that is pretty much confirmed. And I am not sure if the 512MB patch actually solves anything, as it may be that we just gain enough time so it survives normal inactive periods. But I am not so sure if it would be the same if we would test this in a controlled environment with a client that can stay inactive for days...
I agree, it's just kind of masking the problem and limiting the leak size, nothing else.
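To put numbers on the idle-time leak discussed above, one option is to sample /proc/meminfo periodically and estimate how fast available memory shrinks. The sketch below parses meminfo-style text and computes a crude rate from the first and last samples. The helper names are hypothetical, and using `MemAvailable` as the metric is an assumption; the graphs in this thread may be based on a different counter.

```python
from typing import Dict, List, Tuple


def parse_meminfo(text: str) -> Dict[str, int]:
    """Parse /proc/meminfo-style text into {field: value}.

    Values are in kB for fields that carry a kB unit.
    """
    fields: Dict[str, int] = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def leak_rate_kb_per_hour(samples: List[Tuple[float, int]]) -> float:
    """Estimate memory loss in kB/hour from (unix_time_s, available_kb) samples.

    Crude first-to-last slope; a positive result means memory is shrinking.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    (t0, kb0), (t1, kb1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("samples must be in increasing time order")
    return (kb0 - kb1) / ((t1 - t0) / 3600.0)


if __name__ == "__main__":
    with open("/proc/meminfo") as handle:
        info = parse_meminfo(handle.read())
    print("MemAvailable (kB):", info.get("MemAvailable"))
```

Collecting samples only during periods with no clients connected would make it easier to tell whether the 512MB profile changes the leak rate or just delays the point where memory runs out.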