[Solved] Zyxel NBG6817 flashing from OEM

Please post the output of ./scripts/diffconfig.sh. Also, are you building from git HEAD or a stable branch? HEAD is preferred.

Here is the output of the script:

CONFIG_TARGET_ipq806x=y
CONFIG_TARGET_ipq806x_DEVICE_NBG6817=y
CONFIG_TARGET_BOARD="ipq806x"
# CONFIG_DRIVER_11AC_SUPPORT is not set
CONFIG_IB=y
CONFIG_IB_STANDALONE=y
# CONFIG_PACKAGE_ath10k-firmware-qca9984 is not set
CONFIG_PACKAGE_block-mount=y
CONFIG_PACKAGE_blockd=y
# CONFIG_PACKAGE_e2fsprogs is not set
# CONFIG_PACKAGE_kmod-ata-core is not set
# CONFIG_PACKAGE_kmod-ath10k is not set
CONFIG_PACKAGE_kmod-ath9k=y
CONFIG_PACKAGE_kmod-ath9k-common=y
CONFIG_PACKAGE_kmod-crypto-crc32=y
CONFIG_PACKAGE_kmod-crypto-crc32c=y
CONFIG_PACKAGE_kmod-fs-autofs4=y
CONFIG_PACKAGE_kmod-fs-ext4=y
CONFIG_PACKAGE_kmod-fs-f2fs=y
# CONFIG_PACKAGE_kmod-leds-gpio is not set
CONFIG_PACKAGE_kmod-lib-crc16=y
# CONFIG_PACKAGE_kmod-scsi-core is not set
# CONFIG_PACKAGE_kmod-usb3 is not set
# CONFIG_PACKAGE_libblkid is not set
# CONFIG_PACKAGE_libext2fs is not set
# CONFIG_PACKAGE_librt is not set
# CONFIG_PACKAGE_libsmartcols is not set
# CONFIG_PACKAGE_libuuid is not set
# CONFIG_PACKAGE_losetup is not set

This, when building git HEAD from source (I've never tried imagebuilder), should result in a fully functioning nbg6817:

### Use "make defconfig oldconfig" to expand this to a full .config

CONFIG_TARGET_ipq806x=y
CONFIG_TARGET_DEVICE_ipq806x_DEVICE_NBG6817=y
CONFIG_TARGET_DEVICE_PACKAGES_ipq806x_DEVICE_NBG6817=""

### SMP CPUs need irqbalance, invoke irqbalance from /etc/rc.local
CONFIG_PACKAGE_irqbalance=y

### Enable per device rootfs
CONFIG_TARGET_MULTI_PROFILE=y
CONFIG_TARGET_PER_DEVICE_ROOTFS=y

### USB device mount & filesystem support
CONFIG_PACKAGE_block-mount=y
CONFIG_PACKAGE_blockd=y
CONFIG_PACKAGE_kmod-usb-storage=y
CONFIG_PACKAGE_kmod-fs-ext4=y
CONFIG_PACKAGE_kmod-nls-utf8=y
CONFIG_PACKAGE_kmod-usb2=y

### Luci (SSL)
CONFIG_PACKAGE_luci-ssl=y
CONFIG_PACKAGE_luci-app-uhttpd=y

Obviously irqbalance, kmod-usb-storage, kmod-usb2, luci-ssl and luci-app-uhttpd are optional. I am not 100% sure if block-mount and blockd are hard requirements for the nbg6817, given that the rootfs is already referenced from the cmdline (root=/dev/mmcblk0p5) - my hunch would tend towards 'no' - but this would require checking, before submitting the corresponding addition to the default device packages (which would fix this issue once and for all).
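
When comparing diffconfigs like the ones posted above, it can help to reduce them to plain package names first. The following is only a minimal sketch; the function names are made up for illustration, and it only looks at "=y" and "is not set" lines (packages built as modules with "=m" are ignored).

```shell
#!/bin/sh
# Sketch: reduce the output of ./scripts/diffconfig.sh (read from stdin)
# to package names. Function names are hypothetical helpers.

# One package name per line for every "CONFIG_PACKAGE_<name>=y" entry.
enabled_packages() {
    sed -n 's/^CONFIG_PACKAGE_\(.*\)=y$/\1/p'
}

# One package name per line for every "# CONFIG_PACKAGE_<name> is not set" entry.
disabled_packages() {
    sed -n 's/^# CONFIG_PACKAGE_\(.*\) is not set$/\1/p'
}
```

Typical use from the source tree would be "./scripts/diffconfig.sh | enabled_packages", which makes it easier to see at a glance whether something like kmod-fs-ext4 is actually selected.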

Solved! Thanks a lot. The system works perfectly with your diff.

Christoph

Do both USB ports (2.0 and 3.0) work fine with the kmod-usb2 module, or is kmod-usb3 needed too?

Just managed to debrick my NBG6818 after an unsuccessful flash and bootloop, using the TFTP method. With the WPS button pressed after power-on, the router looks for ras.bin on 192.168.1.99.
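
For anyone repeating this, a rough sketch of the PC-side preparation follows. It assumes the router requests the file from a TFTP server at 192.168.1.99, so the PC needs that address and a TFTP server pointed at a directory containing ras.bin. The function name and both paths are placeholders, and which image to serve is not stated in the post.

```shell
#!/bin/sh
# Sketch: stage a firmware image under the file name the bootloader asks for.
# stage_recovery_image <image> <tftp-root>   (hypothetical helper, placeholder paths)
stage_recovery_image() {
    image=$1
    root=$2
    # Refuse to continue if the image file does not exist.
    [ -f "$image" ] || { echo "no such image: $image" >&2; return 1; }
    # Create the TFTP root if needed and copy the image in as ras.bin.
    mkdir -p "$root" && cp "$image" "$root/ras.bin"
}
```

After staging the file, the PC's wired interface would be given the address 192.168.1.99 (for example with "ip addr add 192.168.1.99/24 dev eth0", where eth0 is an assumed interface name) and any TFTP server can then serve that directory.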

Thanks for documenting that. There must also be a way to toggle the boot order (the router has dual boot functionality, even if LEDE can currently only use the first set of partitions).

Need your help once more... I am trying to use the 3GB /dev/mmcblk0p10 partition as /overlay, but without success. I have included the block-mount package in the image and edited fstab.conf (I also tried using luci). /dev/mmcblk0p10 is visible, and I can "enable" it and define the mount point as /overlay, but /overlay stays mounted on /dev/loop0.
I can mount /dev/mmcblk0p10 on /overlay using "mount /dev/mmcblk0p10 /overlay", but it does not survive reboots.

The easiest approach would probably be to configure an extroot on /dev/mmcblk0p10.
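
As a rough illustration of what such an extroot entry could look like, here is a small sketch that prints a mount section for /etc/config/fstab. The helper name is hypothetical. It assumes the partition already carries a filesystem the image can mount (e.g. ext4 with kmod-fs-ext4 installed) and that block-mount is present; the option names follow the usual fstab config layout as I understand it.

```shell
#!/bin/sh
# Sketch: print a "config mount" section that uses a block device as extroot.
# extroot_stanza <device> [target]   (hypothetical helper; target defaults to /overlay)
extroot_stanza() {
    device=$1
    target=${2:-/overlay}
    printf "config mount\n"
    printf "\toption target '%s'\n" "$target"
    printf "\toption device '%s'\n" "$device"
    printf "\toption enabled '1'\n"
}
```

Something like "extroot_stanza /dev/mmcblk0p10 >> /etc/config/fstab" followed by a reboot would be the idea; afterwards "mount | grep overlay" shows whether /overlay is still on /dev/loop0.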

Just to clarify, I do know how to toggle the bootflag from a running/accessible system, but there must be a way to toggle it (via button presses?) in the non-booting case.

Just flashed my Zyxel NBG6817 as well; it was on the latest ABCS.7. First I flashed 17.01.4, which led to the issues listed in this thread. Afterwards, I built from the latest source using only the following flags:

CONFIG_TARGET_ipq806x=y
CONFIG_TARGET_ipq806x_DEVICE_NBG6817=y
CONFIG_TARGET_BOARD="ipq806x"
CONFIG_PACKAGE_kmod-fs-autofs4=y
CONFIG_PACKAGE_kmod-fs-ext4=y
CONFIG_PACKAGE_kmod-nls-utf8=y
CONFIG_PACKAGE_kmod-usb-storage=y

blockd and block-mount don't seem to be hard requirements. Also, in the latest snapshot, kmod-usb2 seems to be activated by default for this device. Settings are saved across reboots. Wireless 2.4GHz and 5GHz also working.

A few things I've noticed so far: I get "Direct firmware load for ath10k/QCA9984/hw1.0/firmware-6.bin failed with error -2". It seems to be WARNING level. Also, the front LEDs for 2.4GHz and 5GHz are amber instead of white (as on the original firmware). The amber color bothers me, but I couldn't find a way to change them to white.

It transparently falls back to /lib/firmware/ath10k/QCA9984/hw1.0/firmware-5.bin

So I wasn't hallucinating. I have never been sure whether they were white or amber in the OEM firmware, but I never bothered to reboot into the OEM firmware for confirmation (I've only run it for a few minutes to check that the hardware was functional). It shouldn't be too difficult to change the LED colour: you probably just need to determine the correct GPIO and then change the device tree file accordingly. With a little luck, the correct GPIO might already be revealed somewhere in the OEM firmware.

It might be interesting to check "cat /sys/kernel/debug/gpio" and compare the output between the OEM firmware (with white LEDs lit up) and LEDE (amber). Potential candidates for closer examination could be:

gpio2   : in  0 2mA pull down
gpio4   : in  0 2mA pull down
gpio5   : in  0 2mA pull down
gpio7   : in  0 2mA pull down
gpio8   : in  0 2mA pull down
gpio10  : out 1 12mA no pull
gpio11  : out 1 12mA no pull
gpio22  : in  0 2mA pull down
gpio23  : in  0 2mA pull down
gpio24  : in  0 2mA pull down
gpio25  : in  0 2mA pull down
gpio34  : in  0 2mA pull down
gpio35  : in  0 2mA pull down
gpio36  : in  0 2mA pull down
gpio37  : in  0 2mA pull down
gpio38  : in  2 10mA pull up
gpio39  : in  2 10mA pull up
gpio40  : in  2 10mA pull up
gpio41  : in  2 10mA pull up
gpio42  : in  2 16mA no pull
gpio43  : in  2 10mA pull up
gpio44  : in  2 10mA pull up
gpio45  : in  2 10mA pull up
gpio46  : in  2 10mA pull up
gpio47  : in  2 10mA pull up
gpio49  : in  0 2mA pull down
gpio50  : in  0 2mA pull down
gpio53  : in  0 2mA pull down
gpio55  : in  0 2mA pull down
gpio56  : in  0 2mA pull down
gpio57  : in  0 2mA pull down
gpio58  : in  0 2mA pull down
gpio63  : in  0 2mA pull down
gpio66  : in  0 2mA pull down
gpio67  : in  0 2mA pull down
gpio68  : in  0 2mA pull down

Interesting approach, I will track it down!

Unfortunately, the latest snapshot didn't run well on my Zyxel. As soon as it replaced my main router, WiFi crashed after a few hours of operation. I can't say anything too specific, because the log size wasn't big enough, but it looked like a kernel oops caused by WiFi. Either way, I've set up my old router again and now have plenty of time to track down issues on the Zyxel.

Comparing the output between LED on (wlan configured) and LED off (wlan off) using the OEM firmware should be even easier to check.
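
A minimal sketch of such a comparison: save one dump per state with "cat /sys/kernel/debug/gpio > file" and print only the lines that differ. The function name is made up, and the dumps can be copied to a PC if the firmware lacks a suitable grep.

```shell
#!/bin/sh
# Sketch: show GPIO lines that differ between two saved dumps.
# gpio_changes <before-file> <after-file>   (hypothetical helper)
gpio_changes() {
    before=$1
    after=$2
    # Lines present only in the first dump, prefixed with "- ".
    # grep exits non-zero when nothing differs; that is not an error here.
    grep -vxFf "$after" "$before" | sed 's/^/- /' || true
    # Lines present only in the second dump, prefixed with "+ ".
    grep -vxFf "$before" "$after" | sed 's/^/+ /' || true
}
```

For example, "gpio_changes led_off.txt led_on.txt" after toggling wlan would list any GPIO whose direction or level changed. If nothing is printed, the LED is probably not driven by a GPIO that shows up in that file.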

cat /sys/kernel/debug/gpio on the OEM firmware didn't list all GPIOs the way LEDE does. Whether the white LEDs were turned on or off, the GPIO output stayed the same, so they don't seem to be listed there. I activated the amber LED using the OEM led_ctrl utility (just a script for /sys/class/leds) and I was indeed getting the white and amber LEDs lit at the same time. Unfortunately, /sys/class/leds only lists the amber LED; no trace of the white LED so far.

 gpio-0   (mdio                ) in  hi
 gpio-1   (mdc                 ) out lo
 gpio-3   (rst_n               ) out hi
 gpio-9   (POWER               ) out hi
 gpio-26  (WiFi_5G             ) out lo
 gpio-33  (WiFi_2G             ) out lo
 gpio-38  (sdc1_dat_7          ) in  hi
 gpio-39  (sdc1_dat_6          ) in  hi
 gpio-40  (sdc1_dat_3          ) in  hi
 gpio-41  (sdc1_dat_2          ) in  hi
 gpio-42  (sdc1_clk            ) in  lo
 gpio-43  (sdc1_dat_1          ) in  hi
 gpio-44  (sdc1_dat_0          ) in  hi
 gpio-45  (sdc1_cmd            ) in  hi
 gpio-46  (sdc1_dat_5          ) in  hi
 gpio-47  (sdc1_dat_4          ) in  hi
 gpio-48  (rst_n               ) out hi
 gpio-53  (WLAN_DISABLE        ) in  lo
 gpio-54  (RESET               ) in  hi
 gpio-61  (UHS_mode            ) in  lo
 gpio-64  (INTERNET            ) out lo
 gpio-65  (WPS                 ) in  hi

Sidenote: WLAN_DISABLE is on gpio-53 (I've verified it on OEM), as opposed to the current LEDE implementation, which uses gpio-6. Indeed, the WLAN_DISABLE button doesn't work on current LEDE. Easy fix, I guess.

//Edit: rmmod leds-gpio on the OEM firmware disabled the POWER and INTERNET LEDs, but not the 2.4G and 5G LEDs.

//Edit 2: the white WiFi LEDs on the OEM firmware are controlled by the zyxel_led_ctrl utility, which uses iwpriv wifi[0|1] gpio_config 17 <status> 0 0, where status = 1 means disabled and 0 means enabled. wifi0 is 5G and wifi1 is 2.4G.
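
To make the mapping explicit, here is a small sketch that only prints the corresponding command for the OEM firmware; it does not run anything. The helper name and its "5g"/"2g" and "on"/"off" arguments are invented for illustration, while the iwpriv arguments are taken from the description above.

```shell
#!/bin/sh
# Sketch: build the OEM-firmware iwpriv command for the white WiFi LEDs.
# Per the post: wifi0 is 5G, wifi1 is 2.4G; status 0 = enabled, 1 = disabled.
# white_led_cmd <5g|2g> <on|off>   (hypothetical helper, prints only)
white_led_cmd() {
    band=$1
    state=$2
    case $band in
        5g) iface=wifi0 ;;
        2g) iface=wifi1 ;;
        *) echo "unknown band: $band" >&2; return 1 ;;
    esac
    case $state in
        on) status=0 ;;
        off) status=1 ;;
        *) echo "unknown state: $state" >&2; return 1 ;;
    esac
    echo "iwpriv $iface gpio_config 17 $status 0 0"
}
```

Calling "white_led_cmd 5g off" prints "iwpriv wifi0 gpio_config 17 1 0 0", which can then be pasted into a shell on the OEM firmware.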

//Edit 3: Okay, pinned down the issue. It's the same issue that also prevents the Netgear R7800 from using its native WiFi LEDs (the R7800 users are currently using other, rarely used LEDs instead). The white 5G and 2.4G LEDs are not controlled by SoC GPIOs, but by Qualcomm Atheros QCA9984 GPIOs (phy0 and phy1). On both devices, the Zyxel NBG6817 and the Netgear R7800, GPIO 17 of each PHY is used, so fixing one will also fix the other. From what I've seen so far, it looks like this needs to be implemented in ath10k. I haven't found a way to make any ath10k GPIO writable from userspace so far, although that would solve our issue. I got my information from the Netgear R7800 OEM firmware source and the Zyxel OEM script.

That explains the situation - and puts the nbg6817 into a comparatively good situation (LEDs are usable, 'just' the wrong colour).

I've received a very interesting email from a TP-Link Archer C2600 user (nwfilardo on the ath10k mailing list), featuring a method to toggle the LEDs on the Qualcomm QCA9980, which also works on the Qualcomm QCA9984 (NBG6817, R7800).

QCA9984 is connected via PCI, so he used a tool to gain read / write access to PCI memory registers. In Archer C2600 u-boot source code, it was revealed that GPIO 17 can be activated by setting Bit 17 on address 0x85018, which makes address 0x85000 Bit 17 usable as active-low controller for driving the LEDs.
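
The bit arithmetic behind this can be sketched as follows. This is only the read-modify-write math, with invented helper names; it does not touch any hardware, and reading or writing the actual registers still needs a tool such as pcimem. Reading "active-low" as "bit cleared means LED on" is my interpretation.

```shell
#!/bin/sh
# Sketch: bit arithmetic for GPIO 17 as described in the post.
# Helper names are hypothetical; values are register contents, not addresses.
GPIO_BIT=17

# Mask with only bit 17 set.
gpio_mask() {
    printf '0x%x\n' $(( 1 << $GPIO_BIT ))
}

# set_bit <value>: value with bit 17 set (e.g. the new content for 0x85018).
set_bit() {
    printf '0x%08x\n' $(( $1 | (1 << $GPIO_BIT) ))
}

# clear_bit <value>: value with bit 17 cleared (e.g. the new content for 0x85000).
clear_bit() {
    printf '0x%08x\n' $(( $1 & ~(1 << $GPIO_BIT) ))
}
```

The idea is to read the current 32-bit value, pass it through set_bit or clear_bit, and write the result back, so that the other bits in the register stay untouched.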

I haven't had a lot of time to look into ath10k so far, but it looks like ath10k_pci_reg_write32 or at least ath10k_pci_write32 should do the job for exposing the LEDs to sysfs. The TP-Link C2600 and Netgear R7800 communities are probably interested in this as well.

nwfilardo posted a bash script to toggle LEDs (pcimem required), which you can try out, if you're interested: https://www.mail-archive.com/ath10k@lists.infradead.org/msg07443.html.

These pull requests should improve support for the NBG6817.

I have tried using fstab to mount /dev/mmcblk0p10 on /overlay or on / during boot (pivot overlay and pivot root), but neither succeeded. Both fall back to /dev/loop0.
It's possible to mount it by typing mount /dev/mmcblk0p10 /overlay, but as Kostja mentioned, this won't survive a reboot.
Any help or ideas will be appreciated.

@mushr00m as stated by a few posters, installing kmod-fs-ext4 fixes this issue. @slh already posted a patch, which will hopefully get merged soon, so this will be an issue of the past. Until then, you need to dive in and build your own image, e.g. using Image Builder, integrating the ext4 drivers. While you're at it, you may add other useful packages as well. Please note that Image Builder doesn't include luci by default. A few suggested configs are listed in this thread.

The latest LEDE snapshot still crashes after a few hours of operation, even with a custom board-2.bin. I'm not sure if this is a temperature issue or something else, as I've found nothing suspicious in the logs; it just reboots after a few hours.

Other than WiFi, this thing is a beast. Using SQM (layer_cake.qos), my bufferbloat went down to 7ms - 14ms on a 100/40 line. After enabling irqbalance (banned IRQ 99/eth0 & 100/eth1 and manually pinned them to CPU0 & CPU1), bufferbloat went down to very impressive 2ms - 7ms (!).
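
For reference, manual pinning of this kind usually goes through /proc/irq/<n>/smp_affinity, which takes a hex CPU bitmask. The sketch below uses invented helper names and an optional root argument (only there so it can be tried outside /proc). How the IRQs were banned in irqbalance is not shown in the post, so that part is left out.

```shell
#!/bin/sh
# Sketch: pin an IRQ to a single CPU via /proc/irq/<n>/smp_affinity.

# cpu_mask <cpu>: hex bitmask with only that CPU's bit set (CPU0 -> 1, CPU1 -> 2).
cpu_mask() {
    printf '%x\n' $(( 1 << $1 ))
}

# pin_irq <irq> <cpu> [proc-root]: write the mask; proc-root defaults to /proc.
pin_irq() {
    root=${3:-/proc}
    cpu_mask "$2" > "$root/irq/$1/smp_affinity"
}
```

With the IRQ numbers from the post, this would be "pin_irq 99 0" for eth0 and "pin_irq 100 1" for eth1, for instance from /etc/rc.local. IRQ numbers can differ between builds, so check /proc/interrupts first.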

I can also confirm that the method posted by @Kostja_V works for recovery, as well as for overwriting LEDE with the OEM firmware.

To have fewer issues with flashing LEDE, I think it should be best practice to run printf "\xff" >/dev/mtdblock6 on the OEM firmware, reboot and then flash LEDE. This makes sure you're on the correct partition, which eliminates the "I've flashed LEDE, but I'm still booting into OEM" issue. Also, if LEDE has issues booting after the initial flash (e.g. a bootloop), try holding RESET for 10 - 15 seconds. This fixed a bootloop issue for me.
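
Before and after overwriting the flag, it may be worth looking at the byte that is actually stored. This is a minimal sketch with a made-up helper name, assuming od is available (hexdump would be an alternative). What values other than 0xff mean is not covered in this thread.

```shell
#!/bin/sh
# Sketch: print the first byte of a file or block device as two hex digits.
# first_byte <path>   (hypothetical helper; read-only, does not modify anything)
first_byte() {
    od -An -tx1 -N1 "$1" | tr -d ' \n'
}
```

Running "first_byte /dev/mtdblock6" should print "ff" once the printf command above has been applied.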