Asus RT-AC88U (HW: A6) broken in 22.03.3?

What I have found so far is that after adding "nvmem-cells" as in my message above AND disabling the data read in brcm_nvram_read (from linux-5.10.161/drivers/nvmem/brcm_nvram.c), the router boots. See the log: https://pastebin.com/mwb8jJqK

See https://pastebin.com/tspCRhrj for the dts file and the function code.

(I am not familiar with all this code nor with this build system, and patience is not one of my virtues, so I might be wrong.)

This is a good finding. Should help @rmilecki when they're available to fix this.

Another observation. Using

nvram@1c080000 {
    compatible = "brcm,env", "nvmem-cells"

The router boots but does not start ethernet. For a log see https://pastebin.com/FYNp7W23

@arinc9 @rmilecki

The problem seems to be that brcm_nvram_read accesses its (mapped) io memory. When doing so the correct data is read (it is the same as during init) but after that the mtd/ubi process fails to work.

The bcm47xx_nvram.c code has buffered the nvram data so the cells value can be read from there.
As there is currently no API for it (bcm47xx_nvram_getenv does a lookup by name and adds a \0)
I added bcm47xx_nvram_read and use that from brcm_nvram_read.

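/* Copy val_len bytes at offset from the buffered NVRAM copy into val. */
int bcm47xx_nvram_read(unsigned int offset, char *val, size_t val_len) {
	/* NVRAM has not been read into nvram_buf yet */
	if (!nvram_len)
		return -ENXIO;

	/* Reject out-of-range reads; written so offset + val_len cannot wrap */
	if (offset > nvram_len || val_len > nvram_len - offset)
		return -EINVAL;

	while (val_len--)
		*val++ = nvram_buf[offset++];

	return 0;
}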

I guess the problem has been there for a while but brcm_nvram_read was never called until the et1macaddr node was added to the dts.
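
For readers who do not want to open the pastebin, here is a rough userspace sketch of the idea: the NVMEM read callback hands the request to the buffered copy instead of touching the mapped flash. It is not the actual patch. The buffer contents and the MAC address are made up for illustration, and only the callback shape (priv, offset, val, bytes) follows the kernel's nvmem reg_read convention.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Stand-ins for the static state in bcm47xx_nvram.c (contents hypothetical). */
static char nvram_buf[64] = "et1macaddr=00:11:22:33:44:55";
static size_t nvram_len = sizeof(nvram_buf);

/* Same contract as the helper above: copy from the RAM copy, never from I/O. */
int bcm47xx_nvram_read(unsigned int offset, char *val, size_t val_len)
{
	if (!nvram_len)
		return -ENXIO;
	/* Bounds check arranged so that offset + val_len cannot wrap around. */
	if (offset > nvram_len || val_len > nvram_len - offset)
		return -EINVAL;
	memcpy(val, nvram_buf + offset, val_len);
	return 0;
}

/* Shaped like an nvmem reg_read callback; context is unused in this sketch. */
int brcm_nvram_read(void *context, unsigned int offset, void *val, size_t bytes)
{
	(void)context;
	return bcm47xx_nvram_read(offset, val, bytes);
}
```

In this sketch a cell at offset 11 with length 17 would yield the MAC string, and a read past the end of the buffer is rejected with -EINVAL rather than reaching the hardware.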

See https://pastebin.com/XwVa3W3u for what I did.

Thanks for doing this! Would you be willing to submit this to Linux mainline so Rafal can backport it to OpenWrt?

Uhh. I've never done something like that. Will take me some time.

Is there a way to test this on mainline?

You could compile mainline kernel with an initramfs built by buildroot to test it on the device but since the bootloader is not U-Boot, the final image would require a bit of changing, trx image format, lzma compression, appending the dtb, and whatnot.

Rule of thumb: if your fix to a specific driver works on the kernel OpenWrt uses (currently 5.15 on snapshot), and that driver is not too different from the one in mainline Linux (currently 6.2-rc6), it will likely work on mainline, so you can submit it.

I can send the patch on your behalf, crediting you as the author, if you'd like.

Gave it a try. See https://drive.google.com/drive/folders/1pYBOkIv3Dub-vV8YCmpvoC4-6XkXe0Uc
I am having a hard time coming up with descriptions and subject lines.
I hope you can find some time to look at them.

Everything looks perfectly fine to me, I can submit it as is if you're fine with it.

Thanks for looking into this. Please submit it.

It's up, I'm leaving dealing with the patch recommendations to you.

Thanks. Can I now send v3 and up myself or is it expected to all go through you now?

No, you should send it yourself. I figure you learned to set up [sendemail] in .gitconfig, format patches with git format-patch, and send patches with git send-email? I have a page here with a template and instructions on how to send the mail. You can compose a cover mail before the patches with git send-email --compose.

https://arinc9.com/Submit-Patch-Template-c6c8026463ee49008f1b08b74e087016

Don't forget to add me to the CC!

I sent a mail but your mail provider thought it was spam for some reason. Here it is:

Willem-Jan de Hoog (2):
The bcm47xx code makes a copy of the NVRAM data in ram. Allow access
to this data so property values can be read using nvmem cell api.
The bcm47xx module has a copy of the NVRAM data in ram. When available
use this one instead of reading from io memory since it causes
mtd/ubi to fail.

I believe you're supposed to put the subject of the patches here, keep that in mind for future contributions. No need to resubmit as this is just composition.

I don't get it. Your 'response' email is on the kernel mailing lists but my patches do not appear at https://lkml.org/lkml/2023/2/7

I had this issue too with my mail provider Zoho. They add this "Delivered-To:" tag if a mail is also CC'd to myself. It's likely your provider does something to the outgoing mail that the mailing list doesn't like. Just for that reason I'm sending my patches from my gmail email address until I migrate to a proper mail provider.

You can add --from original@mailaddress.com to git send-email to keep the original author email address when you're sending through Gmail.

You need to create an app password on Google to set up git send-email.

Taking a step back, this was probably tricky to debug because this issue (apparently) involves reading NVRAM data using NVMEM interface. For that to happen you need both:

  1. DT changes adding NVMEM cell and reference to it
  2. NVRAM NVMEM-based-driver change adding support for cells

That explains why 2 different commits were pointed to while debugging this.

Now, the real question (still unanswered, I believe) is why reading NVRAM using the NVMEM-based driver (drivers/nvmem/brcm_nvram.c) breaks something.

Sure, you can work around the issue by stopping the NVMEM-based driver from doing mem-based I/O, but that is not a real solution. It's a workaround that fetches the data through a different interface (I guess you make it use the mtd subsystem).

I don't have any real idea why this may be happening.

Because of this comment I assumed the copy in RAM was meant to always be used. Also, bcm47xx_nvram_getenv seems to use it. That is why I proposed a read function.

Do you have any tips on where I can start looking to find what is causing this problem?

I kept thinking about this problem.

I based all my brcm,nvram development & testing on devices with serial flash used for bootloader & NVRAM. I never tried any device with NVRAM in NAND.

FWIW:
Serial flash is mapped at 0x1e000000
NAND flash is mapped at 0x1c000000

I think there may be some unexpected NAND controller behaviour caused by using NAND flash content mapping. My guess is that reading NAND content using mem IO somehow affects NAND controller state.

I'll try to contact Broadcom / NAND controller people to see if they can help.