Add support for Linksys EA6350 v3

jeff · March 20, 2019, 2:14pm

Thanks, yes, that was what I was hoping/expecting to see.

The boot loader writes a new "record" on each boot consisting of

11 08 11 20 -- "magic number" (marker)
01 00 00 00 -- boot count
12 08 11 20 -- checksum
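To make the layout concrete, here is a minimal sketch that decodes one such record from its hex bytes. It assumes the fields are little-endian 32-bit words and that the checksum is simply magic + count; that rule is inferred from the sample above (0x20110812 = 0x20110811 + 1) and is not confirmed against the boot loader source. The helper names `le32` and `decode_record` are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: four little-endian hex bytes -> decimal value.
le32() {
    echo $((0x$4$3$2$1))
}

# Decode one record passed as 12 hex bytes (magic, count, checksum).
# Assumption: checksum == magic + count, inferred from the sample record.
decode_record() {
    magic=$(le32 "$1" "$2" "$3" "$4")
    count=$(le32 "$5" "$6" "$7" "$8")
    sum=$(le32 "$9" "${10}" "${11}" "${12}")
    if [ "$magic" -eq $((0x20110811)) ] && [ "$sum" -eq $((magic + count)) ]; then
        echo "count=$count valid=yes"
    else
        echo "count=$count valid=no"
    fi
}

# The record shown above.
decode_record 11 08 11 20 01 00 00 00 12 08 11 20
```

Under that assumed rule, a record with count 0 carries a checksum identical to the magic number, which matters for anything that scans the partition for the marker.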

From what I can tell on my EA8300, if the boot count is 3, the boot loader rewrites the environment and sets the next boot count to 1. I haven't confirmed it, but this should be what "flips" the partition on a boot-loop situation.

On "successful" boot, the OS is responsible for writing a "0" record so that the boot loader knows the boot went well.

The record is not overwritten, but appended. This reduces flash wear and is perhaps also the reason that boot_count from the environment is not used (I've never seen anything but 0 in there, and 3 in boot_part_ready). When the partition is "full", it gets erased and starts over -- only the last-written record has significant value.
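The append-only scheme can be sketched as a scan that walks the partition one record at a time and remembers the last slot that begins with the magic marker. This is an illustration only, not the code in linksys_bootcount.c. The names `mkrecord` and `last_bootcount` are hypothetical, the test image is a temporary file rather than a real MTD device, and the 0xff padding that fills each 16-byte slot is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: emit one 16-byte record (magic, count, checksum, padding).
# Assumptions: checksum = magic + count; the last four bytes are 0xff padding.
mkrecord() {
    c=$(printf '%03o' "$1")
    s=$(printf '%03o' $((0x11 + $1)))   # low byte of the checksum
    printf "\021\010\021\040\\${c}\000\000\000\\${s}\010\021\040\377\377\377\377"
}

# Print the low count byte (two hex digits) of the last record in FILE,
# stepping SIZE bytes per record; prints "none" if no record is found.
last_bootcount() {
    od -An -v -tx1 "$1" | tr -s ' \n' '\n\n' | awk -v size="$2" '
        NF == 0 { next }
        { b[n++] = $1 }
        END {
            last = "none"
            for (off = 0; off + 8 <= n; off += size)
                if (b[off] == "11" && b[off+1] == "08" && b[off+2] == "11" && b[off+3] == "20")
                    last = b[off+4]
            print last
        }'
}

# Simulate four appended records: boot, success, boot, second boot without success.
img=$(mktemp)
{ mkrecord 1; mkrecord 0; mkrecord 1; mkrecord 2; } > "$img"
last_bootcount "$img" 16
rm -f "$img"
```

Only the final record decides the outcome, so the earlier entries serve as nothing more than a history of past boots.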

The "challenge" is that NAND flash typically has a minimum write size, 2048 bytes, for example. So on a NAND-based Linksys device, these boot-count records are 2048 bytes long. When the Linksys devices were ported to the IPQ40xx platform, someone noticed that the boot loader was writing 16-byte records, not the 1-byte record size that the MTD parameters show, and hard-coded 16 for the platform in general. As a result, the "pure NAND" EA8300 had some pretty strange boot behavior because the code to reset the boot count was silently failing.

I've since rewritten package/system/mtd/src/linksys_bootcount.c to:

Actually log success and failure
Auto-detect if the record size should be 16 or the MTD parameter (EA6350v3 being the only exception I have found)
Return a meaningful error value (and update init scripts to ignore it, so as not to stop boot)
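One way to see which record size a given device uses is to measure the distance between the first two magic markers in a dump of the partition. The sketch below does that on a synthetic image. It is not the detection logic from the rewritten linksys_bootcount.c, which is not shown in this post. The names `mkrecord_padded` and `detect_stride` are hypothetical, and the checksum rule (magic + count) and 0xff padding are assumptions. Only 16-byte-aligned offsets are inspected, because under that checksum rule a count-0 record repeats the magic bytes at offset 8.

```shell
#!/bin/sh
# Hypothetical helper: emit one record padded with 0xff up to STRIDE bytes.
# Usage: mkrecord_padded COUNT STRIDE
mkrecord_padded() {
    c=$(printf '%03o' "$1")
    s=$(printf '%03o' $((0x11 + $1)))   # low byte of the assumed checksum
    printf "\021\010\021\040\\${c}\000\000\000\\${s}\010\021\040"
    head -c $(( $2 - 12 )) /dev/zero | tr '\000' '\377'
}

# Print the byte distance between the first two magic markers in FILE,
# or "unknown" if fewer than two records are present.
detect_stride() {
    od -An -v -tx1 "$1" | tr -s ' \n' '\n\n' | awk '
        NF == 0 { next }
        { b[n++] = $1 }
        END {
            first = -1
            for (i = 0; i + 3 < n; i += 16) {
                if ((b[i] b[i+1] b[i+2] b[i+3]) != "11081120") continue
                if (first < 0) { first = i; continue }
                print i - first
                exit
            }
            print "unknown"
        }'
}

# A NAND-style image with 2048-byte records, then a 16-byte-record image.
img=$(mktemp)
{ mkrecord_padded 1 2048; mkrecord_padded 0 2048; } > "$img"
detect_stride "$img"
{ mkrecord_padded 0 16; mkrecord_padded 1 16; } > "$img"
detect_stride "$img"
rm -f "$img"
```

A freshly erased partition with fewer than two records gives no answer with this approach, so a real tool would still need a fallback such as the MTD write size.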

I'll make sure it's clean and put in a PR/patch as it impacts more than just the EA8300. If you have a chance to check it, that would be great!

If you'd like to amuse yourself, here's what I run from /etc/profile to "see" this in action. Yes, my s_env partition "rolls" once a week or so. Yes, it's an ugly script, but cut-and-paste was the fastest approach as what was intended to be a diagnostic tool evolved.

root@OpenWrt:~#  boot-info.sh 
rootfs: mtd13
boot_part=2
boot_count=0
89 bootcount entries. Last ten: 02 00 01 00 01 00 01 00 01 00

#!/bin/sh
# Diagnostic: show the active rootfs MTD, the boot environment, and the bootcount log.

printf "rootfs: mtd%i\n" "$( cat /sys/devices/virtual/ubi/ubi0/mtd_num )"
fw_printenv boot_part
fw_printenv boot_count
# Each record starts with the magic 11 08 11 20; the sed expression keeps the byte that follows it.
printf "%i bootcount entries. Last ten: " "$( hexdump -C /dev/mtd8ro | sed -nEe 's/^[0-9a-f]+00  11 08 11 20 ([0-9a-f][0-9a-f]).*$/\1/p' | wc -l )"
hexdump -C /dev/mtd8ro | sed -nEe 's/^[0-9a-f]+00  11 08 11 20 ([0-9a-f][0-9a-f]).*$/\1/p' \
    | tail -n 10 | tr '\n' ' '
printf "\n"