Cisco/Technicolor DPC3848VE Linux

Hi, I have one of these cable modems lying around here, and I wanted to have a second attempt at hacking it and making it run my own Linux.
In short, my device doesn't have secureboot enabled (i.e. boot mode = unsecure), which lets me load a custom kernel from the internal eMMC/ymodem/tftp.
There are some kernel sources for 2.6 kernels, and some "patches" for 3.x kernels and one or two repos for a 4.14 kernel (related to an asus storage device).
I started patching a 5.8 kernel I had lying around (shouldn't be too difficult to port to the last 5.x kernel, not sure about 6.x) to make it boot. The Intel Puma 6 SoC that this thing has is a really weird dual core Atom + an ARMv6 core. The Atom part is what boots the entire SoC, and it uses a proprietary "BIOS"/loader (Intel CEFDK), whose sources have leaked though. It lets us disable the ARM core completely and reclaim the RAM allocated to it (so in the end, the x86 side has the full 512MB of RAM available), boot bzImages from TFTP/ymodem, and modify the "startup script". That makes it technically possible to load a kernel that doesn't fit in the kernel partition that the modem uses (which is only 4MB, and my current kernel doesn't fit there).

I would like to know which kernel version I should target with my patches so I can use the latest OpenWRT on it, and whether someone has experience with these devices, because I'm still facing some issues.

My current kernel lets me boot to a debian rootfs from a USB drive. Ethernet works, but only as a member of the embedded marvell switch (i.e. I can't use VLANs to address each of the switch's ports, it's all bridged together). The 5.8GHz ath10k wlan card refuses to load the calibration data from the original modem rootfs (in which the ath10k driver had the whole calibration + OTP + firmware embedded as binary blobs, not even using the kernel firmware mechanism or trying to read these from a file). And I can't make the front LEDs behave as I want (i.e. the WLAN ones are handled by the qualcomm cards themselves, but the other ones seem to have been driven by the ARM core, which is dead now).
Other than that, the system behaves properly, the dual core Atom is really "good", and the ethernet works fine at its maximum speed of 1gbps (tested with iperf3).
The switch chip seems to be mapped to the Atom but it's hidden behind the "l2sw" (layer 2 switch, which I assume is the thing that interconnects the physical switch, the Atom and the ARM cores), and the Intel CEFDK sources have some pointers as to how to interact with the switch via MDIO. It would be cool to actually control the link between the Atom and the marvell switch so we could do VLANs properly.

My changes are here: https://github.com/cocus/puma6-kernel. Just use the "ce2600_bloat.config" config for the kernel, make it, and load the bzImage to your modem using ymodem or tftp. More info on my discoveries is here: https://github.com/cocus/dpc3848ve-stuff.

If anyone has some more insight and would like to help out, feel free to reply here. And if it's not clear, I'm not interested in the cable modem part of the modem, just the networking side of the Atom (ethernet, wifi, usb, etc).

The kernel v5 ship has sailed, you need to aim for v6.

You'd be adding a new target (unless it can somehow be shoehorned into x86_64) and new device support through the main branch, which is currently in the process of migrating to kernel v6.6 - so that's what you'd need to be aiming at (there's no point to even look at v5.15 or v6.1 for this).

Okay, makes sense, I'll start toying with kernel 6.6.
Should I use any configs from another x86 target as a base so the kernel is not that heavy? I assume I'll end up adding this target alongside the other x86 targets so we can easily re-use the kernel modules and packages from the other targets, right?
An interesting point is that this bootloader doesn't support FDTs, so I can't easily make this kernel coexist with other targets. Even worse, there are some "weird" changes to the serial/pci/usb drivers that I don't know how to make "portable" across other targets, other than guarding the changes with #ifdefs.

I've not used the build system from openwrt in ages, had to move to yocto and I kinda forgot the essence of openwrt :frowning:

Thanks for the replies and would like to hear more if someone already did something with these SoCs. Will also update once I have time to start working on this new port.

In the end it's a judgement call of how different puma is compared to normal x86_64 and if it can coexist. Adding targets is relatively heavy on the buildbot infrastructure, the mirror space and the human resources for ongoing maintenance. If that can be avoided good, but if adding puma support would impose a heavy burden on x86_64, that might not be great either - it all depends on how different puma really turns out to be.

…and how useful the resulting device support turns out to become (cable modem support is probably out of the question, WLAN?, switch support? performance?); source-only might be an option as well.

Just to begin with, the uart has a MAX_BAUDRATE of 921600 instead of 115200, thus any baudrate you choose won't work properly if that constant is wrong.
The USB core has some changes, I think most of them can be wrapped up with quirks, but don't quote me on that. Because fdt support is missing on the bootloader, it's not that trivial to figure out whether you need to change the baudrate and apply the USB quirks. PCI and the eMMC (SDIO) also have some changes that, even if they're minuscule, really do make a difference (work/not work).

Well, it has an ath9k (AR9381) and an ath10k (QCA9880), both in MIMO 3x3:3 configuration. A marvell 88E6172 (4x gigabit + 1x upstream to the SoC's internal Intel E1000 card). USB 2.0, 512M of RAM, and 128M of eMMC storage (can't be used in its full capacity, because the original partition scheme uses an A/B layout for the kernel and rootfs, so you can imagine it's a lot of space wasted). The Intel CEFDK is stored on a SPI memory (didn't even bother porting the SPI driver yet).
The dual core Atom runs at 1200MHz.
I'm using it as an x86, not x86_64. I'm not even sure if they support 64 bits (they're reported as Atom 625).

It's nothing to write home about, but it's not garbage either :slight_smile:

Ok, so this is the first roadblock I hit.
For some reason, this intel SoC uses a different BASE_BAUD than the PC. Not only that, but the 8250 device they have uses a different IRQ than on the PC.
The SoC's base baud is: #define BASE_BAUD (14745600/16)
Where the PC's BASE_BAUD is #define BASE_BAUD (1843200/16)
And:

#define SERIAL_PORT_DFNS								\
	/* UART		CLK		PORT	IRQ	FLAGS			    */	\
	{ .uart = 0,	BASE_BAUD,	0x3F8,	4,	STD_COMX_FLAGS	}, /* ttyS0 */	\
	{ .uart = 0,	BASE_BAUD,	0x2F8,	38,	STD_COMX_FLAGS	}, /* ttyS1 */	\
	{ .uart = 0,	BASE_BAUD,	0x3E8,	38,	STD_COMX_FLAGS	}, /* ttyS2 */	\
	{ .uart = 0,	BASE_BAUD,	0x2E8,	38,	STD_COM4_FLAGS	}, /* ttyS3 */

vs the standard PC:

#define SERIAL_PORT_DFNS								\
	/* UART		CLK		PORT	IRQ	FLAGS			    */	\
	{ .uart = 0,	BASE_BAUD,	0x3F8,	4,	STD_COMX_FLAGS	}, /* ttyS0 */	\
	{ .uart = 0,	BASE_BAUD,	0x2F8,	3,	STD_COMX_FLAGS	}, /* ttyS1 */	\
	{ .uart = 0,	BASE_BAUD,	0x3E8,	4,	STD_COMX_FLAGS	}, /* ttyS2 */	\
	{ .uart = 0,	BASE_BAUD,	0x2E8,	3,	STD_COM4_FLAGS	}, /* ttyS3 */

I don't see any mechanism to change any of this. The SoC is 32 bit only, no x86_64. Also, I'm not sure which OpenWRT targets use a pure 32 bit x86, and if they do, whether they use DT for them, or just different kernel configs.
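
To put numbers on the mismatch, here is a minimal Python sketch (not kernel code) of the divisor arithmetic. It assumes the usual 8250 behaviour, where the driver programs a divisor of roughly base baud / requested baud; the function names are made up for illustration.

```python
# UART input clocks taken from the two BASE_BAUD definitions quoted above.
PC_UART_CLOCK = 1843200
SOC_UART_CLOCK = 14745600


def base_baud(uart_clock: int) -> int:
    """An 8250-style UART divides its input clock by 16 to get the base baud."""
    return uart_clock // 16


def divisor_for(requested_baud: int, assumed_base_baud: int) -> int:
    """Divisor a driver would program, given the base baud it believes in."""
    return max(1, round(assumed_base_baud / requested_baud))


def actual_baud(divisor: int, real_base_baud: int) -> float:
    """Baud rate the hardware really produces for that divisor."""
    return real_base_baud / divisor


if __name__ == "__main__":
    pc = base_baud(PC_UART_CLOCK)
    soc = base_baud(SOC_UART_CLOCK)
    # A kernel built with the PC constant asks for 115200 baud...
    wrong_divisor = divisor_for(115200, pc)
    # ...but the SoC's faster clock turns that divisor into a different rate.
    print("PC base baud:", pc)
    print("SoC base baud:", soc)
    print("line rate with the PC divisor:", actual_baud(wrong_divisor, soc))
    print("divisor needed on the SoC:", divisor_for(115200, soc))
```

In other words, with the PC constant the line ends up running 8 times faster than the terminal expects, which is why the console output is garbage until the constant matches the hardware.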

What do you suggest? Thank you!

I'd like to discuss the other changes later, since this is kinda "important" (without a serial console I can't do much testing :slight_smile: )

Seeing the words "Puma" and "chipset" together tickled my memory, because of things like this;

Apart from the educational value, I'm not sure it's worth pursuing this device - although it's always possible that a more modern kernel & drivers contain the relevant fixes.

Well, that issue happens because of the crappy interface between the Atom cores and the ARMv6 one (I think). It's really weird and I highly dislike it.
On my unit, I've completely disabled the ARMv6 and reallocated its memory to the Atom, so it can enjoy the full 512MB. I think these issues shouldn't come into play at all in this scenario.

I've already ported the necessary changes to the tagged 6.6 kernel. Some of them weren't necessary anymore. The only outstanding change is to "really know" if you're running on the CE2600 chipset or not, so it can adjust the base baudrate for the serial port (to 921600) and change the reboot mechanism (otherwise rebooting would shut down the system).
The USB issues were easily added to quirks if the PCI vendor id and product id match. For the SD/MMC PCI controller I just added a quirk, and the same for the "reboot" (added a quirk).
Now, for the ethernet card (an E1000), I had to modify the driver but in a non-destructive way which should still work fine on other CE4100 targets. Turns out that Intel (which designed the reference hardware) used a marvell M88E6172 switch connected directly to the embedded E1000's RMII. The patches from the old kernel only made the kernel driver think it had a realtek phy, and faked the phy's registers so the system thinks everything's fine. I moved away from this, and added some code that properly identifies the marvell chip using their way of communicating through MDIO (which seems to be a little bit different from what the driver expects for other types of phys). This way it just works. It could be expanded to make the MDIC controller show up as an MDIO controller and register the M88E6xxx driver on it, but not for now.
SPI, I2C and GPIOs are still missing, but as previously stated, both wifi cards work just fine.

I'm still going through these changes to make them as slim as possible, and trying to figure out how to add a new early param to make the ce2600 detection work by passing a new cmdline argument.

Please note that I'm using debian bookworm as my rootfs (on a usb thumbdrive) rather than the emmc (because it's only 128mb).

Can you point out which x86 (not x64) targets I might take as a reference for my port? In the end, it's fine if these changes don't make it to the master branch, as long as I can build it and use it (and re-use the current packages for x86).

Well, let me add a progress update. I've managed to make OpenWRT work on this thing. It works fine. However, there are two downsides. The first one is that I had to create a custom target (target/linux/x86/intelce). However, I'd like to use the generic one, because I saw that both USB_STORAGE (for testing the rootfs on a thumb drive) and MMC_SDHCI_PCI (to access the embedded flash) are set, so that should suffice; but there are two things I consider crucial that I can't use on the generic target.
The first one is a patch (120-Fix-alloc_node_mem_map-with-ARCH_PFN_OFFSET-calcu.patch) that causes the kernel to crash on startup:

[    0.369581] percpu: Embedded 31 pages/cpu s96340 r0 d30636 u126976
[    0.375856] Kernel command line: earlyprintk=intelce console=uart,mmio32,0xdffe0200 panic=2 rootdelay=2 pci=nocrs root=/dev/sda1 rw
[    0.388354] Dentry cache hash table entries: 65536 (order: 6, 262144 bytes, linear)
[    0.396246] Inode-cache hash table entries: 32768 (order: 5, 131072 bytes, linear)
[    0.403892] Built 1 zonelists, mobility grouping on.  Total pages: 126751
[    0.410693] mem auto-init: stack:off, heap alloc:off, heap free:off
[    0.416964] Initializing HighMem for node 0 (00000000:00000000)
[    0.423359] BUG: kernel NULL pointer dereference, address: 00000000
[    0.429642] #PF: supervisor write access in kernel mode
[    0.434864] #PF: error_code(0x0002) - not-present page
[    0.439993] *pde = 00000000
[    0.442871] Oops: 0002 [#1] PREEMPT SMP
[    0.446708] CPU: 0 PID: 0 Comm: swapper Not tainted 6.6.23+ #4
[    0.452547] EIP: __free_one_page+0x104/0x28c
[    0.456833] Code: c1 e0 02 8d 4f 04 8d 14 06 03 55 e4 8b 9a 84 00 00 00 89 8a 84 00 00 00 8d 94 06 80 00 00 00 8b 75 e4 89 5f 08 01 f2 89 57 04 <89> 0b 83 84 06 a0 00 00 00 01 83 c4 14 5b 5e 5f 5d c3 8d b4 26 00
[    0.475622] EAX: 00000168 EBX: 00000000 ECX: dfc16c04 EDX: d9591250
[    0.481888] ESI: d9591040 EDI: dfc16c00 EBP: d9417ea0 ESP: d9417e80
[    0.488155] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00210086
[    0.494943] CR0: 80050033 CR2: 00000000 CR3: 19663000 CR4: 00000090
[    0.501211] Call Trace:
[    0.503653]  ? show_regs+0x50/0x58
[    0.507061]  ? __die+0x18/0x50
[    0.510112]  ? page_fault_oops+0x164/0x2f0
[    0.514206]  ? kernelmode_fixup_or_oops.constprop.0+0x6b/0xc0
[    0.519951]  ? __bad_area_nosemaphore.constprop.0+0xdc/0x188
[    0.525609]  ? printk_get_next_message+0x6f/0x2ac
[    0.530321]  ? bad_area_nosemaphore+0xa/0x10
[    0.534588]  ? do_user_addr_fault+0x1e8/0x3dc
[    0.538943]  ? exc_page_fault+0x47/0x118
[    0.542871]  ? doublefault_shim+0x130/0x130
[    0.547052]  ? handle_exception+0x133/0x133
[    0.551244]  ? doublefault_shim+0x130/0x130
[    0.555422]  ? __free_one_page+0x104/0x28c
[    0.559517]  ? doublefault_shim+0x130/0x130
[    0.563697]  ? __free_one_page+0x104/0x28c
[    0.567791]  ? prb_read_valid+0x24/0x30
[    0.571625]  __free_pages_ok+0x10d/0x348
[    0.575547]  __free_pages_core+0x7b/0x88
[    0.579474]  memblock_free_pages+0xa/0xc
[    0.583402]  memblock_free_all+0x16c/0x1fc
[    0.587497]  mem_init+0x2e/0x14c
[    0.590730]  mm_core_init+0x84/0x2e0
[    0.594301]  ? cpu_init+0x106/0x1a0
[    0.597788]  start_kernel+0x2cb/0x7d4
[    0.601447]  i386_start_kernel+0x43/0x44
[    0.605366]  startup_32_smp+0x156/0x158
[    0.609203] Modules linked in:
[    0.612252] CR2: 0000000000000000
[    0.615568] ---[ end trace 0000000000000000 ]---
[    0.620177] EIP: __free_one_page+0x104/0x28c
[    0.624445] Code: c1 e0 02 8d 4f 04 8d 14 06 03 55 e4 8b 9a 84 00 00 00 89 8a 84 00 00 00 8d 94 06 80 00 00 00 8b 75 e4 89 5f 08 01 f2 89 57 04 <89> 0b 83 84 06 a0 00 00 00 01 83 c4 14 5b 5e 5f 5d c3 8d b4 26 00
[    0.643228] EAX: 00000168 EBX: 00000000 ECX: dfc16c04 EDX: d9591250
[    0.649495] ESI: d9591040 EDI: dfc16c00 EBP: d9417ea0 ESP: d9417e80
[    0.655762] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00210086
[    0.662550] CR0: 80050033 CR2: 00000000 CR3: 19663000 CR4: 00000090
[    0.668820] Kernel panic - not syncing: Attempted to kill the idle task!
[    0.675528] Rebooting in 2 seconds..

Reverting this patch fixes it.
The second thing is the custom scripts for the leds and network (i.e. base-files/etc/board.d/{01_leds,02_network}).

The serial driver issue I presented before was easily fixed by adding a new "type" of earlyprintk (a fake one, so the early init code can print stuff, including the first part of the kernel boot), and since that new type isn't supported by the "later stage" of the kernel, the situation is ideal. I've also set the console to go through mmio, so there's no need to mess with the baudrate. In any case, the part of the init code that knows whether this is a ce2600 or not fixes up the serial port.
It's kinda convoluted, so I can only say thanks Intel! /sarcasm

I'd like to ask for some feedback on how to set up some aspects of the system. For instance, I'm mounting the rootfs through the kernel cmdline (root=/dev/sda1 rw), which works for testing, but not for actual usage. My plan is to switch to squashfs, so it's a little bit smaller and can be "updated". HOWEVER, I'm not seeing how I should mount an overlay partition automatically with OpenWRT methods (fstab?).

And finally, there's the big issue that when prepping the system up, you need to grab a calibration file for the 5.8GHz qca988x card; otherwise it refuses to load, (which also requires the board file and firmware for it). It's really easy to retrieve this file, either by letting OpenWRT run, or by grabbing it beforehand with the stock modem's firmware.

My changes are on my github (https://github.com/cocus/openwrt/commits/testing-intelce/). I'd like to refine this change set before opening a PR. I need some feedback from developers about the changes I've added, if they're where they should be, or not.
One thing I'm considering is to add all the kernel additions as files, rather than patches (and remove the defconfigs from the patches as well, no need for them).

Ok, the big question is "where" the overlay is being mounted. I don't have a GPT scheme, so there's no PARTLABEL available for mount_root to figure out where to mount stuff. I did format the partition which is RIGHT AFTER the rootfs as ext4, but I think it might be using the free space of the root partition? (i.e. the partition is 17MB, but the squashfs is ~4M, so it's using the remaining ~13MB for it?)
If that's the case, cool! But, I'd like to improve this a little bit further. I'd like for openwrt to repartition the emmc the first time it boots from it.

Current partition table is as follows:

root@OpenWrt:/tmp# partx -s --output-all /dev/mmcblk0
NR  START    END SECTORS  SIZE NAME UUID                                 TYPE FLAGS SCHEME
 1   4608  12799    8192    4M      d176ee89-01                          0x83 0x0   dos
 2  12800  20991    8192    4M      d176ee89-02                          0x83 0x0   dos
 3  20992  55807   34816   17M      d176ee89-03                          0x83 0x0   dos
 4  55808 230143  174336 85.1M      d176ee89-04                           0x5 0x0   dos
 5  56064  90879   34816   17M      d176ee89-05                          0x83 0x0   dos
 6  91136  95231    4096    2M      d176ee89-06                          0x83 0x0   dos
 7  95488  99583    4096    2M      d176ee89-07                          0x83 0x0   dos
 8 100608 106751    6144    3M      d176ee89-08                          0x83 0x0   dos
 9 107008 113151    6144    3M      d176ee89-09                          0x83 0x0   dos
10 113408 117503    4096    2M      d176ee89-0a                          0x83 0x0   dos
11 117760 121855    4096    2M      d176ee89-0b                          0x83 0x0   dos
12 122112 140543   18432    9M      d176ee89-0c                          0x83 0x0   dos
13 140800 159231   18432    9M      d176ee89-0d                          0x83 0x0   dos
14 159488 188159   28672   14M      d176ee89-0e                          0x83 0x0   dos
15 188416 217087   28672   14M      d176ee89-0f                          0x83 0x0   dos

(Note that partition 4 is an "extended" partition that contains the rest of the partitions.)
In all fairness, I need to grab a calibration file from the ext3 fs at mmcblk0p6, otherwise the ath10k doesn't start.
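
As a sanity check on the table, here is a small Python sketch that converts the sector ranges into sizes. It assumes 512-byte sectors (which matches the SIZE column) and only copies a few rows; the helper names are made up for illustration.

```python
SECTOR_SIZE = 512  # bytes; assumed, consistent with the SIZE column above
MIB = 1024 * 1024

# (number, start sector, end sector) copied from the partx output above.
PARTITIONS = [
    (1, 4608, 12799),
    (2, 12800, 20991),
    (3, 20992, 55807),
    (4, 55808, 230143),  # extended container holding partitions 5..15
]


def size_mib(start: int, end: int) -> float:
    """Size of an inclusive sector range in MiB."""
    return (end - start + 1) * SECTOR_SIZE / MIB


def gap_before_first_mib(partitions) -> float:
    """Unpartitioned space in front of the lowest-numbered start sector."""
    first_start = min(start for _, start, _ in partitions)
    return first_start * SECTOR_SIZE / MIB


if __name__ == "__main__":
    for number, start, end in PARTITIONS:
        print(f"p{number}: {size_mib(start, end):.3f} MiB")
    print(f"before p1: {gap_before_first_mib(PARTITIONS):.2f} MiB")
```

With a ~4M squashfs sitting in the 17M partition 3, the same arithmetic gives the ~13MB left over that the overlay could be using.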

I saw an example on how to load the calibration data for the ath10k on the ipq40xx, but those assume you still have the original partition where the calibration data still exists. If I re-partition the emmc, completely, so the first partition is the kernel (~6M for instance), then the second partition could be a backup of the original p6 partition (only 4M), and the remainder let it be the rootfs available for openwrt, would that work? How would that work with upgrades?

Do you think it makes sense for openwrt to do the repartition and copying of the calibration data automatically, or by using a script invoked by the user? Or how could this be addressed? Because if openwrt does it automatically, and the system hangs or the power is interrupted, you lose your calibration data forever. Same would apply if you do it manually, or if you run a script, although you could make a copy beforehand to a usb stick or copy it through scp.

I haven't figured any of this out from the documentation, because it assumes you have UBI/MTD, not an eMMC.

Ok, I repartitioned the eMMC so it has only 3 partitions: kernel, copy of the "nvram" partition and rootfs. This yields a lot of free space on the rootfs. Sadly, since x86 doesn't have an upgrade mechanism, the only way to properly upgrade things is through manually running dd commands. Not bad for now.

After a full day of work, I was able to expose the intel E1000 MDIC bus as a MDIO bus, and register the marvell chip using platform devices.
Incredibly, it works! It's not set up as a plain switch, but rather as a DSA switch (because the only driver available in the kernel supports this).
I've added a script on hotplug.d/iface so it changes the MAC address of each interface based on the root MAC of the E1000 nic.
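
The real script is a shell hotplug script, so take this only as a minimal Python sketch of the idea. It assumes each port simply gets the base MAC plus a small offset, which may not be exactly what my script does; the function names and the example base address are made up for illustration.

```python
from typing import Dict, List


def parse_mac(mac: str) -> int:
    """Turn 'aa:bb:cc:dd:ee:ff' into a 48-bit integer."""
    return int(mac.replace(":", ""), 16)


def format_mac(value: int) -> str:
    """Turn a 48-bit integer back into colon-separated hex."""
    raw = f"{value & 0xFFFFFFFFFFFF:012x}"
    return ":".join(raw[i:i + 2] for i in range(0, 12, 2))


def derive_macs(base_mac: str, interfaces: List[str]) -> Dict[str, str]:
    """Assign base+1, base+2, ... to the given interfaces, in order.

    The increment-by-one scheme is an assumption for this example.
    """
    base = parse_mac(base_mac)
    return {
        name: format_mac(base + offset)
        for offset, name in enumerate(interfaces, start=1)
    }


if __name__ == "__main__":
    # Placeholder base MAC (locally administered), not the one from my device.
    ports = ["lan1", "lan2", "lan3", "lan4"]
    for name, mac in derive_macs("02:00:00:00:00:10", ports).items():
        print(name, mac)
```

Doing the addition on the whole 48-bit value means the carry goes into the next octet when the last one overflows.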


Certainly a great success! With these changes, this device becomes REALLY useful (maybe not up to 802.11 ax/be), but considering it was scrap... And ran a 2.6 kernel! With that ARM core as a tumor!!!

I have 6 patches to enable support for this, which I hope I'll be able to squash into two (the first one to revert that patch that causes this SoC to crash, and the other one, adding support for everything here). These are on target/linux/x86/patches-6.6, starting from a patch with the number 390, up until 395. I think the numbers are okay, if I understood the doc correctly. It's such a shame that I can't put these on something like target/linux/x86/intelce/patches-6.6.

I hope I can guard all my "horrible" changes with the appropriate ifdefs, like when selecting the CE2600 architecture. Hopefully it's not that hacky.

Some notes: Even if the I2C might work with some changes I made to the 6.6 tree, the lines are used for at least one LED, so.. I don't think they're using it.
SPI seems to be a little bit difficult to get it going, or at least, know where the pins are. There's a microchip POTS chip with two lines, which gets configured by SPI and uses I2S for audio. Not sure how the SoC exposes audio, couldn't find any drivers for that.

Comments will be more than well received! As usual, my changes are on https://github.com/cocus/openwrt/tree/testing-intelce

Well, I ditched the I2S and audio for this target, not gonna happen.

The only thing I'm missing is the upgrade. I understand that x86 doesn't have a way to be upgraded, but this device doesn't behave like a normal x86 PC or server. I'm currently using the squashfs rootfs, and it creates an overlay RIGHT AFTER the last usable block of the squashfs image (up until the end of the partition). However, doing a manual upgrade with dd nukes this. You also need to dd the kernel to another partition.
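
To illustrate where that overlay starts, here is a minimal Python sketch that reads the size of the squashfs from its superblock and rounds it up. The bytes_used field at offset 40 is part of the squashfs 4.x superblock; the 64 KiB alignment is my assumption about what OpenWRT's mount_root does, so check fstools before relying on it. The function names are made up for illustration.

```python
import struct

SQUASHFS_MAGIC = 0x73717368   # reads as "hsqs" on disk (little-endian)
BYTES_USED_OFFSET = 40        # 64-bit bytes_used field in the superblock
OVERLAY_ALIGN = 64 * 1024     # assumed alignment, not verified against fstools


def overlay_offset(superblock: bytes, align: int = OVERLAY_ALIGN) -> int:
    """Offset (in bytes from the partition start) where an overlay could begin."""
    (magic,) = struct.unpack_from("<I", superblock, 0)
    if magic != SQUASHFS_MAGIC:
        raise ValueError("not a little-endian squashfs image")
    (bytes_used,) = struct.unpack_from("<Q", superblock, BYTES_USED_OFFSET)
    # Round up to the next multiple of the alignment.
    return (bytes_used + align - 1) // align * align


def overlay_space(partition_size: int, superblock: bytes) -> int:
    """Bytes left for the overlay between the squashfs and the partition end."""
    return partition_size - overlay_offset(superblock)


if __name__ == "__main__":
    import sys

    # Usage: pass the squashfs image or the partition device as the argument.
    with open(sys.argv[1], "rb") as image:
        print(overlay_offset(image.read(96)))
```

This is also why a plain dd of a new squashfs nukes the overlay: the new image has a different size, so the old overlay data either gets overwritten or no longer sits where it is expected.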

As a side note, I've figured out that using the switch in DSA mode cuts the network throughput to less than half, or even to about a third. For instance, without DSA, I get around 970mbps with iperf3 (which is okay). But with DSA directly on a port (i.e. not going through the bridge), I get around 400mbps. And finally, using a bridge on all DSA ports yields around 300mbps. Because of this, I just tweaked the e1000 driver so it now accepts a parameter that controls whether the marvell switch gets registered at all (of course, this only applies to the ce2600/intel puma target). If set to 0, it won't register the switch, and you'll get only an eth0, with the full throughput. If not, the switch gets registered as DSA, and you get lan1..4 (at the expense of losing throughput). This argument can be added to the cmdline, so it's trivial for anyone to change it if required.

But the main problem that I'm still having:
How can I leverage the sysupgrade capability to flash two partitions given a single package? It's literally the same idea behind flashing the eMMC on a router that has an eMMC (i.e. no UBI, no FTLs, etc).
Can anyone point to an example or what should I modify to get sysupgrade happy? And/or what to tweak on the makefiles to actually generate an image that sysupgrade likes?

Thank you!

sysupgrade works normally (and just fine) on x86_64, for squashfs and ext4, for UEFI and BIOS images.

How does that work? Which file do you feed into sysupgrade? My build only generates the squashfs.img and the kernel.bin. Am I missing a config?

And also, yes, that might work on UEFI/BIOS, but that's sadly not my case. I have my kernel on /dev/mmcblk0p1 (raw), and the rootfs on mmcblk0p3. No GRUB or similar. Which files should I look at so I can change the behavior?

BusyBox v1.36.1 (2024-03-22 22:09:42 UTC) built-in shell (ash)

  _______                     ________        __
 |       |.-----.-----.-----.|  |  |  |.----.|  |_
 |   -   ||  _  |  -__|     ||  |  |  ||   _||   _|
 |_______||   __|_____|__|__||________||__|  |____|
          |__| W I R E L E S S   F R E E D O M
 -----------------------------------------------------
 OpenWrt 23.05.3, r23809-234f1a2efa
 -----------------------------------------------------
root@OpenWrt:/# sysupgrade https://downloads.openwrt.org/releases/23.05.3/targets/x86/64/openwrt-23.05.3-x86-64-generic-squashfs-combined.img.gz
Downloading 'https://downloads.openwrt.org/releases/23.05.3/targets/x86/64/openwrt-23.05.3-x86-64-generic-squashfs-combined.img.gz'
Connecting to 146.75.122.132:443
Writing to '/tmp/sysupgrade.img'
/tmp/sysupgrade.img  100% |*******************************| 10224k  0:00:00 ETA
Download completed (10469466 bytes)
Mon Apr 29 00:43:42 UTC 2024 upgrade: Image metadata not present
Mon Apr 29 00:43:43 UTC 2024 upgrade: Reading partition table from bootdisk...
Mon Apr 29 00:43:43 UTC 2024 upgrade: Extract boot sector from the image
Mon Apr 29 00:43:43 UTC 2024 upgrade: Reading partition table from image...
Mon Apr 29 00:43:44 UTC 2024 upgrade: Saving config files...
Mon Apr 29 00:43:44 UTC 2024 upgrade: Commencing upgrade. Closing all shell sessions.
Command failed: Connection failed
root@OpenWrt:/# Mon Apr 29 00:43:46 UTC 2024 upgrade: Sending TERM to remaining processes ...
Mon Apr 29 00:43:50 UTC 2024 upgrade: Sending KILL to remaining processes ...
[   77.244347] stage2 (2784): drop_caches: 3
Mon Apr 29 00:43:56 UTC 2024 upgrade: Switching to ramdisk...
Mon Apr 29 00:44:01 UTC 2024 upgrade: Performing system upgrade...
Mon Apr 29 00:44:01 UTC 2024 upgrade: Reading partition table from bootdisk...
Mon Apr 29 00:44:01 UTC 2024 upgrade: Extract boot sector from the image
Mon Apr 29 00:44:01 UTC 2024 upgrade: Reading partition table from image...
Mon Apr 29 00:44:01 UTC 2024 upgrade: Writing image to /dev/sda1...
Mon Apr 29 00:44:02 UTC 2024 upgrade: Writing image to /dev/sda2...
dd: error writing '/dev/sda2': No space left on device
Mon Apr 29 00:44:02 UTC 2024 upgrade: Writing new UUID to /dev/sda...
[   83.414420] EXT4-fs (sda1): mounted filesystem without journal. Opts: (null). Quota mode: disabled.
Mon Apr 29 00:44:02 UTC 2024 upgrade: Upgrading bootloader on /dev/sda...
[   83.563544] EXT4-fs (sda1): mounted filesystem without journal. Opts: (null). Quota mode: disabled.
Mon Apr 29 00:44:03 UTC 2024 upgrade: Upgrade completed
[…]

Ok, I'll have a look, because I don't have that "combined" image:

$ ls bin/targets/x86/intelce/
config.buildinfo                             openwrt-x86-intelce-intelce-kernel.bin     openwrt-x86-intelce-intelce-squashfs-rootfs.img  sha256sums
feeds.buildinfo                              openwrt-x86-intelce-intelce.manifest       packages                                         version.buildinfo
openwrt-x86-intelce-intelce-ext4-rootfs.img  openwrt-x86-intelce-intelce-rootfs.tar.gz  profiles.json

Also, as far as I saw on that log, it switched to a ramdisk (because the source rootfs is going to be updated in-place), where does that one come from? Or is it a straight copy of the current rootfs but on ram?

It's a limited initramfs assembled (copied) from the running system. The log above was gathered from qemu(-kvm), so you can easily test it in a vm - but I do also use and regularly sysupgrade it on real (UEFI) x86_64 hardware.

Ok, I have the combined image. I understand it only creates two partitions, which are different than what I have configured on my device.
Also, I see something which is quite worrying, which is that if it detects that the partition layout changed, it'll write the entire image. That's NOT good, since the emmc contains the bootloader (i.e. the BIOS, not GRUB or any loader that's loaded by the BIOS/EFI). Not only that, but the offset to the first partition is also critical (because that's where the bootloader is stored).

Do you think that if I modify gen_image_generic.sh and set the appropriate offsets, this would leave my bootloader out of the equation? (i.e. not get overwritten under any circumstance). I also saw that the script will create a DOS (or ext4) partition for the kernel, which WON'T work on my device because the kernel is stored RAW on a partition.

Ok, finally, good news! I provided my own "platform.sh" for /lib/upgrade, and a custom "79_move_config" on /lib/preinit. I'm using the original "nvram" partition for the temporary storage of the config after a successful upgrade. Of course, the custom platform.sh does the heavy lifting of converting the partition scheme that expects grub into the one on the modem, but it's just a simple matter of:

  • Assuming the first partition of the image contains an ext4/fat container where "boot/vmlinuz" exists.
  • Mounting that container, and dd'ing the contents of "boot/vmlinuz" to the first partition on the device (where the kernel is, and its offset is hardcoded on the bootloader script).
  • Using the second partition of the image as the rootfs, but writing it to the third partition on the device.
  • Profit!
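
The actual platform.sh is a shell script, so the following is only a minimal Python sketch of the first part of those steps: locating the two partitions inside a combined image with a DOS (MBR) partition table. The function and type names are made up for illustration, and it assumes 512-byte sectors and does not handle GPT images.

```python
import struct
from typing import Dict, List, NamedTuple

SECTOR_SIZE = 512         # assumed sector size of the combined image
MBR_TABLE_OFFSET = 446    # four 16-byte partition entries start here
MBR_SIGNATURE = b"\x55\xaa"


class Partition(NamedTuple):
    index: int
    type_id: int
    offset: int   # bytes from the start of the image
    length: int   # bytes


def read_mbr_partitions(boot_sector: bytes) -> List[Partition]:
    """Parse the primary partition entries out of the first sector."""
    if len(boot_sector) < SECTOR_SIZE or boot_sector[510:512] != MBR_SIGNATURE:
        raise ValueError("missing MBR boot signature")
    parts = []
    for i in range(4):
        start = MBR_TABLE_OFFSET + 16 * i
        entry = boot_sector[start:start + 16]
        type_id = entry[4]
        start_lba, sectors = struct.unpack_from("<II", entry, 8)
        if type_id != 0 and sectors != 0:
            parts.append(Partition(i + 1, type_id,
                                   start_lba * SECTOR_SIZE,
                                   sectors * SECTOR_SIZE))
    return parts


def plan_upgrade(parts: List[Partition]) -> Dict[str, Partition]:
    """Map image partitions to device nodes, following the list above."""
    if len(parts) < 2:
        raise ValueError("expected a boot partition and a rootfs partition")
    return {
        # Only boot/vmlinuz from inside this container goes to p1, raw.
        "/dev/mmcblk0p1": parts[0],
        # The rootfs partition of the image is written to p3 as-is.
        "/dev/mmcblk0p3": parts[1],
    }
```

Since only those two device partitions are ever written, the area before the first partition (where the bootloader lives) is never touched, no matter what offsets the combined image uses.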

I'm currently playing with the dsa marvell chip disabled, and using vlans on eth0 to get the full capabilities of the network, using the boot cmdline that disables the registering of the switch (but it's still present on the mdio bus, at address 4).

I'd like to open a PR for this, if it makes sense. I suspect that my .patch that adds support for this device on the kernel would require some "changes" to be up to OpenWRT's code standard, but I tried my best.

Do you think opening a PR on github is enough?