Sysupgrade fails on my own development device, why?

Hi all,

When I run sysupgrade on my own device, it always fails with:

root@OpenWrt:/# sysupgrade -v -n /tmp/openwrt-sysupgrade.img 
Commencing upgrade. Closing all shell sessions.
Failed to exec upgraded.
Command failed: Unknown error

I tried to debug this by modifying the procd code: I printed out the command being executed and removed the requirement that upgraded runs as pid 1.

/sbin/upgraded /tmp/openwrt-sysupgrade.img /lib/upgrade/do_stage2 (null)

When I run the above by hand, it looks like upgraded can start do_stage2:

root@OpenWrt:/# /sbin/upgraded /tmp/openwrt-sysupgrade.img /lib/upgrade/do_stage2
killall: telnetd: no process killed
do platform_pre_upgrade, kill daemons...
Switching to ramdisk...

My question is: how do I debug this problem?
I double-checked free memory (48 MB), which should be enough (the firmware image is 12 MB).
I need someone to guide me on how to debug it. I'd appreciate it. Thanks.

Upgrading from what to what?

you need to make it clear in the title/op that this is your own magic/attempted image generation...

no... don't do this... there are plenty enough hooks available to debug without messing with C and it's unlikely this is related to any issue in that space

Hi wulfy23,

First, I create sysupgrade.img like this

IMAGE/sysupgrade.img := rts-sysupgrade-tar | append-metadata

Since my kernel.bin is created by a Python tool, I define:

define Build/rts-sysupgrade-tar
        sh $(TOPDIR)/scripts/sysupgrade-tar.sh \
                --board $(if $(BOARD_NAME),$(BOARD_NAME),$(DEVICE_NAME)) \
                --kernel $(KERNEL_BUILD_DIR)/vmlinuz.img \
                --rootfs $(KERNEL_BUILD_DIR)/root.$(FILESYSTEMS) \
                $@
endef

Could you please guide me on how to use the hooks to debug? Thanks.

possibly... the area you are talking about typically relates to target/files/LIB code... ( unless there is a fundamental resource issue )...

suggest you start looking at the cause not the symptom...

could you please fix your title

If the title is not okay with you, please fix it. Thanks.

Are there any hooks for printing debug output that I can use in the scripts or in the C code? Thanks.

get a console and put 'set -x' at the top of 'do_stage2' and 'stage2' for a start...

#!/bin/sh

set -x

. /lib/functions.sh

#!/bin/sh

set -x

. /lib/functions.sh
. /lib/functions/system.sh

root@OpenWrt:/lib/upgrade# sysupgrade /tmp/openwrt-sysupgrade.img
Saving config files...
Commencing upgrade. Closing all shell sessions.
Failed to exec upgraded.
Command failed: Unknown error

I added set -x to the do_stage2 and stage2 scripts, but no extra log output appears.
Is there anything more I can do to debug? Thanks.

you are on a console?

Yes, I always work on console.

well you may as well throw one at the top of /sbin/sysupgrade then...

or you can try something like;

exec >>/dev/console 2>&1

but only to find your bug... (may or may not work)

other than that just echo the stuff to /dev/console where you want to verify... pretty basic stuff...

Hi @anon50098793

I think I need your advice to dig deeper. Thanks.

  1. When the error shows up, I'm sure the last thing the sysupgrade script runs is:
ubus call system sysupgrade '{
                "prefix": "/tmp/root",
                "path": "/tmp/openwrt-sysupgrade.img",

                "backup": "/tmp/sysupgrade.tgz",
                "command": "/lib/upgrade/do_stage2",
                "options": {
                        "save_partitions": 1
                }
        }'

And I can reproduce the issue by typing the above command again.

  2. I printed out the command that execvp runs; it is:
/sbin/upgraded /tmp/openwrt-sysupgrade.img /lib/upgrade/do_stage2
  3. So I replaced /sbin/upgraded with:
#!/bin/sh

set -x

echo okay > /dev/console

When I run the above command by hand, I can see okay printed on my console.

  4. I ran sysupgrade again, and no okay shows up!
Failed to exec upgraded.
Command failed: Unknown error

So it looks like execvp did not execute /sbin/upgraded. Is there anything more I can dig into? Thanks.

By the way, I already tested an execvp example on my device, and it works:

#include <stdio.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
    char *args[] = { "/sbin/upgraded", "/tmp/openwrt-sysupgrade.img", "/lib/upgrade/do_stage2", NULL };

    /* execvp only returns if it failed to start the program */
    execvp(args[0], args);
    perror("execvp");
    return 1;
}

root@OpenWrt:/tmp# hello
this tool needs to run as pid 1

read my comments here

generally as a rough clue... most likely candidates are related to;

  • hanging processes
  • filesystem setup drama
  • botched environment
  • some whacky proc/<cgroup/kexec/selinux> type crap that is affecting the above...

the only thing that seems applicable to 'targeting' /sbin/upgraded is some sort of watchdog failure maybe...

but if you are 'stuck' on digging deeper... you might have to insert some C within it... really roundabout way of extrapolating where you are falling down in a target definition

for anything else you're going to have to share everything you have and what you based it on... beats hypothesizing...

When you say "I try to run above command", do you mean the ubus call? The sysupgrade script normally copies /sbin/upgraded to /tmp/root before the ubus call, which uses /tmp/root as a chroot. I suspect that part is working fine; you just need to do the copy manually first.

Worth noting too, that the Failed to exec upgraded line gets printed when upgraded returns (it expects it to upgrade and reboot), so even with your replacement that will still happen.

Yes, I ran it manually. By the way, I know sysupgrade installs upgraded into /tmp/root.
So, I will remove the install command from sysupgrade, run ubus call manually and check the log.
Thanks.

Ah ok. Poking around the code it seems like either upgraded is not getting executed (unlikely), or something is failing in upgraded before (or just inside) the call to watchdog_init, and not printing anything.

This is based on seeing this when I run sysupgrade:

Thu Jan  1 00:00:22 UTC 1970 upgrade: Commencing upgrade. Closing all shell sessions.
Watchdog handover: fd=3
- watchdog -
...

Those watchdog lines come from watchdog_init.

Have you tried adding a printf very early in the existing upgraded code (at the top of main)? That might narrow it down.
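
If stdout is not reaching the console at that point, a plain printf could get swallowed. Below is a minimal sketch of a marker that writes straight to a device node instead. It assumes /dev/console is the console you are watching on this board, and `debug_mark` is a made-up name, not an existing procd function:

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical debug helper: write `msg` directly to a device node such as
 * "/dev/console", bypassing stdio buffering and any redirected stdout/stderr.
 * Returns 0 if the whole message was written, -1 otherwise.
 */
int debug_mark(const char *dev, const char *msg)
{
	size_t len = strlen(msg);
	ssize_t written;
	int fd;

	/* O_NOCTTY: do not let the device become our controlling terminal */
	fd = open(dev, O_WRONLY | O_NOCTTY);
	if (fd < 0)
		return -1;

	written = write(fd, msg, len);
	close(fd);

	return (written == (ssize_t)len) ? 0 : -1;
}
```

Something like `debug_mark("/dev/console", "upgraded: entered main\n");` as the first statement of main would then show whether the binary starts at all.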

I'd also print out the errno in case execvp fails in sysupgrade.c; that would be something like this:

if (execvp(argv[0], argv))
  fprintf(stderr, "upgraded error: %s\n", strerror(errno));

It's always possible for this kind of thing to be caused by hardware or some kind of messed up installation. You could try running it from failsafe or an initramfs (if you haven't already).

Yes, I already tried that; the log at the first line of main never shows up.

if (execvp(argv[0], argv))
  fprintf(stderr, "upgraded error: %s\n", strerror(errno));

Yes, I already tried that too; it always returns -1.

You could try running it from failsafe or an initramfs (if you haven't already).

Thanks. I will try it later to see what happens. ^^

It would be helpful to see the actual error number (rather than the return value), as that gives more detail about what went wrong; it will be one of these: https://man.archlinux.org/man/execve.2#ERRORS

Looking at what execvp/execve does, I would definitely classify this as a problem with executing upgraded, as execve doesn't return if the program exits/crashes, only if it fails to execute.
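
To get at the error number rather than the -1, something along these lines might help. It is only a sketch: `exec_or_report` is a hypothetical wrapper, not the actual code in sysupgrade.c, and the message format is my own choice. The point is to save errno immediately after execvp returns, before any other call can overwrite it:

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical wrapper: try to exec argv[0]. On success this never returns.
 * On failure, print errno as both number and text, and return it.
 */
int exec_or_report(char *const argv[])
{
	int saved;

	execvp(argv[0], argv);

	/* Only reached if the exec itself failed; keep errno before it changes */
	saved = errno;
	fprintf(stderr, "execvp(%s) failed: errno=%d (%s)\n",
		argv[0], saved, strerror(saved));

	return saved;
}
```

With the argv built for upgraded, the printed number can be looked up in the execve error list linked above. ENOENT and EACCES would both point at the file (or its loader) under the chroot prefix rather than at anything upgraded does once it runs.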
