ImageBuilder make image segfaulting

I'm currently building newer snapshot images for the Zyxel Armor Z2 (ipq806x) with the packages I want. However, during the "Installing package" phase, where the script downloads and installs the packages I specified, the script aborts with a segfault at some point. This never happened before; until now I was able to build images with ImageBuilder successfully.

The point at which the segfault occurs is not random. ImageBuilder seems to segfault once the installed packages reach a certain total size. I'm not sure how to debug this issue.

Same here for other architectures also (mipsel_24kc, arm_cortex-a15_neon-vfpv4, mips_24kc) since yesterday. Seems like an error in packaging right after download has finished.

make[2]: *** [Makefile:161: package_install] Segmentation fault (core dumped)
make[1]: *** [Makefile:118: _call_image] Error 2
make: *** [Makefile:211: image] Fehler 2

Besides that, some packages seem to be missing or not built, e.g. at least stubby for my builds.

Lots of packages seem to fail to build in buildbot.
Faillogs e.g. in

Hard to see what is the ultimate culprit change (in the dependency chain), but hostapd, openssl and cryptodev-linux failing are pretty core.

Nothing obvious jumps out from the commit logs, but there were several patches targeting the build system from @nbd yesterday.

EDIT:
cryptodev-linux causes failure in openssl, which breaks hostapd, etc...

Indeed. I was not digging further. I just saw the error message and the message that there is no package "stubby" and tested 2 other architectures. Same Makefile error and stubby ofc.

Besides that, I have the feeling that ImageBuilder is less reliable for creating snapshot builds than using git and menuconfig. I have only been using ImageBuilder for a few weeks, and most of the time there is an error that stops the build process completely or keeps it from finishing (I'm not changing packages or anything like that!). Is this just temporary (bigger changes) or normal?

I have never used the imagebuilder.
The full toolchain is easy to use once you get familiar with the basics.

Ps.
If you look at my community build threads, I have created nifty build scripts that streamline the build process with the full toolchain.
Scripts explained in

Me neither. I just started to use it because it is quite comfortable and fast to test a setup if you are not on the workstation hosting your build environment.
I think the ImageBuilder system is another layer that can have errors of its own, so it is more error-prone than just using git. But that is just a guess.

Not sure what's really going on behind this issue, but hope things can go back quick.

I mostly use ImageBuilder because the full toolchain looked a bit complicated to me... I used the full toolchain before but gave up shortly after because the package selection menuconfig was so huge that it would take forever for me to pick up every package I wanted by hand, and I don't know of an option to just pass the packages I wanted into the command line as I would with ImageBuilder...
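For the full buildroot, one way to avoid picking every package by hand in menuconfig is to append CONFIG_PACKAGE_<name>=y lines to .config and let make defconfig fill in the rest. A minimal sketch, assuming the target and device are already selected in .config; the package list is only an example (stubby is from this thread, luci is my own pick), and the final command is shown as a comment rather than executed:

```shell
#!/bin/sh
# Example package list (an assumption for illustration, adjust to taste).
packages="luci stubby"

# Turn the space-separated list into .config seed lines, one per package.
seed=""
for pkg in $packages; do
    seed="${seed}CONFIG_PACKAGE_${pkg}=y
"
done

# Show what would be appended to .config.
printf '%s' "$seed"

# Inside a buildroot checkout one would then run (not executed here):
#   printf '%s' "$seed" >> .config && make defconfig
```

This keeps the package selection in a plain list, much like the PACKAGES variable of ImageBuilder, at the cost of still having to compile everything locally.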

On the other hand, with ImageBuilder, I noticed that the usual packages folder is absent (along with the necessary kmods). Could that be one of the causes of the segfault?

There are a lot of packages not built or absent. It is unclear why at the moment (I think).

If you look into changelog of buildbot:

2 days ago Paul Spooren phase2: use full git history for reproducibility
2 days ago Jo-Philipp... phase1: add separate option for kmod repo embedding

As there are a lot of kmods missing, it could be phase1 related (which involved some major changes?). It would fit the timeframe (for me at least; 13.11.2020 was still working). But that is just speculation on my part (I don't really have a clue what is going on :smiley: ).

More likely the kernel config symbol handling changes

My money is actually on
https://git.openwrt.org/?p=openwrt/openwrt.git;a=commitdiff;h=5d7606562940b52206712bb4bc274ad39521c3e1

due to cryptodev-linux.symvers': No such file or directory

Context

make[4]: Leaving directory '/builder/shared-workdir/build/sdk/build_dir/target-arm_cortex-a15+neon-vfpv4_musl_eabi/linux-armvirt_32/cryptodev-linux-cryptodev-linux-1.10'
mv: cannot move '/builder/shared-workdir/build/sdk/build_dir/target-arm_cortex-a15+neon-vfpv4_musl_eabi/linux-armvirt_32/cryptodev-linux-cryptodev-linux-1.10/Module.symvers' to '/builder/shared-workdir/build/sdk/build_dir/target-arm_cortex-a15+neon-vfpv4_musl_eabi/linux-armvirt_32/symvers/cryptodev-linux.symvers': No such file or directory
Makefile:58: recipe for target '/builder/shared-workdir/build/sdk/build_dir/target-arm_cortex-a15+neon-vfpv4_musl_eabi/linux-armvirt_32/cryptodev-linux-cryptodev-linux-1.10/.built' failed

That commit moved the mkdir -p $(PKG_SYMVERS_DIR) step. Possibly in a way that buildbot does not like (although the build may succeed on a unified build process in a local buildhost).
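The failure mode in that log can be reproduced outside the build system. A minimal sketch that assumes nothing about the actual OpenWrt makefiles: the directories below are throwaway temp paths standing in for the package build dir and the symvers dir, not the real PKG_SYMVERS_DIR.

```shell
#!/bin/sh
# Illustration only: throwaway paths, not the actual OpenWrt recipe.
work=$(mktemp -d)
pkg_build_dir="$work/cryptodev-linux-1.10"
symvers_dir="$work/symvers"

mkdir -p "$pkg_build_dir"
: > "$pkg_build_dir/Module.symvers"

# Without the target directory, mv fails with "No such file or directory".
if mv "$pkg_build_dir/Module.symvers" "$symvers_dir/cryptodev-linux.symvers" 2>/dev/null; then
    first_attempt=ok
else
    first_attempt=failed
fi

# Creating the directory first lets the same mv succeed.
mkdir -p "$symvers_dir"
if mv "$pkg_build_dir/Module.symvers" "$symvers_dir/cryptodev-linux.symvers"; then
    second_attempt=ok
else
    second_attempt=failed
fi

echo "first: $first_attempt, second: $second_attempt"
rm -rf "$work"
```

So if the mkdir -p step ends up running after the mv (or not at all) in the buildbot's SDK build, the symptom would look exactly like the log above.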

segfault is also @libc ( install exception ) and was introduced ( at least the recent major triggering of it ) around the time of the wireguard commits... ( ~3-Nov )

pkg_hash_fetch_best_installation_candidate: libc arch=aarch64_cortex-a72 arch_priority=200 version=1.1.24.
pkg_hash_fetch_best_installation_candidate: Candidate: libc 1.1.24.
pkg_hash_fetch_unsatisfied_dependencies: satisfying_pkg=0x7f9aff5f6a00
pkg_hash_fetch_unsatisfied_dependencies: satisfying_pkg=(nil)
pkg_hash_fetch_unsatisfied_dependencies: satisfying_pkg=(nil)
Makefile:157: recipe for target 'package_install' failed
make[2]: *** [package_install] Segmentation fault
Makefile:112: recipe for target '_call_image' failed
make[1]: *** [_call_image] Error 2
Makefile:210: recipe for target 'image' failed
make: *** [image] Error 2

Not sure what happened in the last few days, first the OpenWrt downloads server went down, and after it went back up this happened (the ImageBuilder broke).

As of today it seems the kmods (the packages folder) are still not present in the ImageBuilder archives... guess I won't be able to make any new builds this way for the time being.

Will keep an eye on any updates regarding this.

this is by design, it is no longer created... so that's 'normal' ( now )

Is nobody looking into it? The buildbot is still producing useless output. It is not as bad as last time, when the built images were bricking the whole device (this time the build simply does not finish). But if a lot of things are going wrong or failing on a server, I would assume that someone gets an email and at least stops the builds. Is there no snapshotting available on the server itself, e.g. to go back 3 days and serve that state online while the most recent state gets fixed and is brought back later? I'm getting the feeling that buildbot snapshots are useless for me. :smiley:

So with the packages folder gone, how should I add additional packages (ones that aren't part of any repository) to ImageBuilder from now on?

On the other hand, I guess I'll have to wait for the issues to be resolved before I can build updated images again...

Not sure what happened but I'm suspecting the last outage of the downloads server might be related, as the ImageBuilder was said to be fine until Nov 13, which was just before the OpenWrt downloads server went down (that would be Nov 14 in the place I live, when I noticed).

good question...

hopefully someone will document all the new changes once they are complete? like you ( everyone else? )... i'm just trying to work things out via usage... which does get tricky when things are broken...

so long as nobody's router is getting bricked... i'm ok with these issues... so long as the end product is one that is better ( or at least equivalent )... they'll probably be fixed in a day and we'll never see them discussed again?

my use case for community builds is that I use the imagebuilder once... from this point on... i 're-use' a self-contained imagebuilder which for the 'new' format, seems to work well with...

repositories.conf

src imagebuilder file:FOLDERNAME #(copy of ./dl)

I can then rebuild exactly the same image with the same or less packages without having to rely on the upstream servers...
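A rough sketch of that setup, going only by the description above. All paths are hypothetical: a temp directory stands in for the ImageBuilder tree, local-packages is an assumed name for FOLDERNAME, and the .ipk file is an empty placeholder. Whether the folder also needs a package index is not something this sketch covers.

```shell
#!/bin/sh
# Hypothetical layout: a throwaway directory stands in for the ImageBuilder tree.
ib_dir=$(mktemp -d)
local_repo="$ib_dir/local-packages"   # assumed name for FOLDERNAME

mkdir -p "$ib_dir/dl" "$local_repo"
: > "$ib_dir/dl/example_1.0-1_all.ipk"   # empty placeholder for a downloaded package

# Keep a copy of the downloaded packages next to the ImageBuilder.
cp "$ib_dir"/dl/*.ipk "$local_repo"/
copied_count=$(ls "$local_repo" | grep -c '\.ipk$')

# Point repositories.conf at the local copy, as in the line quoted above.
printf 'src imagebuilder file:%s\n' "$local_repo" > "$ib_dir/repositories.conf"
repo_line=$(cat "$ib_dir/repositories.conf")

echo "$repo_line"
echo "packages copied: $copied_count"
expected_line="src imagebuilder file:$local_repo"
rm -rf "$ib_dir"
```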

peaks in the graph above are representative of buildsystem / imagebuilder issue dates over the last month... the community build seems to serve to offset official issues... and provide a fallback when peoples devices are not supported via official release...

Just putting my 2 cents worth in here. Imagebuilder building was broken for me when I tried it for building from the raspberry pi 4 "bcm2711" snapshot yesterday. I opened a github issue, not sure if that was the right place. Just hope a fix is found soon :slight_smile:. The Imagebuilder process ended with the following error and exited:
Configuring kernel.
make[2]: *** [Makefile:161: package_install] Segmentation fault (core dumped)
https://github.com/openwrt/packages/issues/13936

????
AFAIK, design has not changed.

There is a bug related to the package kernel symvers handling, like I diagnosed above.

There is now a proposed patch for the issue, but it has not yet been merged:

@aparcar could probably explain it better as he made the change... his reasoning was to save space? or something to do with a more dynamic... 'consistent' package set via kmods over the internet...?

Not quite sure what you mean, as kmod checksums per build prevent any "dynamic" content.

(But as kmods are currently stored per build on the download server, it might be possible to drop kmods from the ImageBuilder archive and download them on the fly when ImageBuilder is used. That might be the kind of change you are referring to.)

But the current problem is related at least to the symvers file handling, which is a much more recent change.

That has been broken for a week, but now there is a proposed patch for it.
