Occasional unexplained meltdown during OpenWrt build

I have for some time been experiencing a total meltdown that occurs during an OpenWrt build. About one in ten times, I'd estimate, a build of OpenWrt which is interrupted for one reason or another, when restarted will decide to commit suicide. Today I actually caught it while it was happening:

That was during a make -j 6 world. It just decided to erase the entire toolchain, staging, and build dirs.

So, what happens is usually this:

  1. make -j $(nproc) world
  2. Some build stoppage or error
  3. Diagnose / correct build error
  4. make -j $(nproc) world
  5. meltdown

I have not noticed anything consistent in what I do in step 3 that causes it. It might be editing a makefile, or making a tweak I've done five times before, but this time the subsequent make world just causes the build tree to become fundamentally fed up with life.

The funny thing is this: it happens often enough that during a critical build, before I restart after a stoppage, I will sometimes clone the build tree with cp -a openwrt openwrt_bak right before restarting. And I have seen cases where it melts down, I restore from backup, run the exact same make -j $(nproc) world, and it works fine.
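The backup-before-restart habit described above could be wrapped in a couple of small shell helpers. This is only a sketch: snapshot_tree and missing_dirs are made-up names, and it assumes that staging_dir and build_dir at the top of the tree are the directories that get wiped.

```shell
#!/bin/sh
# Hypothetical helper: copy TREE to TREE_bak, mirroring "cp -a openwrt openwrt_bak".
snapshot_tree() {
    tree=$1
    backup="${tree}_bak"
    [ -d "$tree" ] || { echo "no such tree: $tree" >&2; return 1; }
    # Remove any stale backup first; otherwise cp -a would nest the
    # new copy inside the existing backup directory.
    rm -rf "$backup"
    cp -a "$tree" "$backup"
}

# Hypothetical helper: print the name of each expected top-level
# directory that is missing from TREE (prints nothing if all exist).
missing_dirs() {
    tree=$1
    for d in staging_dir build_dir; do
        [ -d "$tree/$d" ] || echo "$d"
    done
}
```

Something like snapshot_tree openwrt before step 4 and missing_dirs openwrt afterwards would at least show whether the tree was wiped, without having to catch it live.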

Has anyone else noticed this behaviour?

No, I have never seen this, but I stopped using -j a long time ago since it often causes problems with download errors in one way or another.

Did you run make clean before #1 and between #3 and #4?

What kind of computer do you use to build with, and has it been upgraded?

No.

I always start a build with make -j $(($(nproc)+1)), but FWIW I always restart after any abend with make -j1 V=sc and let it run through to completion.
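That parallel-first, serial-on-failure routine can be written as one shell function. Treat it as a rough sketch: build_with_fallback is a made-up name, and the MAKE override is only there so the logic can be tried without a real build tree.

```shell
#!/bin/sh
# Hypothetical helper: try a parallel build, then fall back to a
# serial, verbose build (-j1 V=sc) if the parallel run fails.
build_with_fallback() {
    njobs=$(( $(nproc) + 1 ))
    # MAKE defaults to plain "make"; overriding it is an assumption
    # made purely for testing.
    "${MAKE:-make}" -j"$njobs" "$@" && return 0
    echo "parallel build failed; retrying with -j1 V=sc" >&2
    "${MAKE:-make}" -j1 V=sc "$@"
}
```

Usage would be build_with_fallback world from the top of the tree. The serial rerun is slower, but its output is not interleaved, so the first real error is easier to find.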

Never noticed it either. Is this running on some kind of network mount? Are the filesystem timestamps stable?
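A quick way to sanity-check the timestamp question is to write a file on the mount and compare its mtime with the local clock. The sketch below assumes GNU stat (the -c %Y option) and a date that supports +%s; mtime_skew is a made-up name. make decides what to rebuild from mtimes, so a large skew would be suspicious, though a result near zero does not rule out other causes.

```shell
#!/bin/sh
# Hypothetical helper: print (mtime of a freshly created file) minus
# (local clock), in seconds, for the directory given as $1.
mtime_skew() {
    dir=${1:-.}
    probe=$(mktemp "$dir/.skew.XXXXXX") || return 1
    now=$(date +%s)
    mtime=$(stat -c %Y "$probe")   # GNU stat: mtime as seconds since epoch
    rm -f "$probe"
    echo $(( mtime - now ))
}
```

Running mtime_skew openwrt should print a value within a second or two of zero on a healthy mount.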

I've only built from source once or twice when I needed to build a newer dnsmasq-full for 22.03, but I did notice that with -j $(($(nproc)+1)) it's prone to errors during build. I thought it may have been something on my end so I never bothered to take extra notes/exact steps to duplicate.

There shouldn't be any generic, widespread problem with parallel builds. Building with -j4 normally works fine for me, with thousands of builds done in the last 13 years :wink:

(I always start a build from a clean slate, after make clean.)

Hannu, I've clarified my earlier statement. Haven't tried with -j4. Will try it next time building from sources.

Yes, but it's an NBD mount. There should be no timestamp issues, since it just appears in all respects as a full ext4 filesystem. Any network issues would be very evident in the kernel ring buffer log.

Before #1 it's generally starting fresh.
Between #3 and #4, no. I am doing a full all-package build, and with master there is almost inevitably one or two packages that are currently failing. I don't like to build with -i because then I don't see if there is a really necessary package that fails.

If I did a "make clean" every time a package failed and I had to take remedial action, I'd never finish a build. That would be another full day of building each time.

The PC itself is slightly older, but was top of the line. Alienware M17 R2. It had a professional repaste and runs quite cool. It's running Linux Mint 21.1 (Ubuntu 22.04).

Why? No device will ever use all packages anyway, because no device has the memory for it, except x86/64 devices.

And the bin folder will end up empty if the image you build is bigger than the chosen device's available memory, since that image would be corrupt anyway if you tried installing it.

And you can only use the packages with the build they were built for. The next build needs new packages.

But after some time using OpenWrt, you should have a pretty good understanding of which packages you use and need for the build.