Building OpenWrt with 16 parallel jobs

Hello,
I have an AMD Ryzen with 8 cores / 16 threads.
The problem is that a build with 8 jobs works, but a build with 16 jobs fails.
How can I fix this?

It fails in a different place every time, right?

Try sequencing the targets:

make -j16 clean download world 

Might be something else, but I know -j12 works successfully on a 12-core CPU.

I'll try make download first. I had left that out, so the sources were being fetched during the build. Now I run make download and then make. I hope that will work.

Does world mean it builds everything?

Yes, it's the canonical target that gets built if you just execute make.

I often found that make download would not fetch all the required sources, and the build would fail partway through. Just going online again and running make again would download the missing sources, and the build would continue. Maybe the problem was that I used parallel jobs for the download too...

Maybe run make download twice? Would that be worth doing?

I tried that at least once too, and no additional sources were downloaded on the second attempt.
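For anyone who wants to automate the "run it again" approach, here is a rough sketch of a retry wrapper. The name retry is made up for illustration, and it only helps if make download exits non-zero when a fetch fails, which is an assumption not confirmed in this thread.

```shell
# Hypothetical helper: run a command up to N times, stopping at the first success.
# Usage: retry <attempts> <command> [args...]
retry() {
    attempts="$1"
    shift
    i=1
    while [ "$i" -le "$attempts" ]; do
        # Run the command as given; a zero exit status ends the loop.
        "$@" && return 0
        echo "attempt $i of $attempts failed: $*" >&2
        i=$((i + 1))
    done
    return 1
}

# Example: retry 3 make download
```

If every attempt fails, retry returns 1, so it can be chained with && before the actual build.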

I have to say AMD is less stable than Intel for this. On Intel I never even used make download and it was perfect.
On AMD there are always failures, restarted jobs, etc. On Intel I run it once and it is close to perfect: only about 10-15 failed packages (plus about another 15 Node.js packages, but that is not a problem). With AMD I get even more failed packages, I sometimes have to restart the jobs, and I ran make download twice!

Not sure if it is because I use more threads, but that should not cause failures.

I've been using

$ make -j $((1+`nproc`))

which has been working quite well for my recent build attempts on my i7-4790K (4 cores, 8 threads), so it runs 9 jobs in my case. While researching optimal job numbers I saw several people around the web recommending number_of_cpu_threads + 1. I also tried the 1.5 * threads that was mentioned a few times, but it bogged the system down way too much. I did spin up a crazy high-powered AWS instance for an hour or so earlier to try some builds with ~72 cores or something, and it worked, so the 16 cores certainly shouldn't be an issue?
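The threads-plus-one rule can be wrapped in a tiny helper so the number is easy to check before handing it to make. This is only a sketch; job_count is a made-up name, and the optional argument exists just so the arithmetic can be tried without depending on the local CPU.

```shell
# Hypothetical helper: print the thread count plus one.
# An optional first argument overrides the thread count reported by nproc.
job_count() {
    threads="${1:-$(nproc)}"
    echo $((threads + 1))
}

# Example: make -j "$(job_count)"
```

On the 8-thread machine described above this gives 9, and on a 16-thread Ryzen it gives 17.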

The big reason I've found to run make download before make -jX is that if I download with parallel jobs, it seems to forget how to download packages after a few minutes in the middle of a build (while I can still ssh/http to the machine), tries to reference packages before they're built, and eventually times out. Anyway, I've been running pretty much the commands below from a clean checkout of openwrt/openwrt.git (or lede-project/source.git; I don't know if it makes a big difference), with a .config based on https://dc502wrt.org/releases/config.seed for my WRT3200ACM, with lots of the stuff I don't use disabled and a few other things added.

$ cd $SRC_ROOT && git pull   # I've been playing with building against lede:master/HEAD lately so yeah
$ cp ~/config.seed $SRC_ROOT/.config && make defconfig  # restoring my saved device customizations
$ ./scripts/feeds update -a && ./scripts/feeds install -a  # update/installs feeds
$ make menuconfig  && scripts/diffconfig.sh > $HOME/config.seed  # last minute changes, and make sure target is setup/numbered properly, and backup config
$ make download  # download configured packages/modules single thread so it doesn't timeout or go into race conditions
$ make -d V=sc -j $((1+`nproc`)) 2>&1 1>> build.log | tee -a build.log  # build verbose with lots of jobs
          # and append stdout+stderr to build.log (append mode on both writers so neither overwrites the other) and print just stderr messages to screen
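The redirection in the last command is easy to get wrong, so here is a small stand-alone sketch of the same idea that can be tried without a build tree. The names noisy and log_both_show_stderr are made up, and opening the log in append mode on both sides is a choice made here so the two writers do not clobber each other.

```shell
# Hypothetical stand-in for a noisy build: one line on stdout, one on stderr.
noisy() {
    echo "normal output"
    echo "error output" >&2
}

# Log stdout and stderr to one file, but show only stderr on the screen.
# Order matters: 2>&1 first points stderr at the pipe, then 1>> moves
# stdout to the log file. tee -a appends stderr to the log and echoes it.
log_both_show_stderr() {
    logfile="$1"
    shift
    "$@" 2>&1 1>>"$logfile" | tee -a "$logfile"
}

# Example: log_both_show_stderr build.log noisy
```

Note that the exit status of such a pipeline is the status of tee, not of the command being logged, unless the shell supports and enables pipefail.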

I was having a ton of trouble yesterday getting a build working. After a bit of trial and error and lots of research, the above commands seem to be pretty much a working example build process for lede-project/source master/HEAD. Hope someone finds it useful as a reference or something.

Also, BTW: if you're making lots of builds all the time, check out ccache. It sort of saves a cache of the build, so even after a make distclean recently built packages are still available more quickly.

OK, I restarted the build; the download will be serial and only the build will use parallel jobs. Thanks for the info.

Yes, I can see that serial download is safer. There were many timeouts and retries, but in the end it downloaded, for example, fontconfig, which it was not able to download with 16 parallel jobs.

The only problem is that it takes 1 hour to download all packages, but now I play it safe.

Well, the download took 35 minutes. Now let's see what 16 parallel jobs can do.

This AMD is not stable. I'll try a build on an Intel 7700K; I am sure it will work the first time.
AMD is good with many cores, but for some reason it is not stable.

Slow down and collect your thoughts. There's no need to shoot out multiple one-sentence posts with little info on a slow forum like this.

It takes a while for them to download, but once they are downloaded they should stay for the next build, even after a make clean I believe (?). They would get removed by make distclean, though; make dirclean should leave the dl directory alone, as far as I know. It's been taking me under 5 minutes to get/check the downloads, and a solid 3-6+ hours to build the image, but I'm still trying to tweak and optimize and stuff.

Try it with the 16 jobs, but know it still might fail. Read what it says and turn up the verbosity. Run make with make -d V=sc ... | tee build.log and it should show you which package and which line failed, and you can troubleshoot further from there. Maybe you're missing a dependency, maybe it's an OpenWrt bug, maybe it's a kernel bug. You haven't really given any explanation of what is failing, no error messages or logs or anything, so we can't really help you lol. Like, 'amd not working' doesn't help anybody help you.
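Once there is a build.log, the failing targets can be pulled out with grep. A minimal sketch, relying on the "make[N]: *** [target] Error K" lines that GNU make prints on failure; failed_targets is a made-up name.

```shell
# Hypothetical helper: list log lines where make reported a failed target,
# prefixed with their line numbers in the log.
failed_targets() {
    grep -nE '^make(\[[0-9]+\])?: \*\*\* ' "$1"
}

# Example: failed_targets build.log | head -n 1
```

The first match is often the one closest to the real cause, since the later ones tend to be parent makes reporting the same failure on the way up. The compiler or script error itself usually sits a few lines above that first match.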

I've always found builds fail more often in dirty environments. When in doubt, back up your .config, run a distclean or reclone the repo, and try fresh. Maybe a half-built package is causing trouble. I had an issue this morning where the filesystem got corrupted for a few files and I couldn't remove them, so the builds were failing. I mean, it could be anything, not even AMD related lol. Maybe try checking out a known stable version instead of master, one that you know should work, to make sure your build environment is working first?

I always run clean, since I build in a Docker container and start fresh from there. I am still testing; maybe it is a kernel problem. On my 7700K (Debian buster/testing) about 10-15 packages fail to build, and I don't even need those, but on AMD, with the same Docker container, the build fails totally and exits. That is why I think something is up with the AMD Ryzen 1700 laptop I have. I was just hoping it would build faster than 4 hours with 16 threads. The Intel 7700K with 8 threads is actually awesome: I can build everything in 4 to 4.5 hours, in one run, with no crucial errors. And there I use 9 jobs instead of 8.
Just testing. Parallel is a little picky.

You can also run make with the --keep-going argument, which will try to continue past errors where it can. That would get the bulk of the compilation done, and the few packages that failed could then be focused on, updated, or removed if not even needed.

There are concurrent failures on Intel as well. When a concurrent build fails, I see the same error every time: "The present kernel configuration has modules disabled." A single-threaded build seems to always work.

*
* Restart config...
*
*
* File systems
*
Second extended fs support (EXT2_FS) [N/m/y/?] n
The Extended 3 (ext3) filesystem (EXT3_FS) [N/m/y/?] n
The Extended 4 (ext4) filesystem (EXT4_FS) [Y/n/m/?] y
  Use ext4 for ext2 file systems (EXT4_USE_FOR_EXT2) [Y/n/?] y
  Ext4 POSIX Access Control Lists (EXT4_FS_POSIX_ACL) [N/y/?] n
  Ext4 Security Labels (EXT4_FS_SECURITY) [N/y/?] n
  Ext4 Encryption (EXT4_ENCRYPTION) [N/y/?] n
  EXT4 debugging support (EXT4_DEBUG) [N/y/?] n
JBD2 (ext4) debugging support (JBD2_DEBUG) [N/y/?] n
Reiserfs support (REISERFS_FS) [N/m/y/?] n
JFS filesystem support (JFS_FS) [N/m/y/?] n
XFS filesystem support (XFS_FS) [N/m/y/?] n
GFS2 file system support (GFS2_FS) [N/m/y/?] n
Btrfs filesystem support (BTRFS_FS) [N/m/y/?] n
NILFS2 file system support (NILFS2_FS) [N/m/y/?] n
F2FS filesystem support (F2FS_FS) [Y/n/m/?] y
  F2FS Status Information (F2FS_STAT_FS) [Y/n/?] (NEW) aborted!

Console input/output is redirected. Run 'make oldconfig' to update configuration.

scripts/kconfig/Makefile:38: recipe for target 'silentoldconfig' failed
make[7]: *** [silentoldconfig] Error 1
Makefile:525: recipe for target 'silentoldconfig' failed
make[6]: *** [silentoldconfig] Error 2

The present kernel configuration has modules disabled.
Type 'make config' and enable loadable module support.
Then build a kernel with module support enabled.

Makefile:1293: recipe for target 'modules' failed
make[5]: *** [modules] Error 1
Makefile:21: recipe for target '/build/19.07/build_dir/target-arm_cortex-a9+vfpv3_musl_eabi/linux-mvebu_cortexa9/linux-4.14.131/.modules' failed
make[4]: *** [/build/19.07/build_dir/target-arm_cortex-a9+vfpv3_musl_eabi/linux-mvebu_cortexa9/linux-4.14.131/.modules] Error 2
Makefile:13: recipe for target 'compile' failed
make[3]: *** [compile] Error 2
time: target/linux/compile#3.21#0.93#24.98
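Since a single-threaded build seems to always work, one workaround is to try the parallel build first and fall back to a serial, verbose run only when it fails. A rough sketch; build_with_fallback is a made-up name, and the two commands are passed in as strings so the wrapper can be tried with harmless commands first.

```shell
# Hypothetical wrapper: run the first command; if it fails, run the second.
# Each argument is a complete command line, executed via sh -c.
build_with_fallback() {
    parallel_cmd="$1"
    serial_cmd="$2"
    if sh -c "$parallel_cmd"; then
        return 0
    fi
    echo "parallel build failed, retrying with: $serial_cmd" >&2
    sh -c "$serial_cmd"
}

# Example: build_with_fallback 'make -j16' 'make -j1 V=sc'
```

The serial rerun picks up where the parallel one stopped, so most of the time saved by the parallel jobs is kept, and the verbose output of the second run is easier to read because nothing is interleaved.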