Mvebu testing kernel failure from 6.6 to 6.6

Looking for some guidance: I've run into a build error that has me stumped while updating from a testing kernel 6.6.x build of a month ago to the latest 6.6.29.

I run two testing kernel builds. One for an X86/64 target and another on an Mvebu target - identical build structure (sort of a failsafe measure if one or the other device has issues/fails).

I do a make targetclean before each build. The X86/64 target from 4/29 builds fine, but the mvebu target of 4/30 fails at the build completion.

Summary of make -j1 V=sc

make -f ./scripts/Makefile.vmlinux_o
make -f ./scripts/Makefile.modpost
make -f ./scripts/Makefile.vmlinux
make[5]: Leaving directory '/home/user/3-Development/openwrt/build_dir/target-arm_cortex-a9+vfpv3-d16_musl_eabi/linux-mvebu_cortexa9/linux-6.6.29'
find /home/user/3-Development/openwrt/build_dir/target-arm_cortex-a9+vfpv3-d16_musl_eabi/linux-mvebu_cortexa9/linux-6.6.29 /home/user/3-Development/openwrt/staging_dir/target-arm_cortex-a9+vfpv3-d16_musl_eabi/root-mvebu/lib/modules -name *.ko | xargs arm-openwrt-linux-muslgnueabi-nm | awk '$1 == "U" { print $2 } ' | sort -u > /home/user/3-Development/openwrt/build_dir/target-arm_cortex-a9+vfpv3-d16_musl_eabi/linux-mvebu_cortexa9/mod_symtab.txt
arm-openwrt-linux-muslgnueabi-nm -n /home/user/3-Development/openwrt/build_dir/target-arm_cortex-a9+vfpv3-d16_musl_eabi/linux-mvebu_cortexa9/linux-6.6.29/vmlinux.o | awk '/^[0-9a-f]+ [rR] __ksymtab_/ {print substr($3,11)}' > /home/user/3-Development/openwrt/build_dir/target-arm_cortex-a9+vfpv3-d16_musl_eabi/linux-mvebu_cortexa9/kernel_symtab.txt
grep -Ff /home/user/3-Development/openwrt/build_dir/target-arm_cortex-a9+vfpv3-d16_musl_eabi/linux-mvebu_cortexa9/mod_symtab.txt /home/user/3-Development/openwrt/build_dir/target-arm_cortex-a9+vfpv3-d16_musl_eabi/linux-mvebu_cortexa9/kernel_symtab.txt > /home/user/3-Development/openwrt/build_dir/target-arm_cortex-a9+vfpv3-d16_musl_eabi/linux-mvebu_cortexa9/sym_include.txt
make[4]: *** [Makefile:24: /home/user/3-Development/openwrt/build_dir/target-arm_cortex-a9+vfpv3-d16_musl_eabi/linux-mvebu_cortexa9/symtab.h] Error 1
make[4]: Leaving directory '/home/user/3-Development/openwrt/target/linux/mvebu'
make[3]: *** [Makefile:11: install] Error 2
make[3]: Leaving directory '/home/user/3-Development/openwrt/target/linux'
time: target/linux/install#3.01#0.72#3.76
ERROR: target/linux failed to build.
make[2]: *** [target/Makefile:32: target/linux/install] Error 1
make[2]: Leaving directory '/home/user/3-Development/openwrt'
make[1]: *** [target/Makefile:26: /home/user/3-Development/openwrt/staging_dir/target-arm_cortex-a9+vfpv3-d16_musl_eabi/stamp/.target_install] Error 2
make[1]: Leaving directory '/home/user/3-Development/openwrt'
make: *** [/home/user/3-Development/openwrt/include/toplevel.mk:233: world] Error 2
user@Vostro-7620:~/3-Development/openwrt$

My bin/targets/mvebu/cortexa9 gets populated with everything up to the .bin and .img files.

I’m an opcode guy and my C skills are next to none, but this appears to be in the Linux kernel itself. Simply deselecting the "use testing kernel" option in make nconfig lets the build complete for kernel 6.1.



That target does not clean up tmp/. Maybe try a dirclean, or as a shortcut:
rm -rf bin/ tmp/
You might also have to remove the toolchain directory if the above is not enough.

mvebu is building fine for me. By way of an FYI, targeting vfpv3 rather than vfpv3-d16 would be the optimum architecture choice.


I'm on 6.6.29 as well with mvebu, but I do a rather frequent make clean on main, since things are in flux. Easier to do that and rebuild your staging_dir than to start looking for what exactly is breaking why; in the end it often does need a thorough clean and recompile anyway :upside_down_face:.

Edit: only a make dirclean will wipe the staging_dir AFAIK, sorry.

I use rm -rf bin build_dir tmp for my basic clean, and recently added make targetclean to make sure the toolchain gets wiped. I’ll do a dirclean instead on my next attempt. Thanks.

Thanks again. Never thought of that. I’ll change cpu type as well.

Yes, I got caught recently because my toolchain was stale.

Thanks to both you gents. I didn’t see any reported issues and just a normal rm -rf bin build_dir tmp and pointing away from the testing kernel really didn’t make sense when it built fine. I’ll see what tomorrow brings. :grin:

After following your advice and doing a thorough clean of my build environment, I ended up back in exactly the same state of failure. :disappointed:

The build fails before it can assemble kernel.bin, sysupgrade.bin, factory.img, and manifest files. But it does provide the packages dir as well as the .buildinfo files before it fails. And again removing the use testing kernel option provides a full 6.1 working build.

Desperate options ensued, and I blew away the environment and did a git clone, copied in my backup .config from my first 6.6 build and got exactly the same behaviour. :disappointed: X2+

On a whim I ran ./scripts/diffconfig.sh > configdiff. It failed at line 32 of scripts/kconfig.pl. After I swapped in my last working 6.1 backup config, diffconfig.sh ran successfully, and enabling the "use testing kernel" option finally produced a 6.6 build.

I still don’t have a clue why my .config that originally produced a working 6.6 build now fails miserably to produce an updated 6.6 (but still works to produce a working 6.1 build), but I guess I’ll close this as ? ? ? .

@anomeome

I figured I could just mod CONFIG_TARGET_ARCH_PACKAGES="arm_cortex-a9_vfpv3-d16" CONFIG_CPU_TYPE="cortex-a9+vfpv3-d16"
but the build system just reverts back. Can you throw an old dog a bone?

The patch I use:

build.patch
diff --git a/target/linux/mvebu/cortexa9/target.mk b/target/linux/mvebu/cortexa9/target.mk
index 02697fa62d..dd70acf1aa 100644
--- a/target/linux/mvebu/cortexa9/target.mk
+++ b/target/linux/mvebu/cortexa9/target.mk
@@ -7,5 +7,5 @@ include $(TOPDIR)/rules.mk
 ARCH:=arm
 BOARDNAME:=Marvell Armada 37x/38x/XP
 CPU_TYPE:=cortex-a9
-CPU_SUBTYPE:=vfpv3-d16
+CPU_SUBTYPE:=vfpv3
 KERNELNAME:=zImage dtbs

will probably require a make dirclean for a clean getgo.

I’m feeling more than a little foolish. I remember looking at your g drive a while back when @Borromini had a PR to move to 6.6 and seeing this. Either way it’s still a very nice bone.

I have this exact error, and unfortunately restoring an old diffconfig that works with 6.1 doesn't fix the issue. Now that 6.6 is "official" it seems I need to dig into this.

The exact failure mechanism is null output from this command:

arm-openwrt-linux-muslgnueabi-nm -n /home/ia/git/openwrt/build_dir/target-arm_cortex-a9+vfpv3_musl_eabi/linux-mvebu_cortexa9/linux-6.6.30/vmlinux.o | awk '/^[0-9a-f]+ [rR] __ksymtab_/ {print substr($3,11)}'

My awk-fu is lacking. There are a few occurrences of __ksymtab_gpl in the vmlinux.o file that would seem to be related, but what awk is looking for and why it doesn't find it is a mystery to me.
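For anyone else puzzling over that awk step, here is a minimal sketch of what it seems to do. The nm output below is made up for illustration (not taken from a real vmlinux.o), and extract_exports is just a hypothetical wrapper name. The pattern keeps lines whose symbol type is r or R and whose name starts with __ksymtab_, and substr($3,11) drops that 10-character prefix. The second half shows why an empty result would likely break the build: grep exits with status 1 when it selects no lines, which would match the "Error 1" reported right after the grep step in the log.

```shell
#!/bin/sh
# Hypothetical `nm -n` output (address, type, name); not from a real build.
sample='00000000 r __ksymtab_kfree
0000000c R __ksymtab_printk
00000018 T printk
00000024 r __kstrtab_kfree'

# Same awk program as the build step: keep lines whose type is r or R and
# whose name starts with __ksymtab_, then strip that 10-character prefix.
extract_exports() {
    awk '/^[0-9a-f]+ [rR] __ksymtab_/ {print substr($3,11)}'
}

printf '%s\n' "$sample" | extract_exports

# With an empty export list, grep -F selects nothing and exits with status 1.
tmp=$(mktemp -d)
printf 'kfree\n' > "$tmp/mod_symtab.txt"
: > "$tmp/kernel_symtab.txt"
grep -Ff "$tmp/mod_symtab.txt" "$tmp/kernel_symtab.txt" > "$tmp/sym_include.txt"
status=$?
echo "grep exit status: $status"
rm -rf "$tmp"
```

So if vmlinux.o has no symbols named __ksymtab_<name> with type r or R, kernel_symtab.txt ends up empty and the following grep returns non-zero.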

Fixed, for me, as of this commit.

Curious how my kernel config options differ making this matter. I did a cursory look and the only real difference that stands out is this: CONFIG_COLLECT_KERNEL_DEBUG=y (mine is not set)

CONFIG_STRIP_KERNEL_EXPORTS=y

borked things with 6.6.x builds, removed until resolved.
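To compare how options like these are set between two configs, a small sketch along these lines may help. config_state is a hypothetical helper, and the sample file is a stand-in for a real .config. It relies only on the usual Kconfig convention that an enabled option appears as OPTION=value and a disabled one as "# OPTION is not set".

```shell
#!/bin/sh
# config_state FILE OPTION: report how OPTION appears in a Kconfig-style file.
config_state() {
    if grep -q "^$2=" "$1"; then
        grep "^$2=" "$1"
    elif grep -q "^# $2 is not set" "$1"; then
        echo "$2 is not set"
    else
        echo "$2 is absent"
    fi
}

# Hypothetical sample file standing in for a real .config
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
CONFIG_STRIP_KERNEL_EXPORTS=y
# CONFIG_COLLECT_KERNEL_DEBUG is not set
EOF

config_state "$cfg" CONFIG_STRIP_KERNEL_EXPORTS
config_state "$cfg" CONFIG_COLLECT_KERNEL_DEBUG
rm -f "$cfg"
```

Pointing the helper at the real .config in the top of the build tree instead of the sample file should show whether either option is in play for a given build.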