How to find out why OpenWrt on Proxmox crashes

Hello,
I run OpenWrt on Proxmox. That setup ran great for many years, but the snapshots that I build have been crashing daily for around two months. Where do I find logs or anything else that would give me a clue about why OpenWrt crashed?
OpenWrt restarts itself and then works flawlessly until it crashes again.

Logs are not stored permanently. You can log to a remote syslog server (on the same Proxmox host).
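
A minimal sketch of what remote logging could look like on the OpenWrt side, assuming a syslog server is already listening at 192.168.1.10 on UDP port 514 (both values are placeholders, not taken from this thread):

```shell
# Run on the OpenWrt VM. The address and port are assumptions;
# replace them with those of your own syslog server.
uci set system.@system[0].log_ip='192.168.1.10'
uci set system.@system[0].log_port='514'
uci set system.@system[0].log_proto='udp'
uci commit system

# Restart the logging service so the new settings take effect.
/etc/init.d/log restart
```

With this in place, messages logged just before a crash should survive on the remote side even though the VM's own log buffer is lost on reboot.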

Just a shot in the dark (after all, crystal balls are still held up by supply chain issues), but did you assign enough RAM to the OpenWrt VM? I would not go below 256 MB for a production system on x86_64, and more if demanding packages are installed.

The VM has 2 cores and 2 GB RAM, of which only 400 MB are used. Used disk space is 250 MB of 1000 MB, so the reason should not be insufficient resources.

Try using stable first to see if it gets better?

Stable 23.05.3 on Proxmox has been running fine for me. It is a simple setup. I could try the snapshot if the devs would like. Just let me know what kind of settings you would like me to use.

You might try SSHing into your OpenWrt and running "logread -f". Make sure you can still scroll back on that VM once it crashes. You might see something just before the network goes down.
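
One way to keep the scrollback even when the SSH session dies with the VM is to copy the stream into a local file. This is only a sketch: the address 192.168.1.1 (the OpenWrt default LAN address) and the file name are assumptions.

```shell
# Run on another machine, not on the OpenWrt VM itself.
# tee prints the log live and also appends it to a local file,
# so the last lines before a crash remain readable afterwards.
ssh root@192.168.1.1 'logread -f' | tee -a openwrt-crash.log
```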

Good luck

You are OK; BigG should bisect his diffconfig toward the breaking end.

OK, so I caught the crash/reboot in the console. The screen just goes black, then OpenWrt boots again; there is no BIOS screen and no GRUB while that happens. Nothing gets written while logread -f is running, and nothing is written to syslog either.

I don't know what bisect means. If you can link me to an explanation or explain it briefly, I can provide you with what you want.

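Check the output of diffconfig.sh, enable only half of the options you changed, and build. If that image crashes, the culprit is in that half, so halve it again; if it doesn't, test the other half. Keep going until you have drilled down to the exact option that makes the kernel crash.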
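
A rough sketch of one bisection step, assuming the diffconfig was saved with `./scripts/diffconfig.sh > diffconfig`. The helper name `split_diffconfig` is made up for illustration. It keeps every `CONFIG_TARGET_` line in both halves, on the assumption that those lines select the x86/64 target and must be present in every test build.

```shell
# split_diffconfig FILE: write FILE.a and FILE.b, each holding all
# CONFIG_TARGET_ lines plus one half of the remaining options.
# (Hypothetical helper, not part of the OpenWrt build system.)
split_diffconfig() {
    src=$1
    # Target selection stays in both halves; without it, "make defconfig"
    # would fall back to the default target instead of x86/64.
    grep '^CONFIG_TARGET_' "$src" > "$src.target" || true
    grep -v '^CONFIG_TARGET_' "$src" > "$src.rest" || true
    total=$(wc -l < "$src.rest")
    half=$(( (total + 1) / 2 ))
    { cat "$src.target"; head -n "$half" "$src.rest"; } > "$src.a"
    { cat "$src.target"; tail -n +"$((half + 1))" "$src.rest"; } > "$src.b"
    rm -f "$src.target" "$src.rest"
}

# Typical use inside the OpenWrt source tree:
#   ./scripts/diffconfig.sh > diffconfig
#   split_diffconfig diffconfig
#   cp diffconfig.a .config && make defconfig && make
```

Options that depend on each other may end up in different halves, so a half that fails to build or boot needs a closer look before drawing conclusions.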

That confuses me. If I do that, I would build for the default architecture, which is not x86, so I would build an image I couldn't even boot?

Last time I checked, Proxmox did not support anything other than x86/64.

That's a new one: bisecting the packages included in an image to find what is borked. It seems rather untenable as a solution; you end up with a bunch of images that have no chance of running successfully.

Edit: yes, if checking commits, you are talking about a diffconfig...

It takes log2(changed_options) builds to find each flipped switch.
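
To put a number on that estimate, here is a small sketch that rounds log2 up to whole builds. The function name `bisect_rounds` is made up, and it assumes a single culprit option.

```shell
# bisect_rounds N: print the number of halving steps needed to
# isolate one culprit among N changed options (ceil(log2(N))).
bisect_rounds() {
    n=$1
    rounds=0
    span=1
    # Double the covered span until it reaches N; each doubling
    # corresponds to one bisection build.
    while [ "$span" -lt "$n" ]; do
        span=$((span * 2))
        rounds=$((rounds + 1))
    done
    echo "$rounds"
}

bisect_rounds 64    # prints 6
```

So even a diffconfig with a hundred changed options would need about seven builds per culprit.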

I finally caught the crash with logs.

Thu Jun 27 16:14:05 2024 kern.emerg kernel: [19763.083419] skbuff: skb_under_panic: text:ffffffff81adb279 len:389 put:82 head:ffff888025dde000 data:ffff888025dddfe4 tail:0x169 end:0x4c0 dev:eth5
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.086644] ------------[ cut here ]------------
Thu Jun 27 16:14:05 2024 kern.crit kernel: [19763.088102] Kernel BUG at skb_panic+0x4a/0x50 [verbose debug info unavailable]
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.090060] invalid opcode: 0000 [#1] SMP PTI
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.091412] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G           O       6.6.35 #0
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.093456] Hardware name: Hardkernel Odroid H2+, BIOS 4.2023.08-4 02/15/2024
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.095349] RIP: 0010:skb_panic+0x4a/0x50
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.096627] Code: 48 70 57 8b b8 b8 00 00 00 57 8b b8 b4 00 00 00 57 48 c7 c7 c0 9e 50 82 ff b0 c8 00 00 00 4c 8b 88 c0 00 00 00 e8 16 e4 73 ff <0f> 0b 90 0f 1f 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.101308] RSP: 0018:ffffc90000003590 EFLAGS: 00010292
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.102843] RAX: 0000000000000087 RBX: ffff88800b736d00 RCX: 0000000000000000
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.104774] RDX: ffff88807c01f2a0 RSI: ffff88807c01d580 RDI: ffff88807c01d580
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.106686] RBP: ffffc900000035b0 R08: 0000000000000000 R09: ffffc90000003430
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.108601] R10: 0000000000000003 R11: ffffffff826aef68 R12: ffff88800b736300
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.110508] R13: ffff88800b736d00 R14: 0000000000000000 R15: 0000000000000000
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.112420] FS:  0000000000000000(0000) GS:ffff88807c000000(0000) knlGS:0000000000000000
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.114519] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.116131] CR2: 00007f885d5a8000 CR3: 000000000b458000 CR4: 0000000000350ef0
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.118022] Call Trace:
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.119005]  <IRQ>
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.119899]  ? show_regs+0x60/0x70
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.121067]  ? die+0x32/0x90
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.122127]  ? do_trap+0xf7/0x100
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.123275]  ? do_error_trap+0x6c/0x90
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.124505]  ? skb_panic+0x4a/0x50
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.125663]  ? exc_invalid_op+0x4f/0x70
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.126912]  ? skb_panic+0x4a/0x50
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.128078]  ? asm_exc_invalid_op+0x1b/0x20
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.129392]  ? skb_panic+0x4a/0x50
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.130549]  skb_segment_list+0x40c/0x470
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.131823]  __udp_gso_segment+0x359/0x570
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.133125]  udp4_ufo_fragment+0x13a/0x180
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.134423]  inet_gso_segment+0x147/0x3a0
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.135687]  ip4ip6_gso_segment+0x29/0x30
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.136958]  ipv6_gso_segment+0x19c/0x570
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.138236]  skb_mac_gso_segment+0x82/0xe0
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.139516]  __skb_gso_segment+0xb0/0x160
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.140776]  validate_xmit_skb.isra.0+0x15a/0x2a0
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.142172]  validate_xmit_skb_list+0x41/0x70
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.143516]  sch_direct_xmit+0x146/0x280
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.144756]  __dev_queue_xmit+0x757/0xbc0
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.146009]  ip6_finish_output2+0x2b5/0x600
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.147321]  ? nf_ct_deliver_cached_events+0x69/0x90 [nf_conntrack]
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.149033]  ? nf_confirm+0x223/0x2a0 [nf_conntrack]
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.150501]  ip6_finish_output+0x122/0x370
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.151784]  ? nf_hook_slow+0x3c/0xc0
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.152988]  ip6_output+0x63/0x100
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.154142]  ? __pfx_ip6_finish_output+0x10/0x10
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.155546]  ip6_local_out+0x3e/0x60
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.156748]  ip6_tnl_xmit+0x72b/0x2290 [ip6_tunnel]
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.158175]  ip6_tnl_xmit+0xd18/0x2290 [ip6_tunnel]
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.159591]  dev_hard_start_xmit+0xa4/0x100
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.160858]  __dev_queue_xmit+0x1e4/0xbc0
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.162092]  ? nft_do_chain+0xf6/0x630 [nf_tables]
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.163480]  neigh_connected_output+0xc8/0xf0
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.164769]  ip_finish_output2+0x1b0/0x550
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.165998]  __ip_finish_output+0x9a/0x130
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.167229]  ip_finish_output+0x31/0xf0
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.168378]  ip_output+0x49/0xb0
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.169404]  ? __pfx_ip_finish_output+0x10/0x10
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.170672]  ip_forward_finish+0x92/0xb0
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.171827]  ip_forward+0x3ee/0x430
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.172890]  ? __pfx_ip_forward_finish+0x10/0x10
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.174165]  ip_rcv+0xd8/0xe0
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.175117]  ? __pfx_ip_rcv_finish+0x10/0x10
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.176326]  __netif_receive_skb_one_core+0x66/0x70
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.177639]  process_backlog+0x93/0x1b0
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.178749]  __napi_poll+0x29/0x170
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.179781]  net_rx_action+0x12e/0x260
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.180869]  handle_softirqs+0xc9/0x210
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.181978]  irq_exit_rcu+0x47/0x70
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.183018]  common_interrupt+0x86/0xa0
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.184128]  </IRQ>
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.184899]  <TASK>
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.185679]  asm_common_interrupt+0x27/0x40
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.186856] RIP: 0010:pv_native_safe_halt+0x13/0x20
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.188167] Code: 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 8b 05 aa da db 00 85 c0 7e 07 0f 00 2d df f2 40 00 fb f4 <c3> cc cc cc cc 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.192557] RSP: 0018:ffffffff82603e40 EFLAGS: 00000246
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.193971] RAX: 0000000000000000 RBX: ffffffff82608d80 RCX: 4000000000000000
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.195768] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000000000bddedf4
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.197569] RBP: ffffffff82603e48 R08: 0000000000000000 R09: 0000000000000000
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.199375] R10: 0000000000000000 R11: 0000000000089641 R12: 0000000000000000
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.201177] R13: 0000000000000000 R14: 0000000000000000 R15: ffff88807eee8080
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.202984]  ? default_idle+0x9/0x20
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.204115]  arch_cpu_idle+0x9/0x10
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.205220]  default_idle_call+0x23/0x40
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.206404]  do_idle+0x17a/0x180
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.207453]  cpu_startup_entry+0x25/0x30
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.208630]  rest_init+0xad/0xb0
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.209673]  arch_call_rest_init+0x9/0x30
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.210865]  start_kernel+0x49b/0x730
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.211986]  x86_64_start_reservations+0x18/0x30
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.213301]  x86_64_start_kernel+0x7e/0x80
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.214519]  secondary_startup_64_no_verify+0x178/0x17b
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.216955]  </TASK>
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.217796] Modules linked in: pppoe ppp_async nft_fib_inet nf_flow_table_inet wireguard pppox ppp_generic nft_reject_ipv6 nft_reject_ipv4 nft_reject_inet nft_reject nft_redir nft_quota nft_numgen nft_nat nft_masq nft_log nft_limit nft_hash nft_flow_offload nft_fib_ipv6 nft_fib_ipv4 nft_fib nft_ct nft_chain_nat nf_tables nf_nat nf_flow_table nf_conntrack_netlink nf_conntrack mlxsw_spectrum libchacha20poly1305 curve25519_x86_64 chacha_x86_64 cdc_ncm cdc_ether act_sample zstd usbnet tcp_bbr slhc r8152 r8125(O) psample poly1305_x86_64 parman objagg nfnetlink nf_reject_ipv6 nf_reject_ipv4 nf_log_syslog nf_defrag_ipv6 nf_defrag_ipv4 mlxsw_pci mlxsw_minimal mlxsw_i2c mlxsw_core mlxfw lzo_rle lzo libcurve25519_generic libchacha crc_ccitt sch_tbf sch_ingress sch_htb sch_hfsc em_u32 cls_u32 cls_route cls_matchall cls_fw cls_flow cls_basic act_skbedit act_mirred act_gact evdev i2c_dev dwmac_intel dwmac_generic stmmac_platform stmmac sctp ip6_tunnel tunnel6 ip_tunnel veth tun nls_utf8 pcs_xpcs vxlan udp_tunnel ip6_udp_tunnel
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.217898]  sha256_ssse3 sha256_generic libsha256 md5 hmac crypto_acompress nls_iso8859_1 nls_cp437 vfat fat btrfs zstd_decompress zstd_compress zstd_common xxhash xor raid6_pq lzo_decompress lzo_compress libcrc32c button_hotplug(O) phylink mii libphy
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.243717] ---[ end trace 0000000000000000 ]---
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.245209] RIP: 0010:skb_panic+0x4a/0x50
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.246584] Code: 48 70 57 8b b8 b8 00 00 00 57 8b b8 b4 00 00 00 57 48 c7 c7 c0 9e 50 82 ff b0 c8 00 00 00 4c 8b 88 c0 00 00 00 e8 16 e4 73 ff <0f> 0b 90 0f 1f 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.251520] RSP: 0018:ffffc90000003590 EFLAGS: 00010292
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.253172] RAX: 0000000000000087 RBX: ffff88800b736d00 RCX: 0000000000000000
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.255170] RDX: ffff88807c01f2a0 RSI: ffff88807c01d580 RDI: ffff88807c01d580
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.257168] RBP: ffffc900000035b0 R08: 0000000000000000 R09: ffffc90000003430
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.259232] R10: 0000000000000003 R11: ffffffff826aef68 R12: ffff88800b736300
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.261297] R13: ffff88800b736d00 R14: 0000000000000000 R15: 0000000000000000
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.263314] FS:  0000000000000000(0000) GS:ffff88807c000000(0000) knlGS:0000000000000000
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.265510] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Thu Jun 27 16:14:05 2024 kern.warn kernel: [19763.267264] CR2: 00007f885d5a8000 CR3: 000000000b458000 CR4: 0000000000350ef0

eth5 is a virtio NIC.
Timing-wise, the crashes started when OpenWrt moved to kernel 6.6 and I got a new modem, so I'm not sure which of the two is the cause. But since I can't control the modem's behaviour, I need to find a solution in OpenWrt anyway.
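
For reading the first line of that log: the kernel raises skb_under_panic when a push moves the packet's data pointer below the start of its buffer (head). The sketch below pulls head and data out of such a line and prints how far data ended up below head. The function name `skb_underrun` is made up, and comparing only the low 32 bits assumes the two addresses differ by a small offset.

```shell
# skb_underrun LINE: print how many bytes "data" lies below "head"
# in a skb_under_panic log line. (Hypothetical helper.)
skb_underrun() {
    line=$1
    head=$(printf '%s\n' "$line" | sed -n 's/.* head:\([0-9a-f]*\) .*/\1/p')
    data=$(printf '%s\n' "$line" | sed -n 's/.* data:\([0-9a-f]*\) .*/\1/p')
    # Full 64-bit kernel addresses can overflow shell arithmetic, so only
    # the last 8 hex digits are compared (assumes a small offset).
    head_lo=$(printf '%s' "$head" | tail -c 8)
    data_lo=$(printf '%s' "$data" | tail -c 8)
    echo $(( 0x$head_lo - 0x$data_lo ))
}

skb_underrun 'skb_under_panic: text:ffffffff81adb279 len:389 put:82 head:ffff888025dde000 data:ffff888025dddfe4 tail:0x169 end:0x4c0 dev:eth5'
# prints 28
```

Read together with put:82, this suggests the packet had roughly 54 bytes of headroom at a point where 82 were needed, and the call trace places that point in UDP GSO segmentation of traffic forwarded into the ip6_tunnel device.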

It looks like you are on an Odroid. Searching on Proxmox forums found this:
https://forums.servethehome.com/index.php?threads/jasper-lake-proxmox-kvm-qemu-vm-guest-stability.38824/.

I haven't read all of it but might be worth a look.

Thank you, I will try that.

Sadly, it looks like that doesn't apply to my setup. I'm on an Odroid H2+, so Intel J4115 and the newest BIOS.

You were right with the hint, but none of the fixes in that thread worked for me. I had to add intel_idle.max_cstate=1 to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub on the Proxmox host.
That solution is not very nice because it triples the power consumption of that system, so if somebody knows a better workaround, I'm happy to try that instead.
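
For anyone wanting to reproduce that workaround, the change could look roughly like this. The existing value "quiet" is an assumption; keep whatever parameters are already on that line.

```shell
# /etc/default/grub on the Proxmox host
# ("quiet" is an assumed pre-existing value, not taken from this thread)
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_idle.max_cstate=1"

# Then, as root on the host:
#   update-grub
#   reboot
# After the reboot, check that the parameter reached the kernel:
#   cat /proc/cmdline
#   cat /sys/module/intel_idle/parameters/max_cstate
```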

Could you check whether Proxmox has intel-microcode installed and loads it?

dmesg | grep microcode

[ 1.324241] microcode: Current revision: 0x00000040