Bridge MAC unstable - no address / default route

I've encountered a situation where the DHCP-address does not stay on the interface when the MAC address changes during bridge setup.
The default route leaves as well. No replacements are added even though there's a new DHCP lease.
The behaviour is reproducible (/etc/init.d/network restart or reboot behave the same).
For the test, openwrt trunk (f57230c4e6ee5af36d22bc0bef0bf7adc583c5b0) is used.

Fixing the MAC address resolves the issue. This post describes the error, the next post describes the workaround.
I've chosen not to obfuscate the MAC addresses.

These are address and routes on the openwrt device in the non-working state:

[root@50024 ~] $ ip a show br-n_1
5: br-n_1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 02:fe:ed:3b:29:8f brd ff:ff:ff:ff:ff:ff
    inet6 2002:5476:3f1:0:12fe:edff:fe3b:2990/64 scope global dynamic noprefixroute
       valid_lft 7155sec preferred_lft 3221sec
    inet6 fe80::12fe:edff:fe3b:2990/64 scope link
       valid_lft forever preferred_lft forever

[root@50024 ~] $ ip r show table 1
[root@50024 ~] $

IPv6 works normally.

config interface 'n_1'
        option force_link '1'
        option stp '1'
        option bridge_empty '1'
        option proto 'dhcp'
        #option macaddr '10:fe:ed:3b:29:90'
        list ifname 'eth0.1'
        list ifname 'eth0.2'
        list ifname 'bat0'
        option type 'bridge'
        option ip4table '1'
        option ip6table '1'

Times are wonky at boot. Power up was at 21:17:something.
Note the changing IP and the bind(UDP): Address not available.

[root@50024 ~] $ logread | grep udhcp
Fri May  1 20:33:30 2020 daemon.notice netifd: n_1 (1639): udhcpc: started, v1.31.1
Fri May  1 20:33:36 2020 daemon.notice netifd: n_1 (1639): udhcpc: sending discover
Fri May  1 20:33:36 2020 daemon.notice netifd: n_1 (1639): udhcpc: performing DHCP renew
Fri May  1 20:33:36 2020 daemon.notice netifd: n_1 (1639): udhcpc: sending discover
Fri May  1 20:33:36 2020 daemon.notice netifd: n_1 (1639): udhcpc: performing DHCP renew
Fri May  1 20:33:36 2020 daemon.notice netifd: n_1 (1639): udhcpc: sending discover
Fri May  1 20:33:39 2020 daemon.notice netifd: n_1 (1639): udhcpc: sending select for 192.168.0.26
Fri May  1 20:33:39 2020 daemon.notice netifd: n_1 (1639): udhcpc: lease of 192.168.0.26 obtained, lease time 864000
Fri May  1 21:18:58 2020 daemon.notice netifd: n_1 (1639): udhcpc: performing DHCP renew
Fri May  1 21:18:58 2020 daemon.notice netifd: n_1 (1639): udhcpc: sending renew to 192.168.0.1
Fri May  1 21:18:58 2020 daemon.notice netifd: n_1 (1639): udhcpc: performing DHCP renew
Fri May  1 21:18:58 2020 daemon.notice netifd: n_1 (1639): udhcpc: sending select for 192.168.0.79
Fri May  1 21:18:58 2020 daemon.notice netifd: n_1 (1639): udhcpc: lease of 192.168.0.79 obtained, lease time 864000
Fri May  1 21:19:05 2020 daemon.notice netifd: n_1 (1639): udhcpc: performing DHCP renew
Fri May  1 21:19:05 2020 daemon.notice netifd: n_1 (1639): udhcpc: sending renew to 192.168.0.1
Fri May  1 21:19:05 2020 daemon.notice netifd: n_1 (1639): udhcpc: bind(UDP): Address not available
Fri May  1 21:19:05 2020 daemon.notice netifd: n_1 (1639): udhcpc: sending renew to 0.0.0.0
Fri May  1 21:19:05 2020 daemon.notice netifd: n_1 (1639): udhcpc: lease of 192.168.0.79 obtained, lease time 864000
Fri May  1 21:19:06 2020 daemon.notice netifd: n_1 (1639): udhcpc: performing DHCP renew
Fri May  1 21:19:06 2020 daemon.notice netifd: n_1 (1639): udhcpc: sending renew to 192.168.0.1
Fri May  1 21:19:06 2020 daemon.notice netifd: n_1 (1639): udhcpc: bind(UDP): Address not available
Fri May  1 21:19:06 2020 daemon.notice netifd: n_1 (1639): udhcpc: sending renew to 0.0.0.0
Fri May  1 21:19:06 2020 daemon.notice netifd: n_1 (1639): udhcpc: lease of 192.168.0.79 obtained, lease time 864000

This is the output of echo "$INTERFACE $@ $ip" ; ip l show "br-$INTERFACE" ; ip -4 a show "br-$INTERFACE" in /etc/udhcpc.user.
Note that the new address (192.168.0.79) isn't added to the interface:

n_1 deconfig
5: br-n_1: <NO-CARRIER,BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default qlen 1000
    link/ether 10:fe:ed:3b:29:90 brd ff:ff:ff:ff:ff:ff
n_1 bound 192.168.0.26
5: br-n_1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 10:fe:ed:3b:29:90 brd ff:ff:ff:ff:ff:ff
5: br-n_1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    inet 192.168.0.26/24 brd 192.168.0.255 scope global br-n_1
       valid_lft forever preferred_lft forever
n_1 bound 192.168.0.79
5: br-n_1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 02:fe:ed:3b:29:8f brd ff:ff:ff:ff:ff:ff
n_1 renew 192.168.0.79
5: br-n_1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 02:fe:ed:3b:29:8f brd ff:ff:ff:ff:ff:ff
n_1 renew 192.168.0.79
5: br-n_1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 02:fe:ed:3b:29:8f brd ff:ff:ff:ff:ff:ff

The subsequent calls proto_send_update seem to have but one effect: remove the first address from the interface.
However, when the lease expires and it's refreshed, the address is configured correctly.

Fixing the macaddr using uci (see config in previous post) works around the issue.

It appears that some addresses not used by bridge slaves don't have any effect at all.
These are some working examples:
10:fe:ed:3b:29:8f (phy0)
10:fe:ed:3b:29:90 (phy1/lan)
10:fe:ed:3b:29:91 (wan)
02:fe:ed:3b:29:8f (ap)

As for choosing other addresses:
10:fe:ed:3b:29:84 works as well
13:fe:ed:3b:29:90 behaves like the non-existing option

With the right mac address, the ip address / route work as expected:

There's no bind(UDP): Address not available in udhcpc log:

Sat May  2 06:43:10 2020 daemon.notice netifd: n_1 (4690): udhcpc: started, v1.31.1
Sat May  2 06:43:13 2020 daemon.notice netifd: n_1 (4690): udhcpc: sending discover
Sat May  2 06:43:13 2020 daemon.notice netifd: n_1 (4690): udhcpc: performing DHCP renew
Sat May  2 06:43:13 2020 daemon.notice netifd: n_1 (4690): udhcpc: sending discover
Sat May  2 06:43:13 2020 daemon.notice netifd: n_1 (4690): udhcpc: performing DHCP renew
Sat May  2 06:43:13 2020 daemon.notice netifd: n_1 (4690): udhcpc: sending discover
Sat May  2 06:43:16 2020 daemon.notice netifd: n_1 (4690): udhcpc: sending select for 192.168.0.26
Sat May  2 06:43:16 2020 daemon.notice netifd: n_1 (4690): udhcpc: lease of 192.168.0.26 obtained, lease time 864000
Sat May  2 06:43:38 2020 daemon.notice netifd: n_1 (4690): udhcpc: performing DHCP renew
Sat May  2 06:43:38 2020 daemon.notice netifd: n_1 (4690): udhcpc: sending renew to 192.168.0.1
Sat May  2 06:43:38 2020 daemon.notice netifd: n_1 (4690): udhcpc: lease of 192.168.0.26 obtained, lease time 864000
Sat May  2 06:43:40 2020 daemon.notice netifd: n_1 (4690): udhcpc: performing DHCP renew
Sat May  2 06:43:40 2020 daemon.notice netifd: n_1 (4690): udhcpc: sending renew to 192.168.0.1
Sat May  2 06:43:42 2020 daemon.notice netifd: n_1 (4690): udhcpc: performing DHCP renew
Sat May  2 06:43:42 2020 daemon.notice netifd: n_1 (4690): udhcpc: sending select for 192.168.0.26
Sat May  2 06:43:42 2020 daemon.notice netifd: n_1 (4690): udhcpc: lease of 192.168.0.26 obtained, lease time 864000
Sat May  2 06:43:42 2020 daemon.notice netifd: n_1 (4690): udhcpc: performing DHCP renew
Sat May  2 06:43:42 2020 daemon.notice netifd: n_1 (4690): udhcpc: sending renew to 192.168.0.1
Sat May  2 06:44:12 2020 daemon.notice netifd: n_1 (4690): udhcpc: sending renew to 0.0.0.0
Sat May  2 06:44:12 2020 daemon.notice netifd: n_1 (4690): udhcpc: lease of 192.168.0.26 obtained, lease time 864000

Again, from /etc/udhcpc.user:

n_1 deconfig
5: br-n_1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 02:fe:ed:3b:29:8f brd ff:ff:ff:ff:ff:ff
n_1 deconfig
308: br-n_1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default qlen 1000
    link/ether 10:fe:ed:3b:29:90 brd ff:ff:ff:ff:ff:ff
n_1 bound 192.168.0.26
308: br-n_1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 10:fe:ed:3b:29:90 brd ff:ff:ff:ff:ff:ff
308: br-n_1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    inet 192.168.0.26/24 brd 192.168.0.255 scope global br-n_1
       valid_lft forever preferred_lft forever
n_1 renew 192.168.0.26
308: br-n_1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 10:fe:ed:3b:29:90 brd ff:ff:ff:ff:ff:ff
308: br-n_1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    inet 192.168.0.26/24 brd 192.168.0.255 scope global br-n_1
       valid_lft forever preferred_lft forever
n_1 bound 192.168.0.26
308: br-n_1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 10:fe:ed:3b:29:90 brd ff:ff:ff:ff:ff:ff
308: br-n_1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    inet 192.168.0.26/24 brd 192.168.0.255 scope global br-n_1
       valid_lft forever preferred_lft forever
n_1 renew 192.168.0.26
308: br-n_1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 10:fe:ed:3b:29:90 brd ff:ff:ff:ff:ff:ff
308: br-n_1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    inet 192.168.0.26/24 brd 192.168.0.255 scope global br-n_1
       valid_lft forever preferred_lft forever

Let me know if you want me to test something.

Ummmm...then maybe you should just stop telling OpenWrt to stand up an empty bridge?

:thinking:

The mac address used during the first request is from the ethernet interfaces, so the bridge isn't empty. When the access points from /etc/config/wireless are added and the MAC changes during DHCP, the error occurs.
It's a race condition.

I've found that the bridge_empty option helps to reproduce it.
Without it, it's less likely but still possible. I'm managing a lot of routers. Even if the probability for this happening on one router is very small, it does happen given enough routers.

Here's logread | grep n_1 | grep -v propagat without empty_bridge 1.
The error still occurs.

Sat May  2 07:55:02 2020 daemon.notice netifd: bridge 'br-n_1' link is up
Sat May  2 07:55:02 2020 daemon.notice netifd: Interface 'n_1' has link connectivity
Sat May  2 07:55:02 2020 kern.info kernel: [38241.270399] br-n_1: port 2(eth0.2) entered forwarding state
Sat May  2 07:55:03 2020 kern.info kernel: [38242.006415] br-n_1: port 3(bat0) entered forwarding state
Sat May  2 07:55:03 2020 daemon.notice netifd: n_1 (17288): udhcpc: sending select for 192.168.0.26
Sat May  2 07:55:03 2020 daemon.notice netifd: n_1 (17288): udhcpc: lease of 192.168.0.26 obtained, lease time 864000
Sat May  2 07:55:03 2020 daemon.notice netifd: Interface '6n_1' is enabled
Sat May  2 07:55:03 2020 daemon.notice netifd: Network alias 'br-n_1' link is up
Sat May  2 07:55:03 2020 daemon.notice netifd: Interface '6n_1' has link connectivity
Sat May  2 07:55:03 2020 daemon.notice netifd: Interface '6n_1' is setting up now
Sat May  2 07:55:03 2020 daemon.notice netifd: Interface 'n_1' is now up
Sat May  2 07:55:07 2020 daemon.notice netifd: Interface '6n_1' is now up
Sat May  2 07:55:17 2020 kern.info kernel: [38256.173021] br-n_1: port 4(nw0-1) entered blocking state
Sat May  2 07:55:17 2020 kern.info kernel: [38256.178624] br-n_1: port 4(nw0-1) entered disabled state
Sat May  2 07:55:17 2020 kern.info kernel: [38256.189645] br-n_1: port 4(nw0-1) entered blocking state
Sat May  2 07:55:17 2020 kern.info kernel: [38256.195112] br-n_1: port 4(nw0-1) entered listening state
Sat May  2 07:55:17 2020 kern.info kernel: [38256.512601] br-n_1: port 5(w0-8ed) entered blocking state
Sat May  2 07:55:17 2020 kern.info kernel: [38256.518200] br-n_1: port 5(w0-8ed) entered disabled state
Sat May  2 07:55:17 2020 kern.info kernel: [38256.529273] br-n_1: port 5(w0-8ed) entered blocking state
Sat May  2 07:55:17 2020 kern.info kernel: [38256.534813] br-n_1: port 5(w0-8ed) entered listening state
Sat May  2 07:55:17 2020 kern.info kernel: [38256.902683] br-n_1: port 6(w0-9) entered blocking state
Sat May  2 07:55:17 2020 kern.info kernel: [38256.908136] br-n_1: port 6(w0-9) entered disabled state
Sat May  2 07:55:17 2020 kern.info kernel: [38256.924554] br-n_1: port 6(w0-9) entered blocking state
Sat May  2 07:55:17 2020 kern.info kernel: [38256.929886] br-n_1: port 6(w0-9) entered listening state
Sat May  2 07:55:17 2020 kern.info kernel: [38256.935534] br-n_1: port 1(eth0.1) received tcn bpdu
Sat May  2 07:55:18 2020 daemon.notice netifd: n_1 (17288): udhcpc: performing DHCP renew
Sat May  2 07:55:18 2020 daemon.notice netifd: n_1 (17288): udhcpc: sending renew to 192.168.0.1
Sat May  2 07:55:18 2020 daemon.notice netifd: n_1 (17288): udhcpc: performing DHCP renew
Sat May  2 07:55:18 2020 daemon.notice netifd: n_1 (17288): udhcpc: sending select for 192.168.0.79
Sat May  2 07:55:18 2020 daemon.notice netifd: n_1 (17288): udhcpc: lease of 192.168.0.79 obtained, lease time 864000
Sat May  2 07:55:19 2020 kern.info kernel: [38258.225491] br-n_1: port 4(nw0-1) entered learning state
Sat May  2 07:55:19 2020 kern.info kernel: [38258.545347] br-n_1: port 5(w0-8ed) entered learning state
Sat May  2 07:55:19 2020 kern.info kernel: [38258.556074] br-n_1: port 7(nw1-1) entered blocking state
Sat May  2 07:55:19 2020 kern.info kernel: [38258.561602] br-n_1: port 7(nw1-1) entered disabled state
Sat May  2 07:55:19 2020 kern.info kernel: [38258.572592] br-n_1: port 7(nw1-1) entered blocking state
Sat May  2 07:55:19 2020 kern.info kernel: [38258.578057] br-n_1: port 7(nw1-1) entered listening state
Sat May  2 07:55:19 2020 kern.info kernel: [38258.850566] br-n_1: port 8(w1-8ed) entered blocking state
Sat May  2 07:55:19 2020 kern.info kernel: [38258.856168] br-n_1: port 8(w1-8ed) entered disabled state
Sat May  2 07:55:19 2020 kern.info kernel: [38258.867407] br-n_1: port 8(w1-8ed) entered blocking state
Sat May  2 07:55:19 2020 kern.info kernel: [38258.872972] br-n_1: port 8(w1-8ed) entered listening state
Sat May  2 07:55:19 2020 kern.info kernel: [38258.961137] br-n_1: port 6(w0-9) entered learning state
Sat May  2 07:55:20 2020 kern.info kernel: [38259.180333] br-n_1: port 9(w1-9) entered blocking state
Sat May  2 07:55:20 2020 kern.info kernel: [38259.185753] br-n_1: port 9(w1-9) entered disabled state
Sat May  2 07:55:20 2020 kern.info kernel: [38259.196545] br-n_1: port 9(w1-9) entered blocking state
Sat May  2 07:55:20 2020 kern.info kernel: [38259.201913] br-n_1: port 9(w1-9) entered listening state
Sat May  2 07:55:20 2020 daemon.notice netifd: n_1 (17288): udhcpc: performing DHCP renew
Sat May  2 07:55:20 2020 daemon.notice netifd: n_1 (17288): udhcpc: sending renew to 192.168.0.1
Sat May  2 07:55:20 2020 daemon.notice netifd: n_1 (17288): udhcpc: bind(UDP): Address not available
Sat May  2 07:55:20 2020 daemon.notice netifd: n_1 (17288): udhcpc: sending renew to 0.0.0.0
Sat May  2 07:55:20 2020 daemon.notice netifd: n_1 (17288): udhcpc: lease of 192.168.0.79 obtained, lease time 864000
Sat May  2 07:55:21 2020 kern.info kernel: [38260.240546] br-n_1: port 4(nw0-1) entered forwarding state
Sat May  2 07:55:21 2020 kern.info kernel: [38260.560385] br-n_1: port 5(w0-8ed) entered forwarding state
Sat May  2 07:55:21 2020 kern.info kernel: [38260.592365] br-n_1: port 7(nw1-1) entered learning state
Sat May  2 07:55:21 2020 kern.info kernel: [38260.880230] br-n_1: port 8(w1-8ed) entered learning state
Sat May  2 07:55:21 2020 kern.info kernel: [38260.976234] br-n_1: port 6(w0-9) entered forwarding state
Sat May  2 07:55:22 2020 daemon.notice netifd: n_1 (17288): udhcpc: performing DHCP renew
Sat May  2 07:55:22 2020 daemon.notice netifd: n_1 (17288): udhcpc: sending renew to 192.168.0.1
Sat May  2 07:55:22 2020 daemon.notice netifd: n_1 (17288): udhcpc: bind(UDP): Address not available
Sat May  2 07:55:22 2020 daemon.notice netifd: n_1 (17288): udhcpc: sending renew to 0.0.0.0
Sat May  2 07:55:22 2020 daemon.notice netifd: n_1 (17288): udhcpc: lease of 192.168.0.79 obtained, lease time 864000
Sat May  2 07:55:22 2020 kern.info kernel: [38261.232183] br-n_1: port 9(w1-9) entered learning state
Sat May  2 07:55:23 2020 kern.info kernel: [38262.608948] br-n_1: port 7(nw1-1) entered forwarding state
Sat May  2 07:55:23 2020 kern.info kernel: [38262.895341] br-n_1: port 8(w1-8ed) entered forwarding state
Sat May  2 07:55:24 2020 kern.info kernel: [38263.247199] br-n_1: port 9(w1-9) entered forwarding state

Please show your entire network config file.

sure!

config interface 'loopback'
	option ifname 'lo'
	option proto 'static'
	option ipaddr '127.0.0.1'
	option netmask '255.0.0.0'

config globals 'globals'
	option ula_prefix 'fd53:530b:7959::/48'

config device 'wan_eth0_2_dev'
	option name 'eth0.2'
	option macaddr '10:fe:ed:3b:29:91'

config switch
	option name 'switch0'
	option reset '1'
	option enable_vlan '1'

config switch_vlan
	option device 'switch0'
	option vlan '1'
	option ports '2 3 4 5 0t'

config switch_vlan
	option device 'switch0'
	option vlan '2'
	option ports '1 0t'

config interface 'bat0'
	option proto 'batadv'
	option orig_interval '3000'
	option distributed_arp_table '0'

config interface 'n_1'
	option force_link '1'
	option stp '1'
	option bridge_empty '1'
	option proto 'dhcp'
	list ifname 'eth0.1'
	list ifname 'eth0.2'
	list ifname 'bat0'
	option type 'bridge'
	option ip4table '1'
	option ip6table '1'

config interface '6n_1'
	option proto 'dhcpv6'
	option ifname '@n_1'

config rule 'n_1_rule'
	option in 'n_1'
	option lookup '1'
	option priority '1'

config interface 'n_f30'
	option proto 'uce'
	option group 'eu-central'
	option uplink_network '@n_1'
	option ip4table '3'
	option ip6table '3'

config rule 'n_f30_rule'
	option in 'n_f30'
	option lookup '3'
	option priority '3'

config interface 'n_f53'
	option proto 'leggin'
	option host 'rock64-vpn.fritz.box'
	option port '443'
	option no_ssl 'true'
	option uplink_network '@n_1'
	option ip4table '4'
	option ip6table '4'

config rule 'n_f53_rule'
	option in 'n_f53'
	option lookup '4'
	option priority '4'

config interface 'mesh0'
	option mtu '1540'
	option proto 'batadv_hardif'
	option master 'bat0'

config interface 'mesh1'
	option mtu '1540'
	option proto 'batadv_hardif'
	option master 'bat0'