OpenWRT is bridging during booting up

indiemusic21 · October 31, 2023, 9:30am

Hi,

I'm using OpenWrt r23497-6637af95aa on a DIR-612/F/EID.
Currently I have a problem while the DIR-612 is booting: all of the Ethernet ports are bridged until the boot process has finished, so the client gets its IP address via DHCP from the WAN side. Only after boot has finished does the client get its address from the DIR-612.
How do I keep the Ethernet ports on the DIR-612 from going into bridge mode during the boot process?

Thank you in advance and greetings,
DS

egc · October 31, 2023, 9:37am

It is a known problem on some (older) hardware.

The hardware has just one switch and the OS is creating the VLANs when it is up, but before the OS is up it is just acting as one switch.

The only option I know of is to pull out the WAN cable before rebooting.

I have an old Linksys E2000 with the same "problem".

indiemusic21 · October 31, 2023, 9:41am

Hi egc,

Thank you for your response.
I have 2 questions:

1. If we use a newer device, will it not have this kind of problem?
2. Can we change OpenWrt itself so that this problem can be solved?

egc · October 31, 2023, 9:48am

I have not seen it on newer devices (R7800, EA8500, DL-WRX36) but I cannot guarantee it.

It has nothing to do with the operating system; it happens before the OS is up.

indiemusic21 · October 31, 2023, 9:51am

So this is a hardware problem?
And it can't be solved by the system?

andyboeh · October 31, 2023, 10:49am

It's more of a bootloader issue: whether and how it initializes the switch chip. But it can't be solved by the system.

If this is a private IP assigned from your ISP modem, you could maybe disable the DHCP server on the ISP modem and assign a static IP to your OpenWrt device.
You will still leak packets, but at least your clients shouldn't get a wrong IP address. If that's the public IP address from a bridged modem, you're probably out of luck.

indiemusic21 · November 1, 2023, 2:24am

Hi @andyboeh,
Thank you for your response. Actually, I get the IP address from a bridged modem.
If I power on the DUT without my laptop connected, everything works normally, but we can't do that in practice: if we need to reboot the DUT and the modem together, the problem comes back, because the DUT is always connected to the end device.

Do you have any suggestions for this issue?
Maybe we can fix it in the bootloader? Please advise how to solve it.

psherman · November 1, 2023, 3:33am

You'd have to have the source code of the bootloader itself (not part of OpenWrt), and then you'd have to modify it so that it doesn't bring the switch up until it is initialized by the full running OS (or such that it separates the 'wan' port from the rest of the ports on the switch).

This is outside the scope of OpenWrt, and would require considerable low level programming experience to achieve, especially for older hardware.

For another recent discussion of this whole bootloader+hardware switch issue, see this thread.

slh · November 1, 2023, 3:36am

The simple answer is "no".

The pragmatic answer is "no".

The real answer is more complicated, and utterly device specific…

…and at least I don't find much (any) information about a DIR-612/F/EID (typo, or bad searching on my side?).

First of all there are four potential points of contention:

- how is the switch hard-strapped
  - meaning, in what configuration will it come up after power-on, before the SOC/ bootloader initializes it
- how does the bootloader configure it/ leave it configured
- what is the default state the kernel module leaves it in
- how long does it take for netifd to apply the user specified configuration

In all of these, timing matters: how long is (are) the period(s) during which all ports are bridged together, and is that long enough for a DHCP handshake to succeed (possibly 'encouraged' by link-down events)? And yes, this also depends on how quickly your client reacts, so there is no absolute number. If you really want to fix this, you have to profile the timing of each of these steps to find out where your pinch points are.
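The timing argument can be sketched as a small model. This is only an illustration, not a measurement tool: the `Stage` type, the stage names, and every duration below are hypothetical placeholders that would have to be replaced with values profiled on the actual device, and the client's DHCP handshake time is likewise an assumed input.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    name: str
    duration_s: float  # hypothetical value; must be measured on the device
    bridged: bool      # True if WAN and LAN ports are bridged in this stage


def longest_bridged_window(stages):
    """Return the longest contiguous bridged time, in seconds."""
    longest = 0.0
    current = 0.0
    for stage in stages:
        if stage.bridged:
            current += stage.duration_s
            longest = max(longest, current)
        else:
            current = 0.0  # a non-bridged stage ends the window
    return longest


def leak_possible(stages, handshake_s):
    """True if some bridged window is long enough for a DHCP handshake."""
    return longest_bridged_window(stages) >= handshake_s


# Placeholder numbers only, one entry per point of contention listed above.
boot = [
    Stage("hard-strapped switch after power-on", 0.5, True),
    Stage("bootloader switch setup", 3.0, True),
    Stage("kernel driver default state", 10.0, False),
    Stage("netifd applies user config", 4.0, False),
]

if __name__ == "__main__":
    print("longest bridged window:", longest_bridged_window(boot), "s")
    print("leak possible with a 2 s handshake:", leak_possible(boot, 2.0))
```

With real measurements filled in, the stage that contributes most to the longest bridged window is the pinch point worth attacking first.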

These issues are fairly well-known and, with the typical design of one CPU attached to a single (simple) managed onboard switch, hard to mitigate if the vendor has not properly accounted for them. In most cases you can't totally remove the periods of full-bridge mode; you can only mitigate by reducing the bridged time to a period that might not allow a DHCP handshake to succeed (cheating).

Some of these mitigations involve an intermediary loader chainloaded between bootloader and kernel, whose only task is to deactivate all bridges/ all ports until the kernel/ netifd brings them up again in a sensible manner (it will not fix the situation, but hopefully reduce the time enough in practice). There is precedent for this approach; whether it succeeds depends on what the first two bullet points do and how long it takes from ifup (in bridged mode) to the actual configuration. The 'beauty' of this is that it's relatively generic (apart from the switch-specific support) and relatively safe to do and experiment with, as it never touches the OEM bootloader. Whether this is sufficient depends entirely on the exact device-specific timing.

If that isn't enough, the only remaining option is indeed to change the bootloader. For this to succeed you need:

- the bootloader source
  - …and hope that the vendor provides you with it, complete and in working order
  - …and hope that your own compilation (using your toolchain/ configuration) will work
- serial console access
- the ability to rewrite the whole flash chip (or JTAG level access; the former is easier than JTAG, the latter gives you more hardware level insight)
  - you will need it/ you will break it, soldering on a socketed option will be beneficial
- the actual development skills to change the switch bringup sequence to your liking
- a lot of hope that the vendor isn't using secure boot (which is becoming increasingly common on ARMv8 based platforms)
- the hope that the hard-strapping is in your favour/ respectively the bootloader loads quick enough to meet your requirements

If your environment is sensitive to this issue, it might make more sense to switch hardware to something not exposing this issue by design, e.g. to hardware without an integrated switch but using dedicated interfaces (e.g. x86_64, rockchip, RPi, etc.). This would be the most pragmatic (and safest) way out, as it guarantees that there will be no spillover whatsoever.

andyboeh · November 1, 2023, 9:25am

The only "workaround" that doesn't involve buying a new router would be to put a switch between the router and all devices. That way, your devices won't see the link-down / link-up events and will not request a new IP when the router restarts. Thinking about it, that's actually my topology - and I never tested whether my router leaks packets, I suppose so...

That's not a fix: if a client requests an IP at that very moment, it will still get the wrong one. And it will still leak packets.
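To notice when a client has picked up the wrong lease, a check along these lines could be run on the client. It is a minimal sketch under stated assumptions: the function name `lease_source` is made up for illustration, the router's LAN is assumed to be the OpenWrt default 192.168.1.0/24, and obtaining the client's current address is left to the caller.

```python
import ipaddress


def lease_source(address, lan_cidr="192.168.1.0/24"):
    """Classify a leased address as 'lan' (from the router) or 'foreign'.

    lan_cidr is an assumption: the OpenWrt default LAN subnet. Adjust it
    to whatever the router actually hands out.
    """
    ip = ipaddress.ip_address(address)
    network = ipaddress.ip_network(lan_cidr)
    return "lan" if ip in network else "foreign"


if __name__ == "__main__":
    # 203.0.113.7 is a documentation-range address standing in for a
    # public IP handed out by the bridged modem.
    for addr in ("192.168.1.23", "203.0.113.7"):
        print(addr, "->", lease_source(addr))
```

A "foreign" result right after a router reboot suggests the client completed its DHCP handshake during the bridged window, and renewing the lease once the router is fully up should bring it back onto the LAN.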