Buttonless Failsafe Mode

slh · February 3, 2023, 12:50am

Interestingly it isn't for me (no washington post account, incognito window, but no US based IP either), just found with a google search…

https://www.pogo.org/analysis/2018/06/why-do-air-force-planes-need-10000-toilet-seat-covers or https://www.military.com/defensetech/2018/07/11/air-force-no-longer-spending-10000-toilet-seats-officials-say.html would be similar news stories.

--
Actually I was looking for a movie quote that was going a bit like "you don't think the air force would pay xxx dollars for a toilet seat" or similar (to stay in line with your line of visual aids), might have been Independence Day, Air America or something rather similar (as a hitch for shadow cross-financing CIA like black-ops), but my motivation to identify the original source faded, after finding corresponding real news articles about it.

EDIT: And indeed, thanks:

Lynx · February 3, 2023, 3:26am

From a theoretical perspective, is a working IP configuration a fundamental requirement for communicating a signal?

I'm just trying to think of how node A could communicate a detectable pattern to node B without a good IP configuration therebetween.

What about some form of broadcast?

Does NO-CARRIER simply reflect power status, such that a sequence can't be generated that way whilst maintaining power? Because otherwise a pattern could be communicated by cycling it in a particular way, right?

Or can transmission rate be set between 1000 and 100 and flip flop therebetween in a specific sequence?

Or how about six unique magic packets:

Or combination of the above?

I don't know enough about ethernet/networking to come up with a strategy but surely there is a way here.

psherman · February 3, 2023, 3:48am

No, an IP is not absolutely required. This can theoretically be done at L2 like WOL.
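For reference, a standard WOL magic packet payload is just six 0xFF bytes followed by the target MAC repeated sixteen times, which is why it can travel in a bare Ethernet frame with no IP configuration. A minimal Python sketch of building that payload follows. The helper name `build_magic_packet` is made up for illustration, and actually sending it at L2 would need a raw socket, which is left out here.

```python
def build_magic_packet(mac: str) -> bytes:
    """Return the 102-byte Wake-on-LAN payload for a MAC like 'aa:bb:cc:dd:ee:ff'."""
    # Strip common separators and convert the hex digits to raw bytes.
    mac_bytes = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    if len(mac_bytes) != 6:
        raise ValueError("MAC address must be exactly 6 bytes")
    # Synchronization stream (6 x 0xFF) followed by 16 copies of the MAC.
    return b"\xff" * 6 + mac_bytes * 16
```

Note that nothing in this payload is secret or unique per request, which matters for the security objections raised later in the thread.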

But there are logistical issues here... you wouldn't want this to be possible to engage from the WAN. But depending on the device, the MAC addresses may not be unique between the LAN and WAN... or maybe the WAN is being used as 'just another port' on the switch (which can be done at a higher level within OpenWrt by remapping ports on the switch or bridging ports). Or worse, for multi-WAN configurations where LAN ports are reassigned to serve as WANs, now the MAC that might have nominally been assigned to the LAN is now actually exposed to the WAN. Also, what do you do with a device that has only a single ethernet port (such as an AP or travel router)? So it could be hard to guarantee this method remain secure in all cases and also that it would always work.

Again, this could be triggered accidentally. Physical power issues as described earlier could easily falsely trigger this. So could plugging/unplugging devices (or power cycling/rebooting those devices, or even just changing configurations such that the ethernet port bounces). I even had a physical wiring issue that caused a whole lot of port flapping until I identified and fixed the issue (bad termination).... all of these things could very easily cause a false positive.

Lynx · February 3, 2023, 3:53am

Go on, give us a proposal.

psherman · February 3, 2023, 3:54am

It's your idea... I don't want to steal your thunder.

Lynx · February 3, 2023, 3:56am

I don't care about thunder. Why not give it a shot.

Give it your very best proposal and then see if it withstands your own scrutiny. Wouldn't that be a nice idea rather than just finding faults with or shooting down all of my suggestions?

I'm sure there is a way here.

How about magic packet sequence after specific time from switching on?

psherman · February 3, 2023, 4:06am

But it's not my idea... and I don't have any interest in putting effort into concocting a method for what I kind of see as a solution looking for a problem.

I'm telling you why I don't think these things would work based on my knowledge of networking. If you told me that you wanted to use a plane with a propeller or jet engine to fly to the moon, I'd tell you that there's no air in space, so you need to come up with another method of propulsion. I might not be able to tell you how to build the rocket engine, but I could tell you that it needs to carry its own fuel and oxidizer to be able to burn since there's no oxygen up there to support combustion.

I don't have a solution, but I can tell you that these (so far) are not robust enough for integration into core OpenWrt. There was the "unbrickable" script that was shared earlier.. that seems pretty perfect for your needs, but even that is probably not something that should be included in OpenWrt by default.

If this is something you really want to explore, you should do some R&D around it yourself. If you can find a way that is robust enough for the developers to sign off on the idea and implement it, that would be quite cool.

Making one magic packet or a sequence of them doesn't really matter, as I explained earlier:

Bill · February 3, 2023, 4:10am

RFC 1918: 16,777,216 addresses (10.0.0.0/8).
A HooToo and a RouterBOARD are just one example of two devices from separate manufacturers that run in Class A.

With MikroTik, the first load is an initramfs image to get to the jelly of OpenWrt.

Lynx · February 3, 2023, 4:12am

How is this a solution looking for a problem? The device has become inaccessible, and either it has no button or the device itself is physically out of reach. That's a real problem.

Recreating a scene from Back to the Future with a serial console is hardly convenient, nor is having to climb a ladder.

So I strongly dispute that assertion.

psherman · February 3, 2023, 4:13am

But your device does have a reset button.. it's just difficult to access. I get that -- it's hard to access which makes this a problem you want to solve.

Why is the brickproof option not sufficient for your needs?

Lynx · February 3, 2023, 4:16am

It might be. But I didn't start this thread to find a way that might work for my device. I could write a script for that.

I wrote this thread because had there been an implementation in OpenWrt for a buttonless return to safe mode or whatever I would have really benefited from it. And I'll bet there are many others who would have really wanted this too at one time or another. And have had to break open device and use serial console or climb ladders or whatever too.

It's not a solution looking for a problem. It's a real and existing problem. Maybe as you described earlier a tough nut to crack. But categorically not a solution looking for a problem.

Pooh-poohing is a court-martial offence and destroys morale:

slh · February 3, 2023, 5:02am

Suggests to me that it's indeed practical, as it apparently has been done that way already - and yes, that is from the https://openwrt.org/toh/zyxel/nr7101 device page.

Lynx · February 3, 2023, 5:25am

Thankfully I did not need that and it certainly doesn't look so much like a scene from Back to the Future should I ever need it later on. In my case I could get away with just pressing the reset button having climbed a high ladder.

Whilst the latter may have motivated my feature request, as I tried to convey in the very post you replied to, I did not intend managing my particular device to be the subject of my feature request.

I'm hoping future posts on this thread can be more positive and constructive as some of the posts seem to reflect a mind desirous of misunderstanding or tearing down.

The subject of this thread does not reflect a solution looking for a problem like @psherman suggested in one of his posts above. Facilitating a user-friendly mechanism to regain access when locked out on a buttonless or hard to access device is a very real problem. And it can be a time of despair for inexperienced or new OpenWrt users. Preempting the situation with a custom script is one thing, but that's not helpful when the user is already locked out and hoping for a simple way to reset and regain entry without undue stress and difficulty.

Thanks to you, @psherman and others on this thread I see that there are difficulties in implementing a good technical solution to this problem, but I like to think that a good feature request in this helpful new forum category does not need to be presented together with an optimal solution.

I have tried to give viable options to the best of my limited understanding in this technical field like adopting:

a power cycling sequence; and/or
a magic packet sequence,

and I see there are outstanding challenges, but I don't think these are necessarily insurmountable.

spence · February 3, 2023, 3:54pm

I don't have a potential solution to offer but I do offer a requirement to consider.

Any solution based on network, IP packet or Ethernet frames (magic packet/WOL type) etc., should only be usable by authorized devices. It should not be physically available from any unauthorized system like a compromised device on the wired LAN or wifi.

We don't want malware on another device on our LAN, like malware on a PC, phone or iot device, to DOS our network, or worse, make use of this published feature to compromise our OpenWrt devices.

Lynx · February 3, 2023, 4:42pm

I wondered about that. Perhaps the magic packet could incorporate hashed router password? And by having specific window from powerup when this can be accepted this might also add further protection?

spence · February 3, 2023, 4:53pm

That probably isn't safe enough. A magic packet should not be repeatable. We don't want a compromised system on our LAN to listen for these and then simply be able to replay the packet.
I think a fully encrypted communication channel providing equal security as ssh or https would be needed.
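
To illustrate the replay concern, here is a minimal Python sketch, assuming a hypothetical scheme in which the packet carries nothing but a SHA-256 hash of the router password. The function names and the password are made up, and this is not an existing OpenWrt mechanism.

```python
import hashlib
import hmac

def make_static_packet(password: str) -> bytes:
    # Hypothetical packet body: the hash of the password, identical every time.
    return hashlib.sha256(password.encode()).digest()

def router_accepts(packet: bytes, password: str) -> bool:
    # The router recomputes the hash and compares in constant time.
    expected = hashlib.sha256(password.encode()).digest()
    return hmac.compare_digest(packet, expected)

# The admin sends a legitimate packet...
legit = make_static_packet("example-password")
# ...a compromised LAN host records the bytes without knowing the password...
captured = bytes(legit)
# ...and the recording is indistinguishable from the original when replayed.
replay_ok = router_accepts(captured, "example-password")
```

Hashing hides the password itself, but because the bytes never change, recording them once is as good as knowing it.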

Lynx · February 3, 2023, 4:58pm

Ah yes. I wondered whether that would be feasible within a magic packet context and it seems like something along these lines is the subject of this Intel patent:

Currently, this wake mechanism is insecure. In other words, computing devices or platforms do not sufficiently protect against spurious or malicious wake events. A so-called “sniffer” can monitor the packet sent over the communications network used by the two systems. A malicious person can detect such packets and replay them at a later time. A variation of the wake mechanism is referred to as “magic packet+password”. The “magic packet+password” is similar to a packet of the “magic packet” but includes an additional six-byte password appended to end. While the “magic packet+password” mechanism does have a password, the password is nonetheless sent unencrypted and susceptible to a replay attack in the same manner as the “magic packet”.

Upon waking the necessary hardware, network controller 170 may allow management controller 180 to establish a secure or encrypted communication session between network controller 170 and the remote console.

Albeit our context is trickier because setting up that communication channel is rendered harder (or impossible?) given the lack of connectivity we are attempting to address.

I admit I'm beginning to lose heart a little now.

NPeca75 · February 5, 2023, 10:02am

idea is great, but highly insecure

whatever approach is chosen:

router SEND magic packets to some dst mac/ip/port knock and then bring up ssh daemon for short time on positive reply from other side
router LISTEN for some mac/ip/port knock combo and then bring up ssh daemon on positive match

problem is that there must be some kind of password to access the device

if the password is the same for all OpenWrt devices, then it is no password anymore
so the only unique password which could be baked in at boot time is something based on the device MAC address
and that is far from a strong password
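
A rough Python sketch of why a MAC-derived secret is weak, assuming a hypothetical derivation (a truncated SHA-256 of the MAC; the function name and MAC values are made up). The MAC is visible in every frame the device sends, so anyone on the segment can repeat the same computation:

```python
import hashlib

def derive_password_from_mac(mac: str) -> str:
    # Hypothetical derivation: normalize the MAC, hash it, keep 16 hex digits.
    normalized = mac.lower().replace(":", "").replace("-", "")
    return hashlib.sha256(normalized.encode()).hexdigest()[:16]

device_mac = "00:11:22:33:44:55"  # example value only
device_secret = derive_password_from_mac(device_mac)

# An attacker who sniffed the MAC from any frame derives the same secret.
sniffed_mac = "00-11-22-33-44-55"
attacker_guess = derive_password_from_mac(sniffed_mac)
same = attacker_guess == device_secret
```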

CFSworks · March 13, 2023, 10:31pm

I don't think I have ever had a need for such a feature (I have never had devices in hard-to-reach places) but the problem has nonetheless successfully nerd-sniped me. From the thread, I am under the impression that the go/no-go for failsafe mode must happen before the overlayfs is established, and there's no mechanism for rebooting into failsafe mode (as indeed a good option might be to fork a long sleep followed by a reboot-to-failsafe into the background before touching the network config, killing it when done).
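
The sleep-then-revert idea in that parenthesis could be sketched like this in Python, purely as an illustration of the pattern. The class name and the callback are hypothetical. On a real device the callback would have to be whatever reboot-to-failsafe mechanism exists, which as far as I can tell is not available today.

```python
import threading

class DeadMansSwitch:
    """Run `on_expire` after `timeout` seconds unless disarmed first."""

    def __init__(self, timeout: float, on_expire):
        # threading.Timer calls on_expire once, after the timeout elapses.
        self._timer = threading.Timer(timeout, on_expire)
        self._timer.daemon = True

    def arm(self) -> None:
        # Start the countdown before touching the network config.
        self._timer.start()

    def disarm(self) -> None:
        # Called once the admin confirms the device is still reachable.
        self._timer.cancel()
```

The intended use is to call `arm()`, apply the risky change, and call `disarm()` only after confirming access still works. If the change locks the admin out, the timer fires on its own.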

I agree fully that the design must permit NO false positives (accidental or malicious) and must not interfere with the rest of the network. But I do not think it is a reasonable or feasible goal to make the software failsafe 100% reliable (as indeed that's what the button is for). The goal here is just to have something else to try before one has to resort to climbing out of one's window. To that end, anything that works in 90-95% of cases should be "good enough" (and is obviously a massive improvement over the 0% we have today).

This is a sketch of a solution, not an actual proposal, but:

On first boot, a failsafe key is randomized and saved to /etc/config/system. The idea is that this gets included in configuration backups, so that the device admin is likely to have it when they end up needing it.
In preinit, before the normal/failsafe decision but after /overlay is mounted(?), the key is parsed out of /etc/config/system and a lightweight "failsafe service" is launched. The boot process waits for this service to run to completion before deciding how to proceed.
The "failsafe service" scans for all Ethernet interfaces and (if available this early) 802.11 interfaces. They are brought up (the latter in monitor mode, on channel 1 or 36 depending on band) in a strict speak-only-when-spoken-to fashion, to avoid interfering with the network.
The service listens for 500ms(?) for magic packets on the wired (with a magic EtherType) and wireless (802.11 Action frames) interfaces. After the timeout, proceed with normal boot.
If one arrives, check that it has a valid HMAC (using the configured key) and see if it is a challenge packet or a failsafe packet.
For a challenge packet, randomize a nonce (IFF not done yet this boot) and transmit it back to the sender. This protects against replay attacks. For a failsafe packet, ensure that it contains a nonce and matches the one that was randomized, then proceed to failsafe boot.
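
The HMAC and nonce steps above might look roughly like this in Python. This is a sketch under assumptions, not a wire format: the `CHALLENGE`/`FAILSAFE` markers, the 16-byte nonce, HMAC-SHA256 and every name here are invented for illustration, and the actual frame I/O (EtherType, Action frames, the listen timeout) is left out.

```python
import hashlib
import hmac
import secrets

TAG_LEN = 32  # length of an HMAC-SHA256 tag in bytes

def sign(key: bytes, body: bytes) -> bytes:
    """Append an HMAC-SHA256 tag to the packet body."""
    return body + hmac.new(key, body, hashlib.sha256).digest()

def verify(key: bytes, packet: bytes):
    """Return the body if the tag is valid, else None."""
    if len(packet) < TAG_LEN:
        return None
    body, tag = packet[:-TAG_LEN], packet[-TAG_LEN:]
    expected = hmac.new(key, body, hashlib.sha256).digest()
    return body if hmac.compare_digest(tag, expected) else None

class FailsafeListener:
    """Device side: one nonce per boot, failsafe only on a matching nonce."""

    def __init__(self, key: bytes):
        self.key = key
        self.nonce = None

    def handle(self, packet: bytes):
        body = verify(self.key, packet)
        if body is None:
            return ("ignore", None)      # bad HMAC: stay silent
        if body == b"CHALLENGE":
            if self.nonce is None:       # randomize only once per boot
                self.nonce = secrets.token_bytes(16)
            return ("reply", self.nonce)
        if body.startswith(b"FAILSAFE") and self.nonce is not None:
            if hmac.compare_digest(body[len(b"FAILSAFE"):], self.nonce):
                return ("failsafe", None)
        return ("ignore", None)
```

A failsafe packet recorded during one boot carries that boot's nonce, so replaying it after a reboot fails the comparison. In this sketch the nonce reply itself is sent unauthenticated, since the steps above do not say whether it should be signed.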

I'm reasonably confident that this (or something like it) satisfies the "do not interfere" and "no false positives" requirements. There is no guarantee that there are no false negatives, as a few things can still go wrong:

Admin may not be able to send the necessary magic packets: either they are remote and not on the same L2 segment, or their computer does not permit sending raw Ethernet frames and/or injecting wireless Action frames
Admin may not actually have any backup of the device's config handy
Device might be unable to initialize EITHER 802.11 or Ethernet interfaces that early if the kernel modules don't load or they both require firmware
Device's 802.11 WNIC might not support frame injection to send the challenge nonce, forcing the admin to use Ethernet
Device might have been soft-bricked in some way that kills this mechanism (the failsafe key was deleted, or the network drivers corrupted, or ...)

I don't see any of these cases being extremely likely, though. I still think the reliability percentage is somewhere in the 90s. And, again, the point is just to have something for folks like @Lynx to try before resorting to riskier things like leaning out the window and/or climbing a ladder. We aren't replacing the hardware button, just augmenting it.

psherman · March 14, 2023, 1:36am

I like the thinking here, but I don't think that it will work...

This key would need to be written to the overlay. No issue writing it there on first boot, but...

The key was written to overlay, so this can't happen until overlay is mounted. But overlay doesn't get mounted until later in the boot cycle -- after the normal/failsafe decision has been made.

Wifi doesn't come up until way later in the boot cycle... you need to load a lot of things before you can load the drivers and start a network stack on those interfaces. Ethernet isn't even up at the time of the failsafe decision.

Can't listen if the interfaces aren't up.