Mesh can't establish again after blackout?

I have run into this issue and would like to ask whether anyone has seen something similar.

The errors I see in the logread output are:

daemon.notice wpa_supplicant[7398]: nl80211: Failed to set interface into station mode
daemon.err wpa_supplicant[7398]: mesh1: mesh leave error=-134

Usually a reboot solves it, but today it took more than half an hour of debugging plus several reboots and wifi reloads, and I am still not sure what is going on!

This happens every time there is a blackout. Maybe it's just a hardware issue?

Not exactly the same, but I had something similar happen when I accidentally left the channel on auto on one of my EA8300 nodes. Only that node was affected.

What routers are you using?

In this case it's happening on Winstars WS-WN552K1.
By auto channel, do you mean the radio used by the mesh interface was set to auto?
In that case it's pretty straightforward to see why it would happen, right?
On my nodes the radio is set to a specific channel.

BTW the problem fixed itself after a few reboots, but it bugs me that I can't understand what's going on.

Maybe I can turn on some more logging or find more info elsewhere? Any suggestions to help debug this would be useful, thanks.
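While waiting for better logging options, one low-effort step is to save the logread output to a file and filter it on a separate computer, so the mesh-related messages are not lost among everything else. The following is only a minimal Python sketch: the patterns are just the strings quoted in this thread, the function name `mesh_lines` is made up for illustration, and the last line of the sample is a placeholder rather than real log output.

```python
import re

# Patterns based only on the messages quoted in this thread (an assumption:
# other firmware versions may word these messages differently).
MESH_PATTERN = re.compile(
    r"wpa_supplicant\[\d+\]:.*(mesh|station mode|MESH-SAE-AUTH-FAILURE)",
    re.IGNORECASE,
)


def mesh_lines(log_text):
    """Return the log lines that look like wpa_supplicant mesh messages."""
    return [line for line in log_text.splitlines() if MESH_PATTERN.search(line)]


# The first two lines are the errors from the first post; the third is a
# hypothetical placeholder standing in for unrelated log noise.
sample = """\
daemon.notice wpa_supplicant[7398]: nl80211: Failed to set interface into station mode
daemon.err wpa_supplicant[7398]: mesh1: mesh leave error=-134
daemon.info example[1]: unrelated placeholder line
"""

for line in mesh_lines(sample):
    print(line)
```

To use it on a real capture, read the saved file into a string and pass it to `mesh_lines` instead of `sample`.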

Yes, but I don't think it was me. I never select auto channel, especially for mesh.

It turned out that I have a node that is acting up. Its radio config changes sometimes, often reverting to "driver default" for the country code and to the default tx power. Not sure why. So far so good: it all works more than 99% of the time.

When you log in to LuCI after a power loss, are any of them typically connected? Are they all doing it, or just one?

Stable firmware?

Do a backup for each node. Then restore from backup if there is a problem?

Some of mine still do it even after reflashing the firmware or restoring from backup. But they run well otherwise.

You could crack it open and connect a serial to USB cable to log the serial port activity on a separate computer. I read on the forums that's a good way to get a better picture of what is happening if the system and kernel logs aren't giving you enough info. YMMV.

If the hardware or NAND is starting to get flaky you will know soon enough.

Reflashing the firmware may help map out bad blocks.

Another solution could be to get the nodes on UPS. Throw the modem on there too and you have internet during a power failure.

A 12 V, 8 Ah battery (the internal SLA battery of a UPS) stores 96 Wh. At 0.85 inverter efficiency that leaves about 81 Wh, and 81 Wh / 5 W (typical router) gives about 16 h of runtime for one router. Connect all three nodes plus the modem and you get about 4 h.
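The same estimate can be written as a small Python calculation, which makes it easy to plug in your own battery and load. This is a rough sketch: the function name is made up, the load is treated as constant, the modem is assumed to draw the same 5 W as a router (which is what the 4-hour figure implies), and battery aging and discharge limits are ignored.

```python
def ups_runtime_hours(battery_volts, battery_amp_hours, inverter_efficiency, load_watts):
    """Estimate runtime in hours for a constant load on a UPS battery."""
    if load_watts <= 0:
        raise ValueError("load_watts must be positive")
    # Usable energy = nominal battery energy reduced by inverter losses.
    usable_wh = battery_volts * battery_amp_hours * inverter_efficiency
    return usable_wh / load_watts


# One router at 5 W on a 12 V, 8 Ah battery with 0.85 inverter efficiency.
one_router = ups_runtime_hours(12, 8, 0.85, 5)

# Three nodes plus the modem, assuming each device draws 5 W (20 W total).
three_nodes_plus_modem = ups_runtime_hours(12, 8, 0.85, 4 * 5)

print(f"{one_router:.1f} h, {three_nodes_plus_modem:.1f} h")  # prints "16.3 h, 4.1 h"
```

Measure the real draw of your devices if you can, since the result scales directly with the load.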

I found out that many other people are experiencing the same problem. There's an issue in the LibreMesh packages GitHub repo, which I'm sharing here in case somebody runs into the same thing and wants more information.

I had ignored the MESH-SAE-AUTH-FAILURE error message when I saw it and didn't post it in this thread, because it didn't make any sense. Now that I see more people have the same issue, I understand I should have included it. Sorry for that!
