Wi-Fi mesh with TP-Link TGR1900 (Google OnHub)

Hi Everyone, I am really excited to be here and to use OpenWrt to set up my home network.
I have multiple TP-Link Google OnHub devices and have been able to flash them with OpenWrt; they come up and let me ssh in.
My goal now is to:

  1. Make one of them the router, connected to the ISP modem.
  2. Use a couple more as Wi-Fi APs. They can only connect to the router wirelessly for internet access (I have no Ethernet cabling and don't intend to install any).

From my reading what I need to do is:

  1. The base router node needs to have wireless and the DHCP server enabled. I also need to add an 802.11s interface on it to create the mesh.
  2. The other two need a similar Wi-Fi config, but with DHCP disabled. I have also assigned static IPs to them (192.168.1.50 and 192.168.1.51) so that I can ssh into them later.
  3. To get mesh networking to work, I have tried:
    • removing wpad-basic-mbedtls and installing wpad-openssl
    • installing mesh11sd
    • removing the -ct variants of the firmware and kmod packages and installing the non-ct ones.

After doing all of that I have had limited success (it works after a reboot but fails very soon after) and basically have an unusable setup. The problems I am seeing are:

  1. After some uptime I cannot connect to 192.168.1.1 (the base router address) to see the LuCI interface. Nor can I ssh in.
  2. I pretty much always lose internet access after 5-10 minutes.
  3. Even if I reboot the routers, I sometimes cannot get them to work (internet/LuCI).
  4. When I was able to see LuCI, I saw that the mesh setup was doing its thing, and at least 2 of the nodes seemed to have associations.

I have not yet tried to use the stock node for a long time but it seemed to work for a couple of hours before I went into this setup.
Images I am using:
https://firmware-selector.openwrt.org/?version=23.05.3&target=ipq806x%2Fchromium&id=tplink_onhub

I don't have very detailed networking knowledge or the knowhow as to how to debug this.
Any pointers are helpful!

Thanks

Yes there is a bug in the current version of mesh11sd in that it defaults to auto-config.

The new, bug-fixed, version v4.0.0 is almost ready for release. It might take a week or so to turn up in the OpenWrt feeds, but if you are willing you can try the beta version that has just come to the end of its testing....

Let me know if you want to try it and I will post a link to github for the installable .ipk

I can definitely try the beta version. Another thing I forgot to mention is that I also enabled 802.11r (roaming) when setting up Wi-Fi.

Also, to try anything out I need ssh to work, and it is not working at the moment, even after a reboot.

Waiting for the ipk link whenever you can send it. Thanks

You need to reply to me rather than just post otherwise I will not get a notification.

Did you get ssh working? It will be essential.

I have ssh after doing a reset. I have a script that I tried with 3.1.1, and that did not work. I can try 4.0.0 now with the same script if I have the ipk.

While I can ssh in and look at the logs, I see this error message constantly:

Wed May 15 04:32:02 2024 daemon.info mesh11sd[1974]: option auto_mesh_network [ lan ]
Wed May 15 04:32:02 2024 daemon.info mesh11sd[1974]: option auto_mesh_network [ lan ]
Wed May 15 04:32:02 2024 daemon.info mesh11sd[1974]: option mesh_basename [ m-11s- ]
Wed May 15 04:32:02 2024 daemon.info mesh11sd[1974]: option mesh_gate_enable [ 1 ]
Wed May 15 04:32:02 2024 daemon.info mesh11sd[1974]: option ssid_suffix_enable [ 0 ]
Wed May 15 04:32:12 2024 daemon.debug mesh11sd[1974]: interface phy2-mesh0 is up
Wed May 15 04:32:12 2024 daemon.err mesh11sd[1974]: Error getting mesh interface name
Wed May 15 04:32:13 2024 daemon.debug mesh11sd[1974]: checkinterval 10 seconds
Wed May 15 04:32:23 2024 daemon.debug mesh11sd[1974]: interface phy2-mesh0 is up
Wed May 15 04:32:23 2024 daemon.err mesh11sd[1974]: Error getting mesh interface name
Wed May 15 04:32:23 2024 daemon.debug mesh11sd[1974]: checkinterval 10 seconds
Wed May 15 04:32:33 2024 daemon.debug mesh11sd[1974]: interface phy2-mesh0 is up
Wed May 15 04:32:34 2024 daemon.err mesh11sd[1974]: Error getting mesh interface name
Wed May 15 04:32:34 2024 daemon.debug mesh11sd[1974]: checkinterval 10 seconds
Wed May 15 04:32:44 2024 daemon.debug mesh11sd[1974]: interface phy2-mesh0 is up
Wed May 15 04:32:44 2024 daemon.err mesh11sd[1974]: Error getting mesh interface name
Wed May 15 04:32:45 2024 daemon.debug mesh11sd[1974]: checkinterval 10 seconds
Wed May 15 04:32:55 2024 daemon.debug mesh11sd[1974]: interface phy2-mesh0 is up
Wed May 15 04:32:55 2024 daemon.err mesh11sd[1974]: Error getting mesh interface name
Wed May 15 04:32:55 2024 daemon.debug mesh11sd[1974]: checkinterval 10 seconds
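
To get a quick feel for how often this happens, the mesh11sd lines can be tallied by severity. This is only a rough sketch: it assumes the logread format shown above (severity in the sixth field, process name in the seventh), and the function name is made up for illustration.

```shell
#!/bin/sh
# Tally mesh11sd syslog lines (read from stdin) by facility.level.
# Assumes lines like:
#   Wed May 15 04:32:12 2024 daemon.err mesh11sd[1974]: Error getting mesh interface name
mesh11sd_summary() {
    awk '$7 ~ /^mesh11sd\[/ { n[$6]++ }
         END { for (lvl in n) print lvl, n[lvl] }' | sort
}

# On the router:
#   logread | mesh11sd_summary
```

A steadily growing daemon.err count with the interface reported as up would match the pattern in the log above.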

My config files:
/etc/config/wireless

config wifi-iface 'mesh0'
        option device 'radio2'
        option mode 'mesh'
        option encryption 'sae'
        option key 'redacted'
        option network 'lan'
        option mesh_id 'redacted'
        option disabled '0'

/etc/config/mesh11sd

config mesh11sd 'mesh_params'
        option mesh_rssi_threshold '-80'
        option mesh_fwding '1'

config mesh11sd 'setup'
        option ssid_suffix_enable '0'
        option auto_config '0'
        option debuglevel '3'

I also see this error in dmesg:

[   79.922043] ath10k_pci 0001:01:00.0: pdev param 0 not supported by firmware

How do I find out which params are supported by the firmware?

Sorry for the delay, it looks like we have a big time zone difference :smiley:

I will upload the latest beta ipk on github and post the link here.

You can prepare for testing by reverting back to the default firmware:

  1. ct drivers replaced with non-ct
  2. wpad-basic-mbedtls replaced by wpad-mesh-mbedtls (at least for testing purposes use the mbedtls version)
  3. install kmod-nft-bridge
  4. DO NOT INSTALL mesh11sd from OpenWrt feeds
  5. DO NOT do any wireless configuration, either from the command line or in LuCI (we want things as basic as possible for testing). This will leave all wireless disabled, as is normal for OpenWrt when first reflashed.
  6. Reflash all your TGR1900's in exactly the same way, making sure that each one in turn boots up in the same way. YES - all have the EXACT same config - Because mesh11sd will autoconfig for you. (you can do as much custom config as you like once we have finished basic testing).
  7. Wait for me to upload the ipk :wink:

DO NOT USE YOUR SCRIPT!

Make sure you have reverted to basics as I described in my last post.
Get the ipk by doing the following:

  1. Connect the first tgr1900's wan port to a lan port on your isp router.
  2. With your computer connected to the tgr1900 lan port, ssh into it.
  3. Do the following commands:
cd /tmp
wget https://github.com/openNDS/mesh11sd/raw/master/community/mesh11sd_4.0.0-beta-13_all.ipk
opkg install mesh11sd_4.0.0-beta-13_all.ipk; logread -f

You can then watch the syslog output and should see mesh11sd starting.
You can of course stop the output by doing ctrl/c.
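
If opkg rejects the file, it is worth checking that the download really is a package and not a web page saved under the same name. A small sketch, relying on OpenWrt .ipk files being gzipped tar archives that contain a control.tar.gz member; the function name is illustrative only.

```shell
#!/bin/sh
# Return 0 if the file looks like an OpenWrt .ipk
# (a gzipped tar archive with control.tar.gz inside), non-zero otherwise.
looks_like_ipk() {
    tar -tzf "$1" 2>/dev/null | grep -q 'control\.tar\.gz'
}

# On the router, before installing:
#   looks_like_ipk /tmp/mesh11sd_4.0.0-beta-13_all.ipk && echo "looks OK" || echo "not an ipk"
```

An HTML page fails the tar listing, so the check reports "not an ipk" in that case.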

Note: You will not be able to see the autoconfig results in Luci as it is done dynamically every time mesh11sd starts. It is not even stored in /etc/config/wireless.
You can view it by doing:
uci show wireless
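
That output covers every radio and interface. A small filter can pick out just the mesh interfaces and whether each one is enabled. This is a sketch that assumes the usual uci show line format (wireless.<section>.<option>='<value>'); the function name is illustrative and the section names depend on what mesh11sd generates.

```shell
#!/bin/sh
# From "uci show wireless" output on stdin, list sections with mode 'mesh'
# and report whether each is enabled (disabled is missing or not '1') or disabled.
mesh_ifaces() {
    awk -F"[.=']+" '
        $3 == "mode" && $4 == "mesh" { mesh[$2] = 1 }
        $3 == "disabled"             { dis[$2] = $4 }
        END { for (s in mesh) print s, (dis[s] == "1" ? "disabled" : "enabled") }
    ' | sort
}

# On the router:
#   uci show wireless | mesh_ifaces
```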

You can stop mesh11sd by doing:
service mesh11sd stop
and start it again with:
service mesh11sd start

With it stopped, do the following:

uci set mesh11sd.setup.auto_config='1'
uci set mesh11sd.setup.debuglevel='3'
uci commit mesh11sd

Now start it up again:
service mesh11sd start; logread -f

If all looks good we can test a wireless connection.

Try connecting your phone via wifi.
The default ssid will be "OpenWrt-2g-xxxx" with no encryption yet (that will be one of the next things to do if we get this far)

Once we have done a couple more things, we will get on to adding another tgr1900 to the mesh.

EDIT: Sorry I made a few edits in there :wink:

Can it be tested on a Xiaomi AX3600, or is the update device dependent?

I am still on 3.1.1.

It is architecture independent so will run on any device.

Thanks for your help!
Both the tests seem to work and I can connect to Openwrt-5g-xxxx and internet works.

fwiw I had also gotten my mesh to work with 3.1.1 with auto_config disabled and the below configs:
wireless

config wifi-iface 'mesh0'
        option device 'radio2'
        option mode 'mesh'
        option encryption 'sae'
        option key 'redacted'
        option network 'lan'
        option mesh_id 'redacted'
        option disabled '0'

mesh

config mesh11sd 'setup'
        option ssid_suffix_enable '0'
        option auto_config '0'
        option debuglevel '3'

But when I then tried setting auto_config to '1' in the above config, I bricked all my nodes and had to reset/reinstall everything to get them back up.

Now, my goal is:

  1. use auto_config for the mesh. I see that by default it is using encryption so that is good enough for me.
  2. I saw in uci show wireless that it has created a mesh iface on all three radios on my nodes (the tgr1900 has 3!). I don't know whether that is overkill, will use more power, or will make other traffic laggy.
  3. I definitely want to set ssid_suffix_enable=0, as I need the Wi-Fi SSIDs to be the same on all nodes.
  4. I want to configure the SSID on all nodes for Wi-Fi access.
  5. I also want to create a guest SSID on all nodes that is isolated from the LAN but connected to the internet. I don't know whether that network will be able to reach the internet over the mesh that was just created, or whether something else is required.
  6. I really want to be able to access LuCI and ssh into all nodes in the mesh. I don't know whether I can do that without setting a well-known IP address. That is what I was doing until now, but it does not work with auto_config enabled.

You said initially you had 3 tgr1900's.
Have you configured the other two using exactly the same process as the first one?

Power down the first, then do the second.
Power down the second, then do the third.

Finally connect one of them, its wan port to a lan port on your isp router, just like when you flashed it.

Power up the second with no ethernet connection in its desired location.

Do the same with the third.

Go back to the first and ssh into it and do logread -f
You should see the other two connected, although it may take a few minutes.
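
Besides the log, the peer links can be counted from the output of iw's station dump. A minimal sketch, assuming the dump prints a "mesh plink:" line per station with ESTAB for an established link; the function name is illustrative, and the interface name in the usage line is a guess based on the m-11s- basename in the log earlier in the thread.

```shell
#!/bin/sh
# Count mesh peers with an established peer link.
# Reads "iw dev <ifname> station dump" output on stdin.
count_mesh_peers() {
    grep -c 'mesh plink:.*ESTAB'
}

# On the first node (check the real interface name with "iw dev"):
#   iw dev m-11s-0 station dump | count_mesh_peers
```

With three nodes, the expectation would be up to 2 established peers on each node once the mesh has settled.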

This is only on the mesh interface. You need to configure access point encryption.
Yes, you can turn off the SSID suffix.
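
A minimal sketch of turning the suffix off, using the ssid_suffix_enable option from the config shown earlier in the thread. It only prints the commands unless RUN is set to an empty string, so it is safe to try first. The function name is illustrative, and the restart at the end is an assumption (it may not be strictly needed).

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# On the node, run with RUN= (empty) to apply them for real.
RUN="${RUN-echo}"

disable_ssid_suffix() {
    $RUN uci set mesh11sd.setup.ssid_suffix_enable=0
    $RUN uci commit mesh11sd
    $RUN /etc/init.d/mesh11sd restart   # assumed: restart so the change is picked up
}

disable_ssid_suffix
```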

Yes, but only one is enabled, by default it is the 2GHz radio. This is the default because generally it gives better range and a more stable mesh. 5GHz has a shorter range and can be subject to DFS interruptions. But it is up to you if you want to change it.

That is difficult using a wireless link of any kind and requires some form of tunnel setup. It can be done, but is out of the scope of this thread.

Yes this is easy. Nothing special needs to be set up. We can go into this once you actually have a mesh running.

Let me know when the mesh is running, or if you have problems with it. Then we can start customising.

The mesh is running!

I am seeing this in log messages

gw_ip=192.168.0.1..
Wed May 15 20:34:12 2024 daemon.debug mesh11sd[1855]: Leaving check portal....
Wed May 15 20:34:13 2024 daemon.info mesh11sd[1855]: Path to station [ ] is stable
Wed May 15 20:34:13 2024 daemon.warn mesh11sd[1855]: Warning! Station [  ] is an immediate neighbour, but has had [ 5 ] path_change(s) detected
Wed May 15 20:34:13 2024 daemon.warn mesh11sd[1855]: Warning! Station [  ], consider its location or adjust txpower and/or rssi_threshold
Wed May 15 20:34:13 2024 daemon.debug mesh11sd[1855]: checkinterval 10 seconds
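
The warning lines carry a per-station counter, so pulling them out of the log over time shows whether the count keeps climbing. A rough sketch that assumes the exact message wording shown above; the function name is illustrative.

```shell
#!/bin/sh
# From syslog text on stdin, print "<station> <path_change count>" for each
# mesh11sd path_change warning. If the station address is blank (as in the
# redacted log above), the first column is simply empty.
path_change_counts() {
    sed -n 's/.*Station \[ *\([^] ]*\) *\].*has had \[ *\([0-9][0-9]*\) *\] path_change.*/\1 \2/p'
}

# On the router:
#   logread | path_change_counts
```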

Does that mean that I have too many stations and don't need 3? Are 2 enough?

Also, what's the best way to disable ssid_suffix_enable now that auto_config has run at boot?

Also, what about accessing LuCI/ssh? I wanted to do some SSID configuration, make all the nodes advertise encrypted networks with the same SSIDs, and enable 802.11r on all of them.

I tried to play with the Wi-Fi configuration via LuCI by disabling the 2 GHz mesh point and enabling the 5 GHz one configured for radio2. I also renamed the 2/5 GHz non-mesh Wi-Fi networks to a name I like. I have an extra radio and I want to use that for the mesh rather than the default radio0. Once I did that and rebooted, I got nowhere and can't even ssh into any node. Even if I run only one node at a time, I still can't access it over Ethernet or Wi-Fi.
Is renaming/disabling things interfering with the mesh11sd setup? Does it depend on what is enabled or disabled, and does auto_config mode do things differently?

Your enthusiasm is good but timezone induced impatience is not so much :wink:

You will have to try to get back to where you were when waiting for my reply.

Note: I am trying to take you through it a step at a time, but the best way is to use the Firmware selector to do it all for you. We can go back to that later, unless you want to jump straight to the final config without any insight into how it all fits together.

I agree. I want to get this done sooner rather than later :smile:
Give me 15 minutes and I will reset all the nodes and get them working with zero config.

Possibly. But there are numerous things that can be done to stabilise the link.

We really need to think in 3 dimensions, but some quick 1 dimensional diagrams might help.

Let's say we have node1, node2 and node3.

First we have just node1 and node2 in different rooms:

node1 .................... node2

Both boot up and join the mesh, all is fine. Both nodes can see each other and communicate on the mesh.

Now we add node3, putting it between node1 and node2:

node1 .......... node3 ......... node2

Node3 can see both node1 and node2 and the signal strength is high because all 3 are within each other's coverage area.

node1 can get to node2 directly but also indirectly via node3.
Now this is fine in a simple case, but remember we are talking about microwave propagation and microwaves can be reflected off all sorts of things (that is how radar works).
Even very small changes in the myriad of reflections in a building can have a large effect, resulting in the mesh seeing each potential path from node1 to node2 as being the best at any instant. This leads to hop count changes. This is fine for layer 2 frames, but is potentially a disaster for TCP traffic carried over the mesh.
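
Which path the mesh is using at a given moment can be read from iw's mpath dump. The sketch below marks each destination as direct or relayed. It assumes one header line followed by rows whose first two columns are the destination and next-hop addresses; the function name is illustrative and the interface name in the usage line is a guess.

```shell
#!/bin/sh
# Classify mesh paths as direct or relayed.
# Reads "iw dev <ifname> mpath dump" output on stdin:
# column 1 = destination MAC, column 2 = next hop MAC, first line = header.
classify_paths() {
    awk 'NR > 1 && NF >= 2 {
        if ($1 == $2) print $1, "direct"
        else          print $1, "via", $2
    }'
}

# On node1 (check the real interface name with "iw dev"):
#   iw dev m-11s-0 mpath dump | classify_paths
```

Running it a few times in a row on node1 would show whether the path to node2 keeps flipping between direct and via node3.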

mesh11sd has built-in tools we can use for optimising the infrastructure of the mesh backhaul.

Thanks for explaining it here. That's a good enough reason to just let mesh11sd configure whatever it can automatically. I would rather not have any manual config, but Wi-Fi SSID names are something I need.

Btw, I have the mesh going again with vanilla everything except auto_config=1 and SSID rename off:

Thu May 16 05:57:52 2024 daemon.debug mesh11sd[1859]: Leaving check portal....
Thu May 16 05:57:53 2024 daemon.info mesh11sd[1859]: Path to station [  ] is stable
Thu May 16 05:57:53 2024 daemon.info mesh11sd[1859]: Path to station [  ] is stable
Thu May 16 05:57:53 2024 daemon.debug mesh11sd[1859]: checkinterval 10 seconds

So what is the next step?

Yes we can do that. But:

  1. 2.4GHz gives the best mesh coverage/penetration (but is not necessarily the fastest). mesh11sd uses 2.4GHz by default.
  2. You can still use 2.4GHz for users, as well as the mesh. Yes there are some compromises, but overall, at least for normal domestic use, this is the best compromise.

We should stay with 2.4GHz mesh for now, while we are testing and try 5GHz later as a comparison, then you can decide which to use in your use case.
