How to manage a fleet of 100 routers?

I'm currently trying to manage a fleet of 10-100 routers: I collect data from them and then potentially push individual commands to them to perform certain actions.

I started by building a REST API in Go that ran on each device, and it worked fine for the most part on a few devices. But I suspected this (probably?) doesn't scale very well, so I looked for another approach, which led me to MQTT. Pub/sub works great for aggregating data, so I tried it anyway, and for getting data from the routers it worked wonders. But pushing commands and expecting responses became really messy really quickly.

So I wanted to know: was I on the right lines with any of these ideas? I'm tempted to try gRPC but I'm not sure if that's a bad idea. Or is REST or MQTT my best option so far? (Also, I tried OpenWisp2 and really liked it; it's just that they don't have anything in Go lol)

Also, if this is the wrong place to be asking, let me know and I'll move the message to a better place.

Managing devices is more or less a solved problem. Before rolling your own solution, you should look at configuration managers first, like Ansible, SaltStack, Puppet, etc.

Ansible works reasonably well on OpenWrt and gives you some of the capabilities you want.

SaltStack will probably require a proxy to run but will give you a lot of control.

If you are going to roll your own solution MQTT seems fine. RabbitMQ with an MQTT connector would give you a lot of control over messages and events, if you need that functionality.
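
The messy part mentioned in the question (pushing a command over pub/sub and waiting for its reply) is usually handled by tagging each command with a correlation ID that the device echoes back. Below is a minimal sketch of that idea in Go. It deliberately uses no MQTT library: the `publish` function stands in for a real publish call, and `Deliver` is what a subscriber callback on a response topic would invoke. The `Command`, `Response` and `Correlator` names are made up for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// Command is what the controller publishes to a router's command topic.
type Command struct {
	ID     string // correlation ID, echoed back in the Response
	Action string
}

// Response is what a router publishes back on a response topic.
type Response struct {
	ID     string
	Output string
}

// Correlator matches incoming responses to pending commands by ID.
type Correlator struct {
	mu      sync.Mutex
	next    int
	pending map[string]chan Response
}

func NewCorrelator() *Correlator {
	return &Correlator{pending: make(map[string]chan Response)}
}

// Send registers a pending command, hands it to publish (a stand-in for a
// real MQTT publish), and waits for the matching response or a timeout.
func (c *Correlator) Send(action string, publish func(Command), timeout time.Duration) (Response, error) {
	c.mu.Lock()
	c.next++
	id := fmt.Sprintf("cmd-%d", c.next)
	ch := make(chan Response, 1) // buffered so Deliver never blocks
	c.pending[id] = ch
	c.mu.Unlock()

	// Always forget the command, whether it was answered or timed out.
	defer func() {
		c.mu.Lock()
		delete(c.pending, id)
		c.mu.Unlock()
	}()

	publish(Command{ID: id, Action: action})

	select {
	case r := <-ch:
		return r, nil
	case <-time.After(timeout):
		return Response{}, errors.New("timed out waiting for response to " + id)
	}
}

// Deliver routes a response to the waiting Send call. It reports false for
// unknown, late, or duplicate responses, which are simply dropped.
func (c *Correlator) Deliver(r Response) bool {
	c.mu.Lock()
	ch, ok := c.pending[r.ID]
	delete(c.pending, r.ID)
	c.mu.Unlock()
	if !ok {
		return false
	}
	ch <- r
	return true
}

func main() {
	c := NewCorrelator()
	// Fake router: replies asynchronously, echoing the correlation ID.
	publish := func(cmd Command) {
		go c.Deliver(Response{ID: cmd.ID, Output: "ran " + cmd.Action})
	}
	r, err := c.Send("uptime", publish, time.Second)
	fmt.Println(r.ID, r.Output, err)
}
```

The timeout matters as much as the ID: a router that is offline never answers, so every pending command needs a way to give up. MQTT 5 also defines Response Topic and Correlation Data properties for this kind of request/response exchange, if your broker and client library support them.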


These would be great for configuration management, but what I'm after might be a little different: something to collate data from the routers and occasionally run commands (which again return some data). That's why I thought a REST API would be good: hitting an endpoint on any of the routers would perform some action and return some data.
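
For illustration, here is a rough sketch of what such a per-router endpoint could look like in Go, using only the standard library. The `/actions/` path, the `actions` allowlist and the two example actions are assumptions made up for this sketch, not anything from an existing project. The handler is exercised in-process with `httptest` so the example runs without opening a port.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"os"
	"strings"
)

var errUnknownAction = errors.New("unknown action")

// actions is a hypothetical allowlist: the API only runs named actions,
// never arbitrary input from the request.
var actions = map[string]func() (string, error){
	"ping":     func() (string, error) { return "pong", nil },
	"hostname": os.Hostname,
}

// runAction looks up an action by name and returns its output.
func runAction(name string) (string, error) {
	fn, ok := actions[name]
	if !ok {
		return "", errUnknownAction
	}
	return fn()
}

// actionHandler serves POST /actions/<name> and returns the output as JSON.
func actionHandler(w http.ResponseWriter, r *http.Request) {
	if r.Method != http.MethodPost {
		http.Error(w, "use POST", http.StatusMethodNotAllowed)
		return
	}
	name := strings.TrimPrefix(r.URL.Path, "/actions/")
	out, err := runAction(name)
	if errors.Is(err, errUnknownAction) {
		http.Error(w, err.Error(), http.StatusNotFound)
		return
	}
	if err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return
	}
	w.Header().Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(map[string]string{"action": name, "output": out})
}

func main() {
	mux := http.NewServeMux()
	mux.HandleFunc("/actions/", actionHandler)

	// Exercise the handler in-process. On a router you would instead pass
	// this mux to http.ListenAndServe, ideally behind TLS and some auth.
	req := httptest.NewRequest(http.MethodPost, "/actions/ping", nil)
	rec := httptest.NewRecorder()
	mux.ServeHTTP(rec, req)
	fmt.Println(rec.Code, strings.TrimSpace(rec.Body.String()))
}
```

One trade-off with this design is that the controller has to be able to reach every router directly, which gets awkward behind NAT. That is one reason device-initiated connections (MQTT, or a long-lived gRPC stream) are often preferred as a fleet grows.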

In OpenWISP we're about to add the ability to send commands to devices with this pull request (it has a REST API too).

There are additional modules for collecting monitoring information and performing firmware upgrades.

It's not in Go as you noted, but in Python. Why would this be a problem?

Building and maintaining such a system is not trivial, so unless you have really special requirements and enough time or budget to build and maintain your own solution, I would suggest sticking to an existing one. If you don't want to go with OpenWISP, you can use tools like InfluxDB, Grafana, Prometheus and a bit of Ansible to execute commands.

OpenWISP was built more for use cases in which users need to repeat the same simple operations over a large number of devices. The other tools are more generic and suited to those who have good network engineering skills and can build a tailored solution on their own, which may be your case.