Geoblocking project which may work its way into OpenWRT

Hi there!
I've been working on a geoblocking solution for my own Debian server. It's a suite of bash scripts, still under development, but the features already implemented work well, efficiently and reliably. The code is, of course, open source.

A few days ago I decided to test it on my OpenWRT router and, surprisingly, after some tinkering it worked. It just required upgrading a few packages to their "fuller" versions, installing bash, and some minor code changes to work around the limitations of other packages.

There are some limitations left which should be relatively easy to fix, detailed here:

(some of them may have already been solved in the newer OpenWRT releases?)

I'm not well acquainted with OpenWRT internals, so I'd like to hear from more experienced users or devs whether this is needed and feasible, and to get some tips for overcoming the issues (in as much detail as possible, please). The description of the project is on the main page:


This is how a status report looks currently.

Is there a reason you decided to forgo banip and dig up an old package?

The scripts you've linked rely on iptables, whereas OpenWrt has used nftables for the last two stable releases. I wouldn't invest much time into it unless:

  1. It's different enough from what banip provides
  2. There's an effort to rewrite to nftables

If you need GeoIP filtering, just use banIP:
https://openwrt.org/docs/guide-user/services/banip#blocking_countries

Otherwise, there's already a tested solution that can be combined with custom rules:
https://openwrt.org/docs/guide-user/advanced/ipset_extras

Also, you probably know this already: geo-blocking only works if you have unambiguous IP-to-country/region mappings. In theory every device is only ever at one location at a time, but IP addresses are re-used. Sometimes such re-use is irrelevant: whether my location gets reported as Hamburg or Frankfurt does not matter if the resolution you need is by country. But if you aim for "federal state" resolution, this would already be wrong (Hamburg is its own state, and Frankfurt is in Hesse). For the record, these are the places different geoIP providers locate me at, but I actually live in a third federal state.
That said, even incorrect geoIP locations can be useful, just keep in mind that they are approximations, not ground-truths.

I didn't even think of OpenWRT when starting this project. I just wanted to make a geoip solution for my Debian server (and I hadn't even heard of BanIp, which is OpenWRT-specific, btw). It accidentally turned out that, because it's Bash, it also (mostly) works on OpenWRT, but this was an afterthought.

That said, I'm not acquainted with BanIp but now looking at its readme, I couldn't figure out how to actually make it do basic geoblocking, or what features it offers for that. Where does it download the ip lists from? Does it validate them? Does it automatically update them? Does it provide a useful overview of geoip blocking configuration to the user? Does it provide an easy and intuitive interface to add or remove country codes? I did see that it's incredibly complex and does a million things. My project, on the other hand, is really easy and painless to figure out and set up, specifically does only one thing, and does it really well (i.e. it automatically updates the lists at a pre-set schedule, goes a long way to make sure that corrupted or incomplete lists don't get applied to the firewall, provides an easy interface to view current geoblocking config, and to add/remove countries). Don't you think an alternative like this may be useful?

The above also addresses the first part of @vgaetera's comment mentioning BanIp. Regarding ipset_extras, correct me if I'm wrong, but it seems that it doesn't provide autoupdate functionality and that it relies on a 3rd party for fetching ip lists (specifically ipdeny.com), while my project fetches the lists exclusively from the official regional registries. I'm also not sure whether it validates the downloaded lists.

Now I do understand a suspicious attitude towards someone unknown to the community (although I did write a couple of posts here a few years ago) who offers code that (if malicious or poorly written) could cause damage. But then, it's open source. Anyone can go through the code and see exactly what it does.

Regarding nftables support, I am planning to add support for it in the near future, which is even mentioned in the first line of the description on GitHub. The functionality is also lacking ipv6 support currently, and that will get implemented as well.

At this point I'm not offering to include this in the distribution. I'm just asking - would it be useful if the abovementioned features got implemented? Would people want it? If yes, I will keep OpenWRT in mind while continuing the development. If no then I won't. (And if yes then I wouldn't mind some tips from experienced OpenWRT devs about integrating this project with OpenWRT)

My project only works at a country resolution. That said, even at this resolution the ip lists are not a static target: they get updated regularly by the registries. That's why I implemented (optional, enabled by default) automatic updates of the ip lists.

Well, these still are imprecise, and you essentially trust some list you download to make important decisions for you. Can well be a useful tool, but users should be aware of the potential side-effects.

I may be wrong here but as far as I know, the official regional Internet registries are the 1st party in the matter of assigning scopes of ip addresses to bodies in each country. I.e. they do the assignment, and that gets propagated down to everyone else using the Internet. So I'm not sure how this could be imprecise. That's part of the reason why I decided to use them and not a 3rd party (like ipdeny dot com) as the source for the ip lists, even though notably using pre-compiled lists from one 3rd party is much easier to implement than what I've done, which is automatically picking the correct registry for countries, and implementing (currently only 2 but vastly different) mechanisms for fetching and parsing their lists.

[The lists from RIPE come in JSON format, and you get a list for the specific country you requested, while ARIN provides only one list for all countries covered by the registry, in a completely different plain-text format. I had to implement a custom solution to parse it. In fact, that solution got implemented 3 times, with each iteration working a few times faster than the previous one. You can check out the -fetch script in my github repository to see how this is currently working.]
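
For readers who haven't seen the plain-text format, here is a minimal sketch of turning such a file into CIDR subnets for one country. It is written in Python for brevity and is not the project's bash code. It assumes the pipe-separated RIR statistics format (`registry|cc|type|start|value|date|status`), where `value` for ipv4 records is an address count; the function name and the sample records are made up for illustration.

```python
import ipaddress


def delegated_ipv4_to_cidrs(lines, country):
    """Yield CIDR strings for `country` from RIR delegated-stats lines."""
    for line in lines:
        fields = line.strip().split("|")
        # Data records have at least 7 fields; comments, the version line
        # and summary lines fail one of the checks below and are skipped.
        if len(fields) < 7 or fields[2] != "ipv4" or fields[1] != country:
            continue
        if fields[6] not in ("allocated", "assigned"):
            continue  # skip space that is not handed out to anyone
        first = ipaddress.IPv4Address(fields[3])
        last = first + (int(fields[4]) - 1)  # value = number of addresses
        # The count is not always a power of two, so one record may
        # expand into several CIDR blocks.
        for net in ipaddress.summarize_address_range(first, last):
            yield str(net)


# Made-up sample records using documentation address ranges.
SAMPLE = [
    "# comment line",
    "arin|*|ipv4|*|3|summary",
    "arin|US|ipv4|192.0.2.0|256|20200101|allocated|hypothetical-id",
    "arin|US|ipv4|198.51.100.0|768|20200101|assigned|hypothetical-id",
    "arin|CA|ipv4|203.0.113.0|256|20200101|allocated|hypothetical-id",
    "arin||ipv4|203.0.113.0|256||available|",
]

for cidr in delegated_ipv4_to_cidrs(SAMPLE, "US"):
    print(cidr)
```

Note that the 768-address record cannot be expressed as a single subnet, so it comes out as a /23 plus a /24.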

Yes, you are wrong... :wink: An RIR knows whom it handed out IP address ranges to; it does not know where these are actually used... Also, in the EU RIPE is the relevant registry, and if, say, Deutsche Telekom gets an IP range, it might use it for any of its European subsidiaries, as far as I know...

Because networks/AS simply do not follow country boundaries...

Really all geoIP is a heuristic, which, and let me repeat that, can be useful under specific circumstances. It is not a ground truth about the location of the computer using a specific IP address...

So, I am not in any way opposed to your project and consider it pretty cool that you went the extra mile to make this work under OpenWrt, I just want every potential user to be aware of the limits of geoIP in general.

Fair enough. I didn't know that ip ranges get misused in the way that you described. And thank you for that information. I haven't noticed any actual inconsistencies between the data from the registries and the situation on the ground, but then I obviously haven't tested every individual subnet and where it is physically being used. I believe these situations should be pretty rare but it's useful to know anyway.

It provides the following features to populate IP sets:

  • Support for DNS, CIDR, ASN, GeoIP.
  • Support for both IPv4 and IPv6.
  • Support for nftables/fw4.
  • Automatic updates with crontab.
  • UCI configuration and integration with firewall.
  • Flash wear leveling.

Keeping it lightweight is the point.

This is delegated to the firewall service:
https://github.com/openwrt/firewall4/blob/master/root/usr/share/ucode/fw4.uc#L1838

I can't really test banip on my router since it's running an older version of OpenWRT which banip is incompatible with. However, looking again at the banip readme, I still can't figure out how to do geoblocking. In fact, the sequence "geo" shows up in the readme exactly 1 time, which is where they say that they support geoip blocking. But how is this done, and how does one configure it? That I can't figure out from the readme. Maybe it's all in the Luci interface which I can't access for the abovementioned reason.

As to keeping it lightweight, I do see that it requires a minimum of 256MB RAM, while my project happily works with my 128MB RAM, and there's a lot of memory to spare even if I configure a blacklist/whitelist for many, many countries. The one thing I can think of that favors relying on a 3rd party for ip lists is that the lists don't need to be parsed (?). So that is a plus in the efficiency department. In my implementation, a huge list, say for the US, which is fetched from ARIN (which notably spits out a huge list for all countries under its wing, with a substantial percentage of lines that are not even assigned to anyone) needs to be parsed, and parsing that particular list takes about 15 seconds on my 11-year-old router. On my (similarly aged) x86 machine, parsing takes about 0.3 seconds. Smaller lists are much faster, of course. Assuming you are blacklisting or whitelisting a huge number of countries (which is probably not the typical way people do geoblocking), parsing on a very slow CPU like the one in my router may take a minute. So that's a minus for my implementation. Now, does it really matter if it's done once at installation and then once a week, at 4am local time when the user is probably sleeping? I'd say no, especially considering that the pre-conditions are very specific (old CPU, specifically ARIN countries, and a large number of them).

[Plus, now thinking about it, I understand that parsing can be actually sped up significantly when parsing multiple ARIN countries, so it should eventually work much faster in this scenario and I'll implement this]
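
One way the speed-up for multiple countries could work is to walk the big file once and sort the records by country on the way, instead of scanning it once per country. The sketch below is in Python and is only an assumption about the approach, not the project's bash code; the function name and the sample records are hypothetical, and the pipe-separated `registry|cc|type|start|value|date|status` layout is assumed.

```python
from collections import defaultdict


def group_ipv4_records(lines, countries):
    """Single pass: collect (start, count) records for every wanted country."""
    wanted = set(countries)
    records = defaultdict(list)
    for line in lines:
        fields = line.strip().split("|")
        # Keep only ipv4 data records that belong to a requested country.
        if len(fields) >= 7 and fields[2] == "ipv4" and fields[1] in wanted:
            records[fields[1]].append((fields[3], int(fields[4])))
    return dict(records)


# Made-up sample records using documentation address ranges.
SAMPLE = [
    "arin|US|ipv4|192.0.2.0|256|20200101|allocated|hypothetical-id",
    "arin|CA|ipv4|203.0.113.0|256|20200101|allocated|hypothetical-id",
    "arin|MX|ipv4|198.51.100.0|256|20200101|allocated|hypothetical-id",
]

print(group_ipv4_records(SAMPLE, ["US", "CA"]))
```

With this layout the cost of reading the file is paid once, no matter how many countries are requested.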

So which solution is more lightweight? I'd argue it's the one that doesn't use as much memory, while taxing your CPU for a minute once a week. But it's a choice that a user could make, provided that the user has a choice.

On the other hand, if speaking of reliability, you'll have a hard time convincing me that relying on a 3rd party is better in that department than relying on official regional registry. The registries are going nowhere for any foreseeable future, and they must stay online and available, and provide accurate data. The 3rd-party can one day disappear or switch to a paid format, or screw up something in their code and start spitting out incorrect data.

[And we still haven't even addressed the fact that the 3rd party might sell the data it collects from you, for example your IP, therefore undermining your security]

Also - I can't really figure out quickly how the validation is done "in the firewall" (the link you provided seems to point to some part of the OpenWRT internals rather than to the firewall utility, which is nftables, but notably I didn't read the code all the way). But let me ask this: what happens if the 3rd party you are relying on screws up something in the code, and the updated list you fetched contains, say, half the valid subnet count compared to the previous one? Does it get applied to the firewall and your geoip blocking is now screwed up? My solution safeguards against this, even though I am relying on an arguably more trustworthy source.
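
To illustrate the kind of safeguard meant here, a minimal sketch in Python (not the project's actual bash check): refuse a freshly fetched list if its subnet count dropped too far below the count of the list currently in use. The function name and the 0.9 threshold are assumptions chosen for the example.

```python
def is_list_plausible(new_count, old_count, min_ratio=0.9):
    """Return True if the new list is not suspiciously smaller than the old one."""
    if new_count <= 0:
        return False  # an empty list is never applied
    if old_count <= 0:
        return True  # first fetch, nothing to compare against
    return new_count >= old_count * min_ratio


# A list that lost half of its subnets is rejected, a small change is accepted.
print(is_list_plausible(500, 1000))
print(is_list_plausible(980, 1000))
```

In a setup like this, a rejected list would simply leave the existing firewall sets untouched until the next scheduled update.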

Anyway, one other thing that you haven't addressed in this discussion is ease of use. I looked up some pictures of the banip Luci interface. While I haven't found pictures of some of the tabs, I did get the impression (again) that it's a complex piece of software that serves many different purposes, and so has a lot of options that are completely irrelevant to geoip and would be quite challenging for an average user to understand. Now I've also looked at the code more closely, and for what it does, it seems quite impressive. However, I don't get the impression that ease of use (or good documentation) is its strength for someone who (for example) just wants to whitelist their country and block everything else, which my project does much better in comparison. I don't have any Luci interface, of course, at least not now, but consider this: to accomplish the abovementioned task, all you have to do with my solution is type in the command bash geoblocker-bash-install -c <country> -m whitelist. That's it. You (as an average user) don't need to tinker with any irrelevant settings, or to figure them out, or how they relate to each other. You (as an advanced user) can still tweak the settings through more advanced options. On top of that, regardless of whether you are an advanced user or not, you get plenty of documentation that covers only the relevant things, but covers them well.

adblock or banip (basically) don't need 'any' RAM for themselves to function; what does need RAM are the blocklists themselves (and those can get excessive in size really quickly). That's why the minimum reasonable system requirements are specified as such.

This is partially correct but not completely correct. Looking at the banip code, I can see that it creates quite a lot of temporary files for list processing. On OpenWRT, tmpfs is on a ramdisk, so that takes up RAM. Then it also loads the contents of some of these files into variables, so that also takes RAM. If it takes care to destroy the temp files and the variables immediately when they are no longer needed, then perhaps it's doing its best in that department. If it doesn't, then possibly it has multiple copies of the same data taking up RAM (while processing the lists, not permanently, but this doesn't matter for the minimum RAM requirements). I can't tell without analyzing a few hundred lines of code (which I'm not going to do). I know that I paid special attention to this while coding, so no RAM is wasted at any point. Besides, I'm not sure how nftables lists work in this regard, but I can tell that with iptables, when creating an ipset, you can set a couple of parameters for that set to trade off performance and memory consumption. The default parameters are generic and thus suboptimal. I suspect that it's the same with nftables, since they both rely on the same kernel framework. Again, my code calculates optimized parameters for each ipset it creates. Whether banip does the same, I don't know.
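
For context, the two ipset parameters in question are `hashsize` (initial hash size, which must be a power of two) and `maxelem` (maximum number of elements). The Python sketch below shows one possible heuristic for deriving them from the number of subnets in a list. The heuristic, the function names, and the set name are assumptions for illustration, not necessarily what the project computes.

```python
def next_power_of_two(n):
    """Smallest power of two that is >= n (returns 1 for n <= 1)."""
    return 1 << max(n - 1, 0).bit_length()


def ipset_create_command(name, element_count):
    """Build an `ipset create` command sized for a known number of subnets."""
    # Assumed heuristic: size the hash to the element count and cap maxelem
    # at exactly that count, since the set is rebuilt on every list update.
    hashsize = next_power_of_two(element_count)
    return (
        f"ipset create {name} hash:net family inet "
        f"hashsize {hashsize} maxelem {element_count}"
    )


print(ipset_create_command("geo_us", 60000))
```

A smaller `hashsize` would save memory at the cost of slower lookups, so the right trade-off depends on the device.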

Overall, I'm getting the impression that people here are not really welcoming an alternative solution that focuses specifically on geoip blocking, because "why not just use banip". I have nothing against banip but I don't really understand why an alternative solution with a different approach, giving the user a choice, is unwelcome. As mentioned in the post, personally I don't really need this project in OpenWRT. I just saw in my testing that it has a good potential to work on it and that got me excited, since I'm an OpenWRT user myself and I appreciate this project. I'll continue following this thread to see if there are other opinions and then we'll see whether this is viable or not.

It seems you misunderstood my previous comment; it was about ipset_extras, although it appears that most of it also applies to banIP.

This is linked in my first comment.

According to my testing results, this is also correct for both ipset_extras and banIP when limited to GeoIP blocking, and the overhead against fw4/nftables seems negligible.

That amount of RAM is necessary for use cases not limited to GeoIP blocking.

It is not just hardware resources but also technical debt, since the amount of necessary support is often proportional to the amount of code.

I agree that running your own service is more reliable, but there are often so many third-party services involved, including DNS, NTP, DDNS, IPv6, VPN, and other providers, that adding one more does not make much difference.

It should provide syntax validation against the type and family of the nftables sets. However, the goal of ipset_extras is only to create and populate IP sets, while logical analysis, including possible whitelisting, is delegated to whoever creates the firewall rules that use the IP sets.

This is possible, provided that your project is focused on GeoIP blocking. On the other hand, ipset_extras and BanIP have a broader scope of application.

That's a misunderstanding. Think of it as a comparison with existing solutions that offers each one an opportunity for improvement.

I'll quote your 1st comment in this thread:

If you need GeoIP filtering, just use banIP etc.

What I'm reading is "We already have something better in OpenWRT". This did not sound like "This project could be good for the GeoIP blocking specifically, and here's a starting point to integrate it when you have nftables and ipv6 support".

This is true. And also: an x86 CPU has a broader scope of application than a RISC CPU. So why do we even need RISC CPUs in our routers? Well, apparently a broader scope is not always better. It all depends on what you (as a user) actually need.