OpenWrt Forum Archive

Topic: Proxying Google's tentacles?

The content of this topic has been archived on 2 May 2018. There are no obvious gaps in this topic, but there may still be some posts missing at the end.

I'm currently living in China and I'm looking for some help with finding a solution to de-google my internet. I'm not very well versed in OpenWrt, so I'm not 100% sure this can be accomplished (or if anyone has done it already).

I've noticed over the past couple of years living here that more and more websites outside of the firewall load stuff from Google (analytics/fonts/ajax/AMP etc.) and it's leaving the internet more and more broken for us here. Some of this I can just block (like analytics and ads), but a lot of it is necessary and, more importantly, could be loaded from alternate websites. I guess the real issue is that webmasters are not adding JS fallbacks to have resources loaded from somewhere other than Google (or hosting the resources themselves). So I'm looking for some solution, ideally on an OpenWrt router, that will redirect the javascript/fonts requests away from Google to some alternative hosts (a local cache would also be fantastic - but I don't want to ask for too much :) ).

I know I probably don't have a very popular use case, but I'd have thought someone would have done it for privacy reasons by now as well.

PS: I know I can "just use a VPN", which I do for now, but they're unstable, have terrible support and have other issues. At the end of the day, the way I see it, VPNs are a bit of a dirty hack and I really don't like using them.

I'm not the best person to answer you here but seeing as nobody else has.....

Just to clarify... You want to make sure that any websites you access from your own network don't have Google links coming in?

You're not talking about Google's search engine looking at your own hosted website?

Sounds like a squid proxy cache 'might' work (available on openwrt provided you have a powerful enough router). Also, who knows when that content might change.

Same with an adblock, hosts file update or other. You may be able to block a band of known sites but these will probably change over time. There may be a site that keeps track of known adwords/google etc sites. I'll have a look.

I can give you a more concrete example :)

I go to stackoverflow. com. It loads and it sorta works, but some parts don't work correctly in China.
So I open up the source (Ctrl-U) and line 20 reads:

<script src="googlestuff/ajax/libs/jquery/1.12.4/jquery.min.js"></script>

This will not load from here (and I guess for the privacy conscious.. this will send a request to Google..)

I go to the jQuery website: jquery. com/download/
and you can find that this javascript thing is hosted in a few other places (run by Microsoft and the like)
like for instance ajax. aspnetcdn. com/ajax/jQuery/jquery-1.12.4.min.js
or
cdnjs. cloudflare. com/ajax/libs/jquery/1.12.4/jquery.min.js

So what I'd like to see happen is that each time the router sees a request for, say, this jquery thing, it would fetch it from some alternate host. I'd guess maybe some DNS magic to have googleapis. com resolve to some alternate IP? More concretely, for the second alternate host I posted, the file paths match, so maybe I can get ajax.googleapis. com to resolve to the IP of cdnjs.cloudflare. com? I'm not sure if that would work or how I'd go about doing that.
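To make the idea concrete, here is a minimal Python sketch of the mapping being described: swap the hostname and keep the path. The names `HOST_MAP` and `rewrite_url` are made up for illustration, and the assumption that the mirror uses the same path layout is only known to hold for the jQuery example above. This shows the URL logic only; it is not something a router could apply to HTTPS traffic as-is.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical mapping: blocked host -> mirror assumed to use the same paths.
HOST_MAP = {
    "ajax.googleapis.com": "cdnjs.cloudflare.com",
}


def rewrite_url(url: str) -> str:
    """Return the URL with its host swapped for a mirror, if one is known."""
    parts = urlsplit(url)
    mirror = HOST_MAP.get(parts.hostname or "")
    if mirror is None:
        return url  # not a host we know a replacement for
    # Keep scheme, path, query and fragment; only the host changes.
    # Any explicit port in the original URL is dropped.
    return urlunsplit((parts.scheme, mirror, parts.path, parts.query, parts.fragment))


print(rewrite_url("https://ajax.googleapis.com/ajax/libs/jquery/1.12.4/jquery.min.js"))
```

For libraries where the mirror's path layout differs, a plain host swap like this would not be enough and each path would need its own mapping.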

PS: I had to break the links b/c it's not letting me post links (I guess an antispam feature)

In the pre-https internet, a proxy could have done this on the fly. With the strong push for https everywhere this is no longer possible - and the multitude of layered external inclusions in today's websites would make this a losing game as well.

If you use Firefox (or a similar browser), would the Decentraleyes extension help with your requirements? It purports to intercept requests to content delivery networks and deal with them locally. I don't know - I haven't read - if Google is handled by Decentraleyes or not. But it might be worth investigating.

Thanks 600cc. Gunna look into it. Looks kinda like what I need (thought


I actually also just found this file: gist.github. com/gaoyifan/680da074330d2c499d6b
Skimming it, it seems to do what I want, but I don't really get what it's a configuration file for (nginx?). haha. Anyone know? :) )

(Last edited by geokon on 30 Mar 2018, 18:13)

That looks to be an nginx configuration file, judging by the syntax.

Unfortunately, slh is correct.
If you only had to deal with plain http and were willing to put in quite a lot of work to find working replacements for Google's blocked URLs, you could configure squid (proxy) to replace these URLs.
Assuming the replacements themselves do not reference a blocked Google URL again, you would be set.
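For the plain-http case, a rough Python sketch of such a squid rewrite helper might look like the following. It assumes the helper protocol of newer Squid versions as I understand it (one request per line on stdin with the URL as the first field, answered with `OK rewrite-url="..."` or `ERR` for "leave unchanged") and no concurrency channel IDs. The names `REPLACEMENTS`, `handle_line` and `serve` are made up for illustration, and the single mapping entry is just the jQuery example from this thread.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical table: blocked host -> replacement host with matching paths.
REPLACEMENTS = {
    "ajax.googleapis.com": "cdnjs.cloudflare.com",
}


def handle_line(line: str) -> str:
    """Build the helper's reply for one request line sent by squid."""
    fields = line.split()
    if not fields:
        return "ERR"
    parts = urlsplit(fields[0])
    new_host = REPLACEMENTS.get(parts.hostname or "")
    # Only plain http can be rewritten this way; leave everything else alone.
    if parts.scheme != "http" or new_host is None:
        return "ERR"
    new_url = urlunsplit((parts.scheme, new_host, parts.path, parts.query, ""))
    return f'OK rewrite-url="{new_url}"'


def serve(inp, out) -> None:
    """Answer requests line by line, flushing after each reply."""
    for line in inp:
        out.write(handle_line(line) + "\n")
        out.flush()
```

A real helper script would end with `serve(sys.stdin, sys.stdout)` (after `import sys`) and be referenced from squid.conf through the `url_rewrite_program` directive. Check the documentation for your squid version before relying on the exact reply format.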

But you cannot do this easily for https, and maybe not at all.
If you have access to all the clients, you can intercept (most) https traffic with squid by installing a private CA certificate on them, and then do some kind of redirect.
But making this work involves a very steep learning curve, and it is browser dependent, host dependent, and getting harder and harder over time ...

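Your idea of using a DNS trick to do the redirect might work if only the hostname has to be (silently) redirected.
But this also breaks for https: the server behind the substituted IP cannot present a valid certificate for the original hostname, and on top of that there are "Pinned Certificates", which are in wide use for google.com etc.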

Most likely, a VPN-based solution is much simpler and much more reliable. Although there are rumours that the Great Firewall also blocks VPN connections on purpose. So a good VPN provider might work for you today, but not tomorrow.

A theoretical approach just occurred to me: modifying "Adblock" for Firefox.
Lemme know if you dare to walk this difficult route.

Just wanted to thank you guys. Guess the easier solution is to use Decentraleyes. It seems to be working for me. It doesn't cover all the use cases - but nonetheless - it's pretty good! HTTPS is a blessing and a curse. The firewall does block VPNs, but due to HTTPS they can't block individual pages (so for example, English Wikipedia works and you can access any article you want. Even on "sensitive" topics).

You're welcome. Glad it's useful. I realise it's not perfect, but it sounds like it's better than nothing.

The discussion might have continued from here.