TL-WR1043ND snapshot images - High download numbers - Spanish users needed


#1

The issue

A quite unusual issue where we need the help of Spanish residents:

Since 30.09.2018 we have been seeing abnormally high download numbers for the TL-WR1043ND snapshot factory image in the download stats, which create a tremendous amount of traffic.

[image]

See also https://downloads.openwrt.org/stats/#countries

[image]

Analysis

Analysis of the download servers' log files revealed that

  • 99.7% of the 1043nd snapshot downloads originate from Spain
  • 75% Wget; 25% uclient-fetch
    • both without version as "usual" user agents
    • some IPs use only Wget or only uclient-fetch, while other IPs use both at the same time
  • approx. 20k different IPs, spread over hundreds of network ranges
  • downloads are rarely completed and are instead cut off at varying sizes; the HTTP status, however, is almost always (99%+) 200 instead of 206, as one would expect
  • downloads originate mainly from 4 different network names (see graphic below)
  • downloads show a significant temporal pattern in the seconds, minutes, and hours at which they happen
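For anyone who wants to reproduce this kind of tally on their own server logs, here is a minimal sketch. It assumes the logs are in the common "combined" access log format; the actual log layout of the OpenWrt download servers is not given in this thread, and the sample lines, paths, and the `summarize` helper are made up for illustration.

```python
import re
from collections import Counter

# Assumption: "combined" access log format. Adjust the pattern to the real layout.
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def summarize(lines, path_fragment="1043nd"):
    """Count user agents, HTTP status codes and distinct IPs for matching downloads."""
    agents, statuses, ips = Counter(), Counter(), set()
    for line in lines:
        m = LOG_RE.match(line)
        if not m or path_fragment not in m["path"]:
            continue  # unparsable line or a different file
        agents[m["agent"]] += 1
        statuses[m["status"]] += 1
        ips.add(m["ip"])
    return {"agents": agents, "statuses": statuses, "distinct_ips": len(ips)}


# Made-up sample lines using documentation IP ranges and a hypothetical path.
SAMPLE_LINES = [
    '192.0.2.1 - - [01/Dec/2018:02:01:21 +0000] "GET /snapshots/example-tl-wr1043nd-factory.bin HTTP/1.1" 200 51200 "-" "Wget"',
    '192.0.2.1 - - [01/Dec/2018:02:11:32 +0000] "GET /snapshots/example-tl-wr1043nd-factory.bin HTTP/1.1" 200 20480 "-" "uclient-fetch"',
    '198.51.100.7 - - [01/Dec/2018:02:21:21 +0000] "GET /snapshots/example-tl-wr1043nd-factory.bin HTTP/1.1" 206 1024 "-" "Wget"',
    '203.0.113.9 - - [01/Dec/2018:02:22:00 +0000] "GET /other/file.bin HTTP/1.1" 200 10 "-" "Mozilla/5.0"',
]

if __name__ == "__main__":
    print(summarize(SAMPLE_LINES))
```

The share of each user agent or status code then follows by dividing each counter value by the total number of matching lines.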

Tip for the pictures: to see them in real (big) size -> right click on the picture -> "Show picture"

Download size vs. netname

[image: download size vs. netname]

Temporal pattern: Seconds

Graphing download size (in bytes) vs. seconds shows the following:
[image]

=> Two strong peaks around 21..24 sec and 32..35 sec.

Temporal pattern: Minutes

The same for minutes:
[image]

=> Every 10 minutes (at 01 / 11 / 21 / 31 / 41 / 51) the download size increases by a factor of 2.5 for a period of 6 minutes.
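The per-second and per-minute graphs can be approximated by summing the transferred bytes per second-of-minute, minute-of-hour, or hour-of-day. The following is only a sketch of that binning, not the tool actually used for the graphs above; `bytes_by_unit` and the sample records are hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def bytes_by_unit(records, unit):
    """Sum transferred bytes per 'second', 'minute' or 'hour' of the request time.

    records: iterable of (datetime, size_in_bytes) pairs.
    """
    if unit not in ("second", "minute", "hour"):
        raise ValueError("unit must be 'second', 'minute' or 'hour'")
    totals = defaultdict(int)
    for when, size in records:
        totals[getattr(when, unit)] += size
    return dict(sorted(totals.items()))


# Made-up records, for illustration only.
SAMPLE_RECORDS = [
    (datetime(2018, 12, 1, 2, 1, 21), 100),
    (datetime(2018, 12, 1, 2, 11, 21), 200),
    (datetime(2018, 12, 1, 5, 4, 32), 50),
]

if __name__ == "__main__":
    for unit in ("second", "minute", "hour"):
        print(unit, bytes_by_unit(SAMPLE_RECORDS, unit))
```

Plotting the resulting dictionaries as bar charts makes peaks at fixed seconds or minutes stand out, which is what points to a scheduled job rather than human users.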

Temporal pattern: Minutes + seconds

green = seconds
blue = minutes
[image]

=> The "seconds" pattern (21/32 sec, see above) and the 10 min pattern are clearly visible.

Temporal pattern: Hours + seconds

It is getting more interesting by the minute... or should I say by the second? :wink:

Remarkable things in this graphic:

  • red: spikes at 3h intervals; seconds peak at 32sec
  • blue: significantly different seconds pattern at 02h / 06 / 12 / 20
  • blue: similar pattern at 02h / 12 / 20; two peaks at 21 + 47 sec
  • blue: different pattern at 06h (almost equally spread over the seconds)

[image]

Preliminary conclusion

Given the restriction to Spain, the temporal pattern, and the number of download requests observed, we conclude that this traffic cannot originate from individual users downloading the 1043nd snapshot image to flash their device. It more likely comes from some bot / script that either does performance testing (user's internet line / OpenWrt download server transfer speed) or checks whether a new snapshot build is available.

Whatever the reason for the described download behaviour may be, we feel that using the OpenWrt resources (server time + download bandwidth) to this extent for any purpose other than flashing the downloaded image is not appropriate.

We would like to find out the reason for this download behaviour, in order to reduce the impact on the OpenWrt resources to a normal range (compare the stats linked above).

This is where you as a Spanish resident come in: We need your help in finding the cause of the high download numbers.

Request to Spanish OpenWrt forum users

  • Can you think of any reason for the download behaviour shown above?
  • Is there a Spanish forum, website, blog, ... which could have triggered this by publishing
    • a speedtest-script
    • check-for-new-snapshot script
    • or ...something completely different?
  • Are you running OpenWrt on one or more devices, and are you knowingly / unknowingly taking part in these periodical downloads?
  • If you are running a continuous speedtest / check for snapshot updates / something else on your OpenWrt device, please check if there are any irregularities since today (404 errors / no statistics / ...)

Thanks for your time and your feedback!

Sidenote: If anybody knows Clifford Stoll: I'm feeling a bit like him now, searching for 75 cents :wink:


#2

As a workaround (if more fine-grained blocking isn't successful):

The tl-wr1043ndv1 is well supported by ath79 by now, so it would be easy to totally remove image generation for it in the ar71xx target.


#3

Spanish user reporting for duty!

I searched Google using keywords in Spanish, but could not find anything relevant. I am currently active in a Spanish forum related to OpenWrt, and have posted this question and a link to this post there. I am searching for other Spanish forums where this could have been discussed, and will post there, too.

Let's hope someone can shed a bit of light about this.


#4

Have you tried to check if the IPs mostly originate from a specific area in Spain? The netnames don't seem to be proportional to the size of the ISPs in Spain at all, but maybe they are proportional to the size of the providers in a specific area.


#5

Good point!

RIMA belongs to Movistar and is the main provider in all of Spain. ONO, JAZZTEL, and ORANGE (UNI2 in the charts) are also large providers, and have customers all around Spain. However, EUSKALTEL only has customers in a specific region, and the number of downloads seems disproportionate to the number of people living in that region.

I'm browsing the forums specific to EUSKALTEL, but so far have found nothing.


#6

  1. Not sure if that's possible. @thess AFAIK awstats can only locate an IP in a country (e.g. ES), and no more detailed location is possible. Can you confirm?
  2. Just checked my IP with https://geoiptool.com/de/ -> "slight" error of just 300km :slight_smile: (accuracy certainly depends on the database used)

#7

Yes, that tool is completely off for the IP I used for testing in Spain. However, Google is usually accurate to within a few kilometres. I suggest taking some random samples and checking them with their API or for instance here: http://www.ipvoid.com/ip-geolocation/. https://whatismyipaddress.com/ip/ is also fairly accurate and it's easier to test because you can generate clickable URLs like https://whatismyipaddress.com/ip/8.8.8.8. I don't know if their geodatabase is publicly available.


#8

Seem to be accurate, but the MaxMind database behind them is quite pricey...


#9

You'll notice if they are mostly located in the same area if you manually check 10 or 20 random addresses using one of those web pages.
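A small sketch of that spot check: draw a random sample from the logged addresses and turn it into the clickable lookup URLs mentioned in #7. The function name and the sample addresses (documentation ranges) are made up; the URL pattern is the one quoted above.

```python
import random


def sample_lookup_urls(ips, k=20, seed=None):
    """Pick up to k distinct IPs at random and build one lookup URL per IP."""
    unique = sorted(set(ips))  # de-duplicate so heavy hitters are not over-sampled
    rng = random.Random(seed)  # a seed makes the sample reproducible
    chosen = rng.sample(unique, min(k, len(unique)))
    return [f"https://whatismyipaddress.com/ip/{ip}" for ip in chosen]


# Made-up addresses from the documentation ranges, for illustration only.
SAMPLE_IPS = ["192.0.2.1", "192.0.2.1", "198.51.100.7", "203.0.113.9"]

if __name__ == "__main__":
    for url in sample_lookup_urls(SAMPLE_IPS, k=2, seed=1):
        print(url)
```

The URLs are meant to be opened by hand, which keeps the number of requests to those websites low.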


#10

Here is a theory:

-Some folks like http://guifi.net/guifi/device/
-Then a regional model difference / localized redirect goes awry due to a file name / size mismatch
-I actually think the model name is a false positive....

Something like the script ( sorry if nothing to do with you guys!!! just an example of what's going on ) https://github.com/QuickMeshProject/qmp/blob/master/packages/qmp-guifi/files/etc/qmp/qmp_guifi.sh
https://github.com/guifi/drupal-guifi/commit/d6169a7c3521968ff0e9b4f9add7b6e075f950c9

or this

Why Spain? ... exploit on this community wifi / CUSTOM package overwriting a variable from script similar to above ...... model difference...... uni lecturer made a VM?

Pull the file / geoblock / httpheader(referer?) block etc. me thinks

This is very interesting!!! https://repositori.upf.edu/bitstream/handle/10230/22884/VilchesBlanco_2014.pdf?sequence=1&isAllowed=y ( page 37 )

-The date / initial ramping of demand will give a good clue. ( steep vs narrow slope )
-Client fingerprint
-The incompletion factor


#11

I added some cities to the stats.

Remarkable: High numbers for Barakaldo, Bilbao, Burgos, Castro Urdiales

[image]


#12

All those cities are within the Euskadi region, precisely where the Euskaltel ISP operates. And those are not the largest cities in Spain, by far. So, there is definitely something weird going on there.

EDIT: More weirdness... there are some major cities in Euskadi, and then the rest is mostly on the east coast, even small villages. I do not see major Spanish cities there (Madrid, Barcelona, Sevilla...). Some places like Benicarlo or Santa Cruz de Tenerife are tourist destinations; perhaps the owners of those devices were traveling?


#13

The guifi project is more prevalent in the Catalunya region, but @tmomas's findings point to the Euskadi region...


#14

Madrid + Barcelona are already there (look closely) :sunglasses:
Sevilla is missing, but it is already there in the updated data.
I noticed some of the cities are in the vicinity of Madrid.

....in the meantime, I updated the graphic, now showing more city names and sorted them alphabetically, see above.


#15

Still, the great majority comes from the Bilbao and Burgos regions. Can you tell if the IPs seem to belong to fixed or mobile internet connections? If it's fixed then my guess is router software, but if it's mobile then it may be some kind of app.


#16

But they do not stand out when taking the large population into account. If the system triggering the downloads were widely used there, their share would likely be much higher.

As some "never heard of" smaller(?) cities like Barakaldo and Castro Urdiales have higher stats, likely the system is more widely used in those cities.


#17

Do the regional numbers include all downloads, or just the ones using wget/uclient-fetch without version number? If it's all of them then they include a lot more legitimate downloads in Barcelona and Madrid than in the Bilbao and Burgos areas.


#18

Some comments regarding the statistics:

  • they show December 2018
  • they show only 1043nd downloads
  • approx. 99% are wget+uclient-fetch
  • cities are not yet completely added (a city has been added for only 20% of downloads)

BTW: If someone has a script solution without running into automatic bot-detection on those websites, please let me know.

Input: bunch of IPs (thousands)
Output: IP-Range, Country, netname, City, static/dynamic
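Not a full solution, but one way to cut down the number of lookups is to group the IPs into ranges first and query only one address per range. The sketch below uses only the Python standard library and assumes IPv4 addresses grouped into /24 blocks, which is an arbitrary choice and not the real allocation size of any ISP. Country, netname, city and static/dynamic would still have to come from a whois or geolocation source, which is not shown here.

```python
import ipaddress
from collections import Counter


def group_by_prefix(ips, prefix_len=24):
    """Count how many of the given IPs fall into each network of the given prefix length.

    Returns (network, count) pairs, most frequent first.
    """
    counts = Counter()
    for ip in ips:
        # strict=False lets us pass a host address and get its enclosing network
        net = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
        counts[str(net)] += 1
    return counts.most_common()


# Made-up addresses from the documentation ranges, for illustration only.
SAMPLE_IPS = ["192.0.2.1", "192.0.2.200", "198.51.100.7"]

if __name__ == "__main__":
    for network, count in group_by_prefix(SAMPLE_IPS):
        print(network, count)
```

With thousands of IPs spread over hundreds of ranges, this reduces the lookups to one per range, and the counts show which ranges matter most.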


#19

Yes, Barcelona and Madrid are there, but heavily underrepresented, considering their populations. And the same happens with ISPs.

What I mean is that this seems specific to one ISP (Euskaltel) and one region (Euskadi, where Euskaltel operates). The rest of the connections are probably "noise".

Did the number of connections rise suddenly? That would indicate some massive, coordinated update of many devices... for example a firmware update from the ISP.


#20

The underrepresentation of certain regions / IP ranges may be because a city has been added for only 20% of the download requests.

Madrid may be underrepresented, but there are also some IPs in the vicinity of Madrid (call it "greater Madrid").

I don't think that this issue is restricted to Euskaltel, as I'm seeing lots of other networks.
But yes, a BIG portion of downloads originate in the northern area of Spain where Euskaltel is.

Rise of download numbers: Started 30.09.2018, from low hundreds (like other downloads) to thousands in one day.