Curious about LEDE web site statistics?

richb-hanover · October 29, 2016, 4:50pm

The LEDE web sites (main site, forum, and downloads) recently began to use the awstats log file analyzer to summarize and display the web traffic for those sites. See the stats at:

LEDE Web Site: https://wiki.lede-project.org/stats/
LEDE Forum: https://forum.openwrt.org/stats/
LEDE Downloads: http://downloads.lede-project.org/stats/

In addition the About page of this forum has brief statistics as well.

RangerZ · October 29, 2016, 11:58pm

This is great data!!!

I think in particular that we should look to see if it's possible to tune the data on the file downloads, or at least be able to funnel this data offline for better stats.

Specifically, as a "feature" in the form of a package may be located under different targets, there should be some way of aggregating the variations of the *.ipk across all targets or instruction sets. This will allow the devs to see what's most popular to the community, which should help prioritize development efforts.
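A minimal Python sketch of that kind of aggregation, for anyone who wants to try it on data copied out of the report. It assumes file names follow the shape `<name>_<version>_<arch>.ipk` and that package names contain no underscores. The sample paths and hit counts are made up for illustration and are not taken from the real stats.

```python
import re
from collections import Counter

# Hypothetical (path, hits) pairs; paths and counts are invented for illustration.
SAMPLE = [
    ("/snapshots/packages/mips_24kc/packages/nano_2.7.0-1_mips_24kc.ipk", 40),
    ("/snapshots/packages/arm_cortex-a9/packages/nano_2.7.0-1_arm_cortex-a9.ipk", 25),
    ("/snapshots/packages/mips_24kc/luci/luci-base_git-16.300_mips_24kc.ipk", 60),
]

# Assumed file name shape: <name>_<version>_<arch>.ipk
IPK_RE = re.compile(r"^(?P<name>[^_]+)_(?P<version>[^_]+)_(?P<arch>.+)\.ipk$")


def package_name(path):
    """Return the package name from an .ipk path, or None if it does not match."""
    filename = path.rsplit("/", 1)[-1]
    match = IPK_RE.match(filename)
    return match.group("name") if match else None


def aggregate(rows):
    """Sum hits per package name, ignoring target, architecture, and version."""
    totals = Counter()
    for path, hits in rows:
        name = package_name(path)
        if name is not None:
            totals[name] += hits
    return totals


totals = aggregate(SAMPLE)
for name, hits in totals.most_common():
    print(name, hits)
```

Grouping on the package name alone is a deliberate simplification. Keeping the version in the key would answer a different question (which release is popular) than the one raised here (which feature is popular).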

The same goes for the targets themselves: what are the most popular platforms, and which can be sunset?

Is it possible to increase the number of characters displayed in the first column, "downloads"? It appears to only show the first 70 or so characters, and some of these entries are much longer.

Do we know if these stats account for all download activity, meaning does this include wget and opkg install calls?

I also expect that this does NOT include mirror activity, which is unfortunate.

richb-hanover · October 30, 2016, 2:20pm

Updated original post to say

In addition the About page of this forum has brief statistics as well.

richb-hanover · October 30, 2016, 2:35pm

@RangerZ You're right - these pages provide a wealth of data. They're created by the awstats web server log analysis package to summarize the raw server logs. Here are quick answers to your questions:

The log summary probably does include each wget and opkg install (they'd be shown in server logs)
But... This log analysis only provides suggestive (not definitive) information about what's going on.
They would not show traffic to mirror sites. That would be up to the individual mirrors to provide.
I'm not sure that it's necessary to have an automatic analysis of the popularity of platforms/packages. We're not going to "turn on a dime" and start/stop support for any platform. Reviewing the popularity every 3-6 months (or when it becomes too hard/expensive to support a platform) seems like a reasonable choice.
We can probably tune up the configs for awstats to exclude boring info. For example,

wiki stats shouldn't show hits from /lib/... nor referrers from lede-project.org
forum stats shouldn't show hits from /message-bus...
... etc.
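To illustrate the idea of excluding those hits, here is a rough Python sketch that pre-filters raw access log lines by path prefix. This is not how awstats itself does it; it is only a stand-in that assumes the common "combined" log format. The prefixes come from the examples above, and the sample log lines are invented. Referrer filtering is not covered.

```python
# Prefixes to drop, taken from the examples above (an assumption about what is "boring").
SKIP_PREFIXES = ("/lib/", "/message-bus")


def request_path(log_line):
    """Extract the request path from a combined-format log line, or None."""
    try:
        request = log_line.split('"')[1]  # e.g. 'GET /lib/exe/js.php HTTP/1.1'
        return request.split()[1]
    except IndexError:
        return None


def keep(log_line):
    """True if the line has a parseable path that is not under a skipped prefix."""
    path = request_path(log_line)
    return path is not None and not path.startswith(SKIP_PREFIXES)


# Invented sample lines in combined log format.
SAMPLE = [
    '192.0.2.1 - - [30/Oct/2016:14:00:00 +0000] "GET /lib/exe/js.php HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
    '192.0.2.2 - - [30/Oct/2016:14:00:01 +0000] "GET /docs/start HTTP/1.1" 200 2048 "-" "Mozilla/5.0"',
    '192.0.2.3 - - [30/Oct/2016:14:00:02 +0000] "POST /message-bus/abc/poll HTTP/1.1" 200 64 "-" "Mozilla/5.0"',
]

kept = [line for line in SAMPLE if keep(line)]
for line in kept:
    print(request_path(line))
```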

RangerZ · October 30, 2016, 4:53pm

I agree that this data is not needed daily; it's management data and only needed periodically.

What I have learned over the years in my consulting work is that it's often difficult to go back and reformat the raw data and impossible to review what was not collected, so while it may not be needed in the short term, formatting may be relevant.

For example (this may not be a good one): under trunk, do we need to take download stats by package x release, package x date, or package x month? Package x release (x user) has no relevance for dev priorities, but may be useful for QA (buggy package). With nightly builds, packages by date may be the same as by release, but not always.

I have not read the package specs.

jow · October 31, 2016, 8:57am

I instructed the forum nginx server to not log requests to /message-bus/* anymore so it will fade out over time.

richb-hanover · October 31, 2016, 11:56am

Thanks. I also saw that awstats has a SkipFiles configuration option that seems to do the same thing. http://www.awstats.org/docs/awstats_config.html

RangerZ · November 8, 2016, 7:29pm

I have been looking at the data from this page and trying to strip out just the firmware. I noticed:
1 - There are a lot of download entries for /snapshots/faillogs. I am not sure why anyone would download these files. I do not think these are relevant for download statistics.
2 - There are files called kernel-debug. I do not know what they are or if we need to track them.
3 - I do not see any downloads of image (ImageBuilder). Hopefully this is wrong. Not sure how to check.

The column that says "206 hits" shows incomplete downloads.
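For anyone wanting to separate these out from raw logs, a minimal Python sketch follows. HTTP status 206 is "Partial Content", which a server returns for range requests, so a 206 can also be a resumed download or one segment from a download manager; treating it as "incomplete" is an approximation. The sketch assumes combined-format log lines, skips /snapshots/faillogs/ as suggested in point 1, and uses invented sample paths.

```python
import re
from collections import defaultdict

# Combined log format assumed; only the request path and status code are used.
LINE_RE = re.compile(r'"GET (?P<path>\S+) [^"]*" (?P<status>\d{3}) ')

# Prefixes left out of the download tally (assumption based on point 1 above).
EXCLUDE_PREFIXES = ("/snapshots/faillogs/",)


def tally(lines):
    """Count 200 (full) and 206 (partial) responses per path."""
    counts = defaultdict(lambda: {"200": 0, "206": 0})
    for line in lines:
        match = LINE_RE.search(line)
        if not match:
            continue
        path, status = match.group("path"), match.group("status")
        if path.startswith(EXCLUDE_PREFIXES) or status not in ("200", "206"):
            continue
        counts[path][status] += 1
    return dict(counts)


# Invented sample lines; the paths are placeholders, not real files.
SAMPLE = [
    '192.0.2.1 - - [08/Nov/2016:19:00:00 +0000] "GET /snapshots/targets/example/firmware.bin HTTP/1.1" 200 4000000 "-" "Wget/1.17"',
    '192.0.2.2 - - [08/Nov/2016:19:00:05 +0000] "GET /snapshots/targets/example/firmware.bin HTTP/1.1" 206 100000 "-" "Mozilla/5.0"',
    '192.0.2.2 - - [08/Nov/2016:19:00:09 +0000] "GET /snapshots/targets/example/firmware.bin HTTP/1.1" 206 100000 "-" "Mozilla/5.0"',
    '192.0.2.3 - - [08/Nov/2016:19:01:00 +0000] "GET /snapshots/faillogs/example/compile.txt HTTP/1.1" 200 9000 "-" "Mozilla/5.0"',
]

result = tally(SAMPLE)
for path, by_status in result.items():
    print(path, "full:", by_status["200"], "partial:", by_status["206"])
```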

Operating systems are also tallied, however there is a very large number (60.5%) under Others=>Unknown which makes the data much less valuable in my mind. Apparently Windows XP lives on as does the Commodore 64.

I am not seeing a way to view anything but the current month's history. I would expect we can navigate to history and YTD.

thess · November 8, 2016, 10:56pm

Anyone (myself included) who is looking for build failures on the buildbots, trying to fix things that worked on a limited test or platform but fail in normal builds, would download them. The downloads are the console output from the package build - compile.txt. However, they could be excluded if they are too distracting.

RangerZ · November 9, 2016, 12:41am

I looked at the faillog folders and see how cumbersome it is to navigate through to see the fail logs, but these are statistics of how many of the faillogs have been downloaded, not the fail logs themselves. If the log is not downloaded there is no entry. I guess if someone downloads the entire folder there would be an entry for each fail.

I think better would be to have 3 reports, one for each of the folders under snapshots, but there still would be packages under targets.

This is a Windows product that I use every once in a while to run through my music folders to build a catalog.
https://directory-list-print-pro.en.softonic.com/ I do have to do a few things with Excel to get it how I like it, but if you have access to folders you can get it to give a list of all the fails so you don't need to hunt. Not sure you can generate a link to the file.
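The same kind of listing can be sketched in a few lines of Python, assuming you have the faillogs tree available as a local folder. The function name `find_fail_logs` and the demo directory names are made up; only the compile.txt file name comes from the post above.

```python
import os
import tempfile


def find_fail_logs(root, log_name="compile.txt"):
    """Return sorted paths, relative to root, of every file named log_name."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if log_name in filenames:
            rel = os.path.relpath(os.path.join(dirpath, log_name), root)
            found.append(rel.replace(os.sep, "/"))  # normalise separators
    return sorted(found)


# Demo on a throwaway directory tree with invented folder names.
with tempfile.TemporaryDirectory() as root:
    for sub in ("arch_a/base/pkg_one", "arch_a/packages/pkg_two", "arch_b/base/empty"):
        os.makedirs(os.path.join(root, sub))
    for sub in ("arch_a/base/pkg_one", "arch_a/packages/pkg_two"):
        with open(os.path.join(root, sub, "compile.txt"), "w") as fh:
            fh.write("build output\n")
    logs = find_fail_logs(root)
    print("\n".join(logs))
```

Because the paths are relative to the root, prefixing each one with the base URL of the faillogs folder would give a link to the file, provided the local tree mirrors the server layout.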

hnyman · November 9, 2016, 9:12am

I think that

targets and packages reports could maybe be combined. They are similar "end-user" stuff: people who download the firmware and possibly add-on packages.
faillogs are for developers who try to evaluate how the latest changes work or what has broken things. Those directories are more about monitoring for any new item popping up there (and then looking at the new failed compile log) than about downloading the same reports regularly. (There are perma-broken packages, so much of the faillogs list remains constant over time.) Faillogs has a quite different user profile than targets & packages, and the download volumes will always be much smaller. I don't think that statistics about that will have much value.

EDIT: I need to add something here to make the bullets visible in the final message. Otherwise they are visible in preview, but not in final. Bullets can't be the last item in a message?

richb-hanover · November 9, 2016, 12:29pm

I think a bullet list must be preceded by a blank line. Tests:

No blank line, no bullets

sdfasdf
asdfasdf

Blank line - has bullets

sdfasdf
asdfasdf

UPDATE: You're right. I did have to add this final line to make the bullets appear in the real message.

hnyman · November 9, 2016, 12:39pm

UPDATE: You're right. I did have to add this final line to make the bullets appear in the real message.

Strange limitation. I wonder if that is from Discourse itself, or from our stylesheet/config/whatever

jow · November 10, 2016, 12:18pm

Sounds like a bug to me, maybe we should raise it upstream

richb-hanover-priv · November 10, 2016, 12:56pm

Filed. https://meta.discourse.org/t/bullet-list-markdown-bug/52757

richb-hanover-priv · November 10, 2016, 9:19pm

Testing, 1, 2, 3...

Blank line before bullets

Bullet Line 1
Bullet Line 2 - no CRLF on this line

richb-hanover · November 11, 2016, 12:35pm

Update: The considered wisdom over there (from Jeff Atwood, no less) is that we have a "CSS problem". From this I draw a few conclusions:

I don't have time/skill to look into this now
This isn't crippling, so let's put this in a list of low-priority curiosities...
... unless someone has the urge to tackle it now

Thanks.

jow · November 12, 2016, 8:12pm

The list formatting issue should be solved; it was indeed caused by one of my local CSS modifications.

List test

tmomas · November 22, 2016, 6:49pm

@jow Can we have awstats also show screensize / windowsize? I would be interested in this information to see how our readers see the wiki.

how to: http://www.awstats.org/docs/awstats_faq.html#SCREENSIZE

jow · November 28, 2016, 2:48pm

@tmomas - sadly no, awstats purely analyzes server-side access log files; there is no client capability info available there.