Mark wiki.openwrt.org for robot exclusion?


#1

Google still insists that the wiki.openwrt.org links are the "best", rather than the current pages at openwrt.org. As an example, a search on "openwrt apu2c4" returns https://wiki.openwrt.org/toh/pcengines/apu2

Perhaps changing /robots.txt would help, and/or adding the appropriate META tags to the header could convince Google to direct people to the current pages.

Right now https://wiki.openwrt.org/robots.txt returns

User-agent: *
Disallow: /oldwiki/
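
As a quick sanity check of what that file means for crawlers, here is a minimal sketch using Python's standard `urllib.robotparser`. The `BLOCK_ALL` text is a hypothetical replacement, not something that is deployed, and the helper name `allowed` is made up for this example.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# The robots.txt currently served by wiki.openwrt.org (quoted above).
CURRENT = "User-agent: *\nDisallow: /oldwiki/\n"

# Hypothetical replacement that would exclude the whole site from crawling.
BLOCK_ALL = "User-agent: *\nDisallow: /\n"

page = "https://wiki.openwrt.org/toh/pcengines/apu2"
print(allowed(CURRENT, page))    # crawling of the device page is permitted
print(allowed(BLOCK_ALL, page))  # crawling of the device page is refused
```

With the current file only paths under /oldwiki/ are excluded, so the device pages remain open to crawlers.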

#2

I find it confusing to direct users to "the OpenWrt wiki" when we mean openwrt.org, not wiki.openwrt.org.

Not sure if I already proposed renaming wiki.openwrt.org to oldwiki.openwrt.org and moving openwrt.org to wiki.openwrt.org.
Too late now and too hot to search for this.

Anyway, we would need sysadmin support for that.


#3

The suggestion about robots.txt would be a good, easy start. Redirecting the individual pages via HTTP 301 (Moved Permanently) would be ideal, but very labour-intensive.
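
To illustrate why the per-page redirects are the labour-intensive part, here is a rough sketch of a lookup that a redirect handler could use. Everything in it is an assumption for illustration: the `PAGE_MAP` entries are placeholders rather than a real old-to-new mapping, and `redirect_for` is a hypothetical helper.

```python
from typing import Optional, Tuple

NEW_SITE = "https://openwrt.org"

# Hypothetical, hand-maintained map of old-wiki paths to new-wiki paths.
# These entries are placeholders; the real mapping would have to be compiled
# page by page, which is where the effort goes.
PAGE_MAP = {
    "/toh/pcengines/apu2": "/toh/pcengines/apu2",
    "/doc/howto/old-page": "/docs/guide/new-page",
}


def redirect_for(old_path: str) -> Optional[Tuple[int, str]]:
    """Return (301, new_url) for a mapped page, or None to keep serving the archive."""
    key = old_path.rstrip("/") or "/"
    new_path = PAGE_MAP.get(key)
    if new_path is None:
        return None
    return 301, NEW_SITE + new_path


print(redirect_for("/toh/pcengines/apu2/"))
print(redirect_for("/some/unmapped/page"))
```

Pages without an entry simply stay on the archive, so the map could be filled in gradually, starting with the pages that show up most often in search results.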


#4

I'd agree with renaming wiki.openwrt.org to oldwiki.openwrt.org. Note that I do not have administrative access to the old wiki, so any vhost or robots.txt changes need to be made by wigyori.


#5

Google still insists that wiki.openwrt.org is the "best" and will likely continue to do so every time a user reinforces that by clicking on a returned link.

The pages returned still suggest that they should be indexed and robots.txt still suggests that the now-abandoned wiki is still to be indexed.

User-agent: *
Disallow: /oldwiki/

<!DOCTYPE html>
<html lang="en" dir="ltr" class="no-js">
<head>
    <meta charset="utf-8" />
    <title>OpenWrt Wiki [OpenWrt Wiki]</title>
    <script>(function(H){H.className=H.className.replace(/\bno-js\b/,'js')})(document.documentElement)</script>
    <meta name="generator" content="DokuWiki"/>
<meta name="robots" content="index,follow"/>
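
For anyone who wants to check pages in bulk, here is a small sketch that extracts the robots directives from a page head using Python's standard `html.parser`. The class and function names are made up for this example, and the `head` string is just the relevant lines from the fragment above.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            self.directives += [d.strip().lower() for d in content.split(",") if d.strip()]


def robots_directives(html_text: str) -> list:
    parser = RobotsMetaParser()
    parser.feed(html_text)
    return parser.directives


head = (
    "<head>\n"
    '<meta name="generator" content="DokuWiki"/>\n'
    '<meta name="robots" content="index,follow"/>\n'
)
print(robots_directives(head))  # ['index', 'follow']
```

A page meant to drop out of search results would list `noindex` here instead of `index`.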

image


#6

There's a dilemma here. People worked hard to revive the "old wiki" because it had valuable historical info. That info should still be searchable by search engines. So I think we need to preserve the robots.txt file on the old wiki (and not block the search engine spiders).

But... As people point out above, clicks to links on the old wiki reinforce that site's relevance.

Maybe these will help:

  1. renaming the old wiki to "oldwiki.openwrt.org" will dilute those links.
  2. I wonder if it would be possible to change the titles of all old wiki pages to include "[Historical OpenWrt Wiki]" (instead of "[OpenWrt Wiki]" in the old wiki example above).

#7

I understand the value of the historical information to the handful of people that wish to access it. However, with every passing week, it becomes more and more out of date. Many times the information presented was already obsolete compared to "then-current" releases, and it could lead, and has led, users into all kinds of problems, including bricking devices. The information on devices that weren't officially supported a year or two ago is often "completely wrong" on the obsolete wiki, especially where flashing techniques and/or MTD partitioning have changed.

Users should be able to search on a major search engine and get a "correct" answer to their question. They certainly shouldn't have to jump through hoops to then go to the "current" wiki and try to guess the name of the equivalent page.

https://www.google.com/search?q=openwrt+failsafe

https://www.google.com/search?q=openwrt+archer+C7

https://www.google.com/search?q=openwrt+tftp+recovery

https://www.google.com/search?q=openwrt+unbrick

Here's a good one

https://www.google.com/search?q=openwrt+telnet -- first result

image

https://www.google.com/search?q=openwrt+login is just as bad and misleading

How many times a day does tmomas or one of the regulars here on the forums have to redirect users to the "right" page after they reasonably used an Internet search engine to find a resolution to their problem?

Short of providing a link at the top of every page on the obsolete wiki to the "new" page (which is a huge effort, as there isn't a straightforward mapping of old to new), I don't think that more "warnings" are going to help. Warnings haven't significantly dissuaded people from referencing those pages here time after time. Who knows how many more refer to them, not even realizing that they're looking at outdated information?

As long as those pages are still indexed by the major search engines, they will likely remain at the top of the listings for years to come.

For the handful of people that want access to those pages for historical purposes, the links can still remain valid and they can use the wiki's search or old link references.


#8

@wigyori Can you please assist in moving wiki.openwrt.org to oldwiki.openwrt.org?

As you can see from Jeff's posting, the current situation is really confusing and annoying. We are talking about "the wiki", meaning openwrt.org, but users searching for "openwrt wiki" and other stuff frequently find wiki.openwrt.org, which is not updated any more and is getting more and more outdated every day.

Your help is really appreciated!


#9

Could you please also change the maintenance message to include a clickable link to the main site, not just the text https://openwrt.org?

<div class="maintenancemessage">This wiki is read only and for archival purposes only. >>>>>>>>>> Please use the new OpenWrt wiki at https://openwrt.org/ <<<<<<<<<<</div><!-- TOC START --> <div id="dw__toc">


#10

The method with which this sitewide notice is generated does not allow linking (or any other kind of formatting).
If it was possible, I would have already done so :slight_smile:

With newer / different templates there is also an option for "sitewide notice", but unfortunately this is not available in the (quite old) OpenWrt template.


#11

@jeff I am not disagreeing with you that the oldwiki should be deprecated. If the consensus is that the value of the historical information has shrunk considerably then we should move quickly to deprecate it.

@tmomas Is it possible to change the "name of the wiki" (in [ ... ]) in the <title> tag to say something strong like "Historical OpenWrt Wiki", or even "Historical Wiki" (to remove the name of the project, decreasing its strength in searches)?

@all - do we have consensus that we should assertively deprecate the old wiki? If so, we could:

  • Change the name to "oldwiki.openwrt.org" (already requested)
  • Change the oldwiki title to include [Historical Wiki] or some such phrase
  • Change robots.txt to exclude spidering
  • Add a "Historical" page to the current wiki that explains the state of the old wiki, and how to find it, and how to use its built-in search facility.
  • Fix references to "wiki.openwrt.org" in the current wiki (there are a few dozen)
  • What else?
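
For the "fix references" item, here is a rough sketch of how leftover links could be located in page source text. It assumes the page text is already available as a string; the function name and the sample text are made up for illustration.

```python
import re

# Matches links to the old wiki host only. Links to oldwiki.openwrt.org or
# openwrt.org do not match because the host must start right after "://".
OLD_WIKI_LINK = re.compile(r"https?://wiki\.openwrt\.org[^\s\]|>)]*", re.IGNORECASE)


def find_old_links(text: str) -> list:
    """Return every link to wiki.openwrt.org found in the given page text."""
    return OLD_WIKI_LINK.findall(text)


# Made-up sample in DokuWiki link syntax.
sample = (
    "See [[https://wiki.openwrt.org/toh/pcengines/apu2|APU2]] "
    "and the device list at https://openwrt.org/toh/start."
)
print(find_old_links(sample))  # ['https://wiki.openwrt.org/toh/pcengines/apu2']
```

Running something like this over an export of the current wiki's pages would give a worklist of the references that still need fixing.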

#12

Seeing as the old wiki will probably never be updated to anything else again, why not directly hack a message into the output?

I would want to make a case for wiki.archive.openwrt.org, in line with the similarly "deprecated" forum on forum.archive.openwrt.org.


#13

Done. See https://openwrt.org/historicalwiki

This page needs a home, though, and links from other pages. And changes welcome!


#14

Can be done.

What about

[OUTDATED OpenWrt wiki]

?

This needs someone with ssh access (which I do not have for the old wiki).


#15

Brilliant idea which got me thinking.

Done (at home in my demowiki):
image

  • clickable link
  • all html formatting options available

Now we just need someone with ssh access to the old wiki (a small html file needs to be placed inside the template directory).


#16

Hmmm... That's pretty clear. Kind of like a club to the forehead :slight_smile:

I'm looking beyond the current effort to drive people to the new wiki. After we've switched the search engines over to the new wiki, what will people see if they intentionally seek info on oldwiki.openwrt.org?

And can we change it now, so we don't have to think about it in the future? Perhaps we could have:

  • [Historical OpenWrt Wiki] in the title
  • The yellow warning could say,

    This is an archive of the original OpenWrt Wiki. It was last updated in February 2018, and is now read-only. You can search this historical archive using the Search function above.

    Please use the OpenWrt Wiki at https://openwrt.org for current information about the project.

Doing this will keep the site's content correct for all time. The "February 2018" date will help readers understand whether it would be useful to them. (And as that date gets farther in the past, it will be even more obvious.)
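
As a sketch of what the small HTML file could contain, here is one way to generate the proposed notice with a clickable link. The `maintenancemessage` class is taken from the existing page source quoted earlier in the thread; the function name, the `<br/>` separator, and the idea of generating the snippet with a script are assumptions for illustration only.

```python
from html import escape


def build_notice(last_updated: str, new_url: str) -> str:
    """Build the archive notice as an HTML fragment with a clickable link."""
    safe_url = escape(new_url, quote=True)
    return (
        '<div class="maintenancemessage">'
        "This is an archive of the original OpenWrt Wiki. "
        f"It was last updated in {escape(last_updated)}, and is now read-only. "
        "You can search this historical archive using the Search function above."
        "<br/>"
        f'Please use the OpenWrt Wiki at <a href="{safe_url}">{safe_url}</a> '
        "for current information about the project."
        "</div>"
    )


print(build_notice("February 2018", "https://openwrt.org"))
```

The output is a single fragment that someone with ssh access could paste into whatever file the template ends up reading.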

It will be like a quiet, charming, little used library where historians (and archaeologists) can come to do their research.


#17

Some playing around:

image


#18

I think that can be a bit confusing for newcomers, and we might lose them already at the search engine. What I mean is: if I were searching on Google or the like and saw “OUTDATED”, I wouldn’t click the link in the first place, and thus would never see the notice about where to find the new wiki, or even learn that there is a new wiki.

Therefore I propose the title should be “[OLD OpenWrt Wiki]” as this is short and simple to understand and implies that there is a new one.

EDIT:

Or perhaps even better “[FORMER OpenWrt Wiki]” as this indicates there’s a new one before even clicking on the search result.


#19

Proposals so far:

1 [ARCHIVED OpenWrt Wiki] http://wiki.archive.openwrt.org/
2 [FORMER OpenWrt Wiki] http://oldwiki.openwrt.org/
3 [Historical OpenWrt Wiki] http://oldwiki.openwrt.org/
4 [OLD OpenWrt Wiki] http://oldwiki.openwrt.org/
5 [OUTDATED OpenWrt wiki] http://oldwiki.openwrt.org/
6 [PREVIOUS OpenWrt Wiki] http://oldwiki.openwrt.org/

#20

:+1: on the consistency of @takimata's suggestion

Of the various titles, the strongest "don't read this" one for me is "outdated"