Can we ban AI for creating Wiki pages?

Darin755 · January 31, 2025, 9:46pm

If I wanted AI I would've just asked a AI chat bot. Wiki pages should be written and verified by humans. AI is easily accessible to anyone who wants it and is not terribly useful for a Wiki where the information should be factual and from experience.

For instance, this page it totally unhelpful to me:
https://openwrt.org/docs/guide-user/network/wifi/mesh/80211s

How is a user suppose to know what is to be trusted?

Darin755 · January 31, 2025, 9:48pm

For 802.11s information you can reference this page: https://openwrt.org/docs/guide-user/network/wifi/mesh/802-11s

enmaskarado · January 31, 2025, 10:56pm

The mentioned page has an absurd edit history, it is two users fighting for days adding or removing information that in my opinion is clearly AI generated.

AI generated text should be blocked entirely on the wiki. When people visit the wiki, they assume it is written, or at least reviewed by people who understand OpenWrt.

As a sample, the page shows information that is clearly incorrect about mesh11sd.

As an aside, regardless of this problem, the wiki needs moderation or some place to at least discuss edits as large as those shown in the log.

LilRedDog · January 31, 2025, 11:24pm

The part that does nothing more than lead you to the official Mesh11sd wiki?

Darin755 · January 31, 2025, 11:53pm

My personal opinion is that project Wiki's should be using Git and markdown to manage information. Allowing any random person to start editing is just asking for trouble. Having markdown in a git repo makes it easy to write docs while still providing oversight and review of changes. Projects such as OpenWRT and Samba are from a younger internet.

That's just my take though. I do think obvious AI junk should be removed.

brada4 · February 1, 2025, 12:07am

Wiki is 20 years older than your markdown and ten years more than your git.

slh · February 1, 2025, 1:23am

The wiki already has relatively few contributors, while I totally understand the benefits of a git based workflow, doing this would seriously raise the bar for contributions - probably beyond the breaking point. It is important that users of the corresponding hardware can fix issues easily (and I realize that the sign-up process for the wiki is not exactly forthcoming already, a necessary evil, thanks to commercial spammers).

Obviously ai generated content is not welcome, both for legal reasons (copyright/ attribution requirements) and most of all technical ones, we certainly don't want a feedback loop of ai hallucinations and blatant errors. Completely preventing ai parts is sadly not very likely to succeed, the spectrum is flowing from users relying on ai to 'improve' their writing style, help with translating from their native tongue and letting it write the technical aspects unchecked - while neither is welcome (and the later very much isn't), it's hard to effectively police the former (so there is a grey area here).

psherman · February 1, 2025, 7:03am

I am not a wiki admin, so I'll tag in some of those that are wiki admins. IMO, everything that @slh said is spot on.

But I think that it's best if the wiki team can speak from their perspective, especially when it comes to handling the practical angles (ranging from translations to review processes to detection and everything in between).

@thess @jow @aparcar -- are any of you able to chime in here.

slh · February 1, 2025, 8:46am

Let's ping @bluewavenet as well, as he has authored a lot of this wiki page over the last few years (without ai involvement) to get his perspective on the current state of it as well (the prose part ticking all buzzword topics is bad).

bluewavenet · February 1, 2025, 2:46pm

Thank you @slh for the ping, although I had already seen this thread and was "sleeping on it" before responding.

First of all, I am sure most if not all developers will agree with me that documentation, although very important, is often a difficult and onerous task consuming a great deal of time and effort, time that is always in short supply.

The wiki approach is very useful as it allows a dev to very quickly add at least a documentation template enabling that dev and experienced users to contribute to making the wiki based documentation better. It usually works very well and is "self moderated" in that changes can and should have a reason included in the edit (these are hidden from the actual document). This generates a notification, if configured, to the original author and subsequent contributors.
A change can then be checked and and is approved by doing nothing, or accepted with changes by further editing. It can also be disapproved by reverting to a previous version. This is in some respects similar to a "git" process I guess.

Using "AI" as a research tool is something quite new, and here we need to be very careful.
Without going into any tech details, we have "AI - Artificial Intelligence", a generic term and we have "LLM - Large Language Models", a training method where written language sources are absorbed (eg Internet) for use by some AI.
So an AI service based on an LLM will have lots of data, but no means of knowing what is real/relevant data.
Basically a new type of search engine that outputs in nice sentences, hopefully with references.

Very useful and can make writing wiki pages a little easier by giving the author a list of references on the subject as a starting point, nothing more.

The problem is, as we can see, someone could, for their own reasons completely replace an existing wiki with a LLM generated page with little or no verification.

Reasons for doing this are inevitably selfish and dishonest.
An example could be abusing a wiki to publish something to put on your CV.
The perpetrator could be lucky and choose a wiki that does not have an active author or list of editors with notifications set up.

The situation will then be amplified by LLMs lapping up the abused wiki page and repeating it over and over.

This abuse happened with the page in question. Rejecting/Reverting the obvious LLM generated edits only resulted in verbal abuse and a re-edit.
Eventually I added the current RED warning on that page:

WARNING! This document contains many errors and misconceptions.
Many paragraphs are produced using an online LLM and added without any verification

This page should really be removed, in my opinion.

I had to resort to creating a new wiki page containing my original contents.

Can we ban AI for creating Wiki pages?

AI does not (yet) create Wiki pages, at least not on the OpenWrt Wiki as far as I can see.
We should use AI as a research/translation/formatting tool, and it is only going to get better, probably very quickly.

The real issue here is the mechanism for "self moderation" depends on contributors working together.
Clearly some "contributors" have their own agenda.

A solution would be when an edit is made it is not immediately made live if the editor is not the original author. That author should get a notification and then be able to review the change. If the author does not respond for a set time (maybe 7 days?), the changes are automatically published.

If an editor gets a set number of edits rejected, they should be blocked automatically from further edits.
Someone who is trusted and contributes a lot could be flagged as an "author" by the original author.
In this way, there would be no need for an admin to get involved, unless the original author fails totally to respond for whatever reason.

I do not know if this solution, or something like it is supported by the Wiki software. Someone else with a detailed knowledge of it would have to chip in.

hecatae · February 1, 2025, 11:41pm

Have you got any more examples on other pages?

Darin755 · February 2, 2025, 6:54pm

The world has changed considerably in the last 20 years. Old doesn't necessarily mean better.

Darin755 · February 2, 2025, 7:13pm

@bluewavenet

My biggest concern with AI is the fact AI models get worse when trained in AI content. Ideally the OpenWRT Wiki should be human written so that bots can ingest it and then provide good answers to people asking about networking and OpenWRT. If a page is readable by an LLM it is probably going to be readable by a human as well.

Sources:

https://www.nature.com/articles/d41586-024-02355-z

I do think AI could be used for reviewing pages. (Easier said than done) If nothing else I really like the idea of adding a delay to the number of edits in a 24hr period.

@slh

I think a git based workflow would force people to spend time on a change and could result in better content. I get that there are limited number of contributors and I think the biggest barrier is the bar to entry. Obviously moving to untested system would be bad in general for OpenWRT and other projects but I do think we need to step back and reevaluate community Wikis. If you haven't noticed a lot of the older community Wikis have been almost abandoned with pages that are years out of date. (I am speaking about public Wikis in general) I would love to help write Wiki pages but I honestly don't know where to begin.

bluewavenet · February 2, 2025, 7:42pm

100% agree.

That is true.... But:

Also, probably, 99.99% of devs probably feel writing docs is a pita.

A none Wiki step in the right direction, at least for packages, would be to include comprehensive documentation on the source repository, maybe simply in the README.md, or as formatted text in a docs folder.
If not provided, new versions of packages should not be merged into OpenWrt as a matter of principle. Currently this is not a consideration.

The existence of a README.md of significant size and/or the existence of a docs folder with significant content could be added to current CI tests that run when a PR is opened. This would not add significantly to the workload of reviewers.

lantis1008 · February 2, 2025, 9:33pm

The problem is that git is just big enough of a barrier to entry that users won’t bother.
WYSIWYG wiki editors are a far cry from having to write syntax and commit/push it.

There’s already a deficit of good code contributors.

bluewavenet · February 2, 2025, 10:30pm

Yup 100%
But pulling package documentation from a repository and copy/pasting it into wiki editor is a different matter as it should provide a definitive reference, as I suggested earlier.

richb-hanover-priv · February 3, 2025, 1:24am

A few thoughts here:

I completely agree with all that @bluewavenet said:

Documentation is hard to write
Wiki format is good way to create a (relatively) static source of information
AI is a great research tool, and definitely could be used to provide an outline on the subject
But it must be backed up personal verification

I imagine the forum moderators could come up with a policy recommendation and then send it to the OpenWrt committer's for a final decision. I would advocate for something like a "No Generative AI content without someone's personal statement that they have verified the info". Not sure how to enforce this, though.
I have used a Markdown/Git system, and frankly, it's a PITA. It absolutely discourages people from contributing, both because git is a hurdle, but even more because it gives off the vibe that "regular people aren't welcome" - only special people / admins can contribute.

moeller0 · February 3, 2025, 9:03am

Something different, could we not just revoke the wiki editing privileges for problematic editors, at least temporarily?

frollic · February 3, 2025, 9:09am

this would get my vote.

richb-hanover-priv · February 3, 2025, 12:43pm

I am often accused of complicating things, but I think this would be premature. On the Forum, there is an accepted set of guidelines and a crew who enforces them. I don't believe that that same "human infrastructure" is in place for the Wiki.

At a minimum, we need:

A reasonable policy statement. I'll draft one in the next message. (It might apply to the Forum as well.)
Agreement from OpenWrt Committers who are the overall "governing body" of OpenWrt. (I don't imagine this will be a hurdle.)
A group of people who have the knowledge and rights on the the Wiki to enforce it

Thanks