HoneypotFrontPage
A HoneypotFrontPage is a FrontPage that is intentionally left open for editing to attract WikiSpam bots, so that they can be easily detected and banned. It’s meant to be a spam prevention feature that’s simple and automagical.
The idea is based on these basic but important characteristics about WikiSpam and FrontPages:
Most admins lock their wiki FrontPage to prevent WikiSpam and WikiVandalism. However, by intentionally leaving the FrontPage open for editing so that WikiSpam bots spam it, it’s possible to detect the bots as they arrive and then ban their IPs from editing any other pages for a limited time.
Possibilities
Basic Use Case
Here’s a basic use case of this feature:
- The admin makes the FrontPage editable by anyone, but with a big warning message that says something like
DO NOT edit this page or be banned for 30 mins and lose all of your changes made to the entire wiki in the past 5 mins
- A WikiSpam bot arrives on the FrontPage and adds spam links to it
- As soon as that happens, the bot’s IP address is recorded and banned for 10 mins
- The FrontPage is reverted automatically
- The WikiEngine then checks RecentChanges for any edits made from the same IP in the last 5 mins and reverts them.
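The steps above can be sketched roughly as follows. This is only an illustration on an in-memory model, not code from any WikiEngine: the `Wiki` and `Edit` classes, the `save` method, and the constants are hypothetical names. As a simplification, it only undoes the most recent revisions of a page when they come from the banned IP, and it never stores the honeypot edit, which has the same effect as reverting the FrontPage.

```python
from dataclasses import dataclass, field

BAN_SECONDS = 10 * 60       # ban length taken from the use case above
LOOKBACK_SECONDS = 5 * 60   # how far back edits from the same IP are reverted


@dataclass
class Edit:
    page: str
    ip: str
    timestamp: float
    text: str


@dataclass
class Wiki:
    # page name -> list of stored revisions, oldest first
    history: dict = field(default_factory=dict)
    # IP address -> time at which its ban expires
    bans: dict = field(default_factory=dict)
    honeypots: set = field(default_factory=lambda: {"FrontPage"})

    def is_banned(self, ip, now):
        return self.bans.get(ip, 0) > now

    def revert_recent(self, ip, now):
        """Drop trailing revisions made by `ip` inside the lookback window."""
        cutoff = now - LOOKBACK_SECONDS
        reverted = []
        for page, revisions in self.history.items():
            changed = False
            while (revisions and revisions[-1].ip == ip
                   and revisions[-1].timestamp >= cutoff):
                revisions.pop()
                changed = True
            if changed:
                reverted.append(page)
        return reverted

    def save(self, edit):
        """Return "saved", "rejected" (IP is banned) or "banned" (honeypot hit)."""
        now = edit.timestamp
        if self.is_banned(edit.ip, now):
            return "rejected"
        if edit.page in self.honeypots:
            # The honeypot edit is discarded, so the page stays unchanged.
            self.bans[edit.ip] = now + BAN_SECONDS
            self.revert_recent(edit.ip, now)
            return "banned"
        self.history.setdefault(edit.page, []).append(edit)
        return "saved"
```

Because the ban is stored as an expiry time, it lapses on its own and needs no cleanup job.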
(optional) Additional Cleanup
- A diff is made before reverting the FrontPage
- The diff is then scanned for URLs the spammer added, e.g.
http://somespamsite.com/
- The WikiEngine searches the entire wiki for occurrences of this URL
- It alerts the admin to all pages containing the spam links, and possibly reverts them automatically if the links were added recently and/or from the same IP
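A minimal sketch of the cleanup scan, assuming page texts are available as plain strings. The helper names `added_urls` and `pages_containing` and the URL pattern are illustrative assumptions; a real WikiEngine would use its own diff and search facilities. The search is a plain substring match, so it also flags longer URLs that start with the spam URL.

```python
import difflib
import re

# Deliberately simple URL pattern; a real engine would likely reuse its link parser.
URL_RE = re.compile(r"https?://[^\s<>\]\)]+")


def added_urls(old_text, new_text):
    """Return URLs found on lines the edit added that were not in the old text."""
    old_urls = set(URL_RE.findall(old_text))
    added = set()
    for line in difflib.ndiff(old_text.splitlines(), new_text.splitlines()):
        if line.startswith("+ "):  # ndiff marks added lines with "+ "
            added.update(URL_RE.findall(line[2:]))
    return added - old_urls


def pages_containing(pages, urls):
    """Map each page name to the spam URLs present in its current text."""
    hits = {}
    for name, text in pages.items():
        found = sorted(u for u in urls if u in text)
        if found:
            hits[name] = found
    return hits
```

The result of `pages_containing` is the list an admin alert could be built from. Whether to revert those pages automatically remains a policy decision.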
Advantages
- Automagical
- This mechanism should detect spam as it happens and remove it accordingly, with very little human intervention
- Low Maintenance
- No huge blacklists (like BadContent) to maintain and download periodically, and no need for central authorities to maintain them
- Low Impact
- It doesn’t scan all pages, every time they’re edited. Normal pages are not scanned unless an intrusion has been detected recently. And even then, only recently-edited pages should be scanned
- Users can edit other pages as usual. It should be entirely transparent to the user.
- IPs are only banned temporarily, and spam URLs are only recorded for a one-time cleanup and then discarded (rather than blacklisting the URL for all future edits). The impact of false positives should be minimal.
- Behaviour-centric (rather than content-centric)
- It targets bad behaviours (content editing by non-human bots), rather than BadContent
- A content-centric detection method only works when someone already knows and has decided what’s spam and what’s not. New, unknown spam links slip past content-centric detection. HoneypotFrontPage can clean up even new, never-seen-before spam.
- HoneypotFrontPage avoids subjective, possibly controversial decisions on what’s a good link and what’s a spam link.
- Soft Security (See MeatBall:SoftSecurity)
- It’s not entirely non-violent, but it’s far from authoritarian.
Implementations
| status | wiki engines |
|---|---|
| Implemented | - |
| Developing | - |
| Intend to Develop | - |
| Considering | - |
| Rejected | - |
Activity
Terminology
Problems
- A human user may miss the warning message on the FrontPage, edit the FrontPage in good faith, and lose all their recent edits elsewhere as a result
- Any edits made recently by other people from the same IP will also be lost. This includes people on the same corporate network or people who happen to be connecting via the same anonymous proxy as the spambot (this can be a big issue, since most spambots do use anonymous proxy servers for obvious reasons).
For these reasons, only very recent edits (say, in the last 5 mins) from the same IP should be reverted automatically.
Alternatively, the WikiEngine could be more aggressive and scan all edits in, say, the last 30 minutes, as long as it only reverts pages where a spam link was added.
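The more aggressive variant might look something like the sketch below. The `Revision` record and the `revisions_to_revert` function are hypothetical. The sketch also makes one interpretive assumption: it considers edits from any IP in the window, and selects only those that introduced a known spam URL. An IP check could be added to the condition if the stricter reading is wanted.

```python
from dataclasses import dataclass

WINDOW_SECONDS = 30 * 60  # the "last 30 minutes" window suggested above


@dataclass
class Revision:
    page: str
    timestamp: float
    before: str  # page text before the edit
    after: str   # page text after the edit


def revisions_to_revert(recent_changes, spam_urls, now):
    """Select revisions inside the window that introduced a spam URL."""
    cutoff = now - WINDOW_SECONDS
    selected = []
    for rev in recent_changes:
        if rev.timestamp < cutoff:
            continue  # too old, leave it for the admin to review
        # A URL counts as introduced if it is in the new text but not the old.
        if any(url in rev.after and url not in rev.before for url in spam_urls):
            selected.append(rev)
    return selected
```

Edits inside the window that did not add a spam link are left alone, which is what keeps the wider window from hurting other users behind the same proxy.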
One way to provide a safety net is to show the careless user a list of the pages that were reverted. A spambot will disregard the list (or try to spam every page on the list, which will be rejected). A careless user acting in good faith will try to restore all the pages, even if the changes are someone else’s (other people connected through the same proxy server). MoinMoin has separate permissions for `revert` and `write`, so it’s entirely possible for a user banned from editing to revert pages. On WikiEngines with no separate `revert` right, the careless user will have to wait no more than 5-10 mins until the ban expires, which is a pretty reasonable wait.
If this method becomes popular, spammers may rewrite their spambots to skip the FrontPage entirely and post spam to other pages only, rendering this feature useless. However, nothing stops the admin from setting up other honeypot pages on their wiki as a workaround.
(should we rename this feature simply HoneypotPage ?)
See Also
- A commonly suggested antispam method is URL blacklisting (see LinkBan). HoneypotFrontPage fixes some of the flaws in that method.
- RejectDuplicate is another suggested feature that targets behavior, rather than content. (Do we need a page to talk about TargetBehavior in general?)
A copy of this page is at http://moinmoin.wikiwikiweb.de/HoneypotFrontPage .
Contributors
CategoryFeature CategoryEditing