Page MenuHomePhabricator

Protect the reverted edits feature from abuse
Closed, ResolvedPublic

Description

The design for the reverted edits feature can be found in T254074: Implement the reverted edit tag. It isn't merged yet, so there's still time to change a few things. When discussing the feature with @kostajh and @Catrope, we found that there are quite a few opportunities for abusing the system by vandals.

What can go wrong?

This comes from a person who has spent quite a lot of time fighting really nasty and desperate vandals. It may a bit pessimistic.

  1. A vandal can undo a large amount of edits, potentially even entire pages' history using the undoafter= parameter. That would be rolled back later by a moderator, but the reverted tag is permanent and the entire history would be "reverted". This would render the feature utterly useless. There is also no way for an admin to remove these tags, they are software tags.
  2. A vandal can do false undos: first, start a large undo using the undoafter= param and then instead of saving, copy the top revision's content to the edit field, fix a comma and save the edit. Effectively this edit was legitimate (fixing a comma), but it was marked as an undo reverting a huge amount of edits. This is bad.
  3. Someone doing a personal vendetta against a certain user can willfully undo (or false undo) their edits on many pages, even intermediate edits (undo allows that). That would make the targeted user have lots of their edits marked as reverted. Don't laugh, even people on Wikimedia wikis do childish stuff like this all the time. I really would not want this feature to become a new playground for immature editors. That would be very sad.
  4. Not only undos can be exploited (though they are easiest), rollbacks and manual reverts leave a permanent record as well.

Why is this bad?

When vandals can influence the feature heavily, it stops being useful to anyone and the communities will just give up on it. However, if the feature is only mildly abused, the communities may be tempted to use it in many areas (like hiding reverted edits in page histories and on RC, counting only non-reverted edits of users when determining their eligibility for certain user rights etc.) – in such a case abuse would compromise it and potentially lead to conflicts in the community.

IMO we should only introduce it if the are sensible safeguards in place.

Possible mitigations

Here I list some ideas on how to fix it, not necessarily ideas that I'd support fully :)

  1. Allow only users with a certain user right to set the reverted tag.
    • Pros: It would certainly limit the abuse to a minimum.
    • Cons: This would make the reverted tag a bit inconsistent – legitimate reverts made by non-moderators would not apply the reverted tag, which seems a bit odd. It would also make writing a maintenance script for populating the reverted tag on old edits (see: T258921) close to impossible, as it would have to figure out what user rights people had in the past based on the logs. That would be a mess.
  2. Apply the reverted tag only on relatively shallow reverts, let's say 8 edits were reverted at most. If the revert is deeper, nothing would be marked as reverted. The maximum depth can be configurable.
    • We could even limit this to 1 edit, to cover only the most common case of reverting. That does seem very incomplete, though.
    • Pros: This would limit how much damage can a vandal do while covering most legitimate use cases.
    • Cons: It would be inconsistent with what EditResult says to extensions, as it doesn't have such a mechanism. It would also not solve the problem entirely, just limit the speed at which a vandal can do the damage.
  3. Do not mark undos where the user has applied additional changes over it as reverts. Count these as regular edits. Another possibility is an additional flag in EditResult saying "this is an undo, but the user then applied some more changes and we are not sure if it's a legit undo or not", though this would make it quite confusing to extension developers, I think.
    • Pros: It would solve issue no. 2.
    • Cons: This will require some creative hacking in EditPage. That will be fun.
  4. Apply the reverted tag only after the edit was somehow approved. Core could provide a special hook for cancelling job queue submission. Extensions would be responsible for re-enqueueing the job once the edit is "approved".
    • If the wiki uses ApprovedRevs or FlaggedRevs, use that. Once the edit is approved, its reverted tag update is sent to the job queue. I'm not sure if Extension:Moderation would require some special treatment as well.
    • If the wiki uses patrolling, use that. Users with autopatrol right have their reverted edit updates enqueued right away, while others have to wait for someone to patrol their edit.
    • Pros: It would solve most issues. I think.
    • Cons: A bit complicated to implement and requiring some additional coordination with extensions. Also: wikis that use patrolling, but not really (i.e. many changes never get patrolled) may be a bit confused by this feature. Anyway, even if the community completely ignores patrolling, this would behave like mitigation #1, which is not terrible actually.
  5. Per suggestion from @DannyS712: we could allow the tag to be removed manually (and perhaps added manually in case it was accidentally removed incorrectly).
    • Maybe we could allow the modification of all software tags by users with a special user right? This can currently be done only by the system administrator manually. That is beyond the scope of this task, though.
    • Pros: This would allow admins to correct mistakes made by the software. Any kind of abuse could be mitigated manually.
    • Cons: Would require custom handling in the ChangeTags class to allow for this with a system tag. This would also be quite annoying to admins if it was to be done routinely. Maybe we could keep it as a last resort that should only be used in rare cases?
  6. Or… completely rethink how edits, reverts and page history works. Make it more like Git, less like a one-dimensional timeline.
    • Pros: It would solve everything.
    • Cons: This is basically a rewrite of huge parts of MediaWiki. Extremely hard (or even impossible), but I'm including this here to show that the reverted edits feature is tough to implement properly due to the way MediaWiki was designed.

I would personally go with #3 and #4. They aren't the easiest, but I think they should provide minimal safeguards for the feature. This will also require considerable development time, so I'll probably have to reschedule my internship a bit.

Any comments or suggestions on the topic are very welcome.

Event Timeline

...also, could allow the tag to be removed manually (and perhaps added manually in case it was accidentally removed incorrectly)? Would require custom handling in the ChangeTags class to allow for this with a system tag

...also, could allow the tag to be removed manually (and perhaps added manually in case it was accidentally removed incorrectly)? Would require custom handling in the ChangeTags class to allow for this with a system tag

Yeah, that's a possibility as well, though I can imagine this will be quite annoying to admins – a lot of clicking and not much of a profit. I'll write this down in the task description :)

Thanks for writing this up. I agree that #3 and #4 sound like the best ways forward. cc'ing @daniel in case he has thoughts on this as well.

Change 616815 had a related patch set uploaded (by Ostrzyciel; owner: Ostrzyciel):
[mediawiki/core@master] EditPage: don't mark "dirty" undos as undos

https://gerrit.wikimedia.org/r/616815

Per commit message, THIS CHANGES THE BEHAVIOR OF EditPage IN A NOTICABLE MANNER.

Per commit message, THIS CHANGES THE BEHAVIOR OF EditPage IN A NOTICABLE MANNER.

Yes it does, I think it's justified here as it would be more consistent if the same rules applied to the mw-undo tag, mw-reverted tag and extensions using PageSaveComplete hook. An alternative to that would be a special flag in EditResult, but frankly that would be just mostly confusing. (BTW it should be "noticeable"…)

I think it's okay here – edits won't be marked as undos only if EditPage can't guarantee that the edit really is a meaningful undo.

Per commit message, THIS CHANGES THE BEHAVIOR OF EditPage IN A NOTICABLE MANNER.

Yes it does, I think it's justified here as it would be more consistent if the same rules applied to the mw-undo tag, mw-reverted tag and extensions using PageSaveComplete hook. An alternative to that would be a special flag in EditResult, but frankly that would be just mostly confusing. (BTW it should be "noticeable"…)

I think it's okay here – edits won't be marked as undos only if EditPage can't guarantee that the edit really is a meaningful undo.

I was just explaining why I was adding the user-notice tag

Oooh, okay, nvm, I didn't see that. My bad. :)

If I understand correctly, this is not yet ready for announcement. For once it is ready, please could you provide an explanatory entry for TechNews. Just 1-2 sentences explaining what the change is. Thanks!

Change 616815 merged by jenkins-bot:
[mediawiki/core@master] EditPage: don't mark "dirty" undos as undos

https://gerrit.wikimedia.org/r/616815

Change 616815 merged by jenkins-bot:
[mediawiki/core@master] EditPage: don't mark "dirty" undos as undos

https://gerrit.wikimedia.org/r/616815

Merged -> ready to announce

If I understand correctly, this is not yet ready for announcement. For once it is ready, please could you provide an explanatory entry for TechNews. Just 1-2 sentences explaining what the change is. Thanks!

@Ostrzyciel can you write the summary?

If I understand correctly, this is not yet ready for announcement. For once it is ready, please could you provide an explanatory entry for TechNews. Just 1-2 sentences explaining what the change is. Thanks!

Edits done using the "undo" link will now be only marked with the undo change tag if the user did not change anything in the edit window before saving. This is to prevent users from marking arbitrary edits as undos.

...but I'm not that great at wording things in English, so someone may come up with something better. :)

@Ostrzyciel I added a draft to https://meta.wikimedia.org/wiki/Tech/News/2020/34 based on your text but with somewhat simplified language. Looks OK? I'm correct in understanding it's part of this week's train?

@Johan I'm not @Ostrzyciel but that draft looks good to me.

@Ostrzyciel I added a draft to https://meta.wikimedia.org/wiki/Tech/News/2020/34 based on your text but with somewhat simplified language. Looks OK? I'm correct in understanding it's part of this week's train?

Looks great, thanks! :) I think there was typo in this draft, I fixed it.

Is the tag meant to remember the fact that someone clicked a particular button, or that the content changed in a particular way? It seems to me it does the former, when possibly it should do the latter, and the disconnect between the two opens the gate for vandalism and/or misunderstanding.

In some cases (infrequently), I save all or part of an earlier version of a page to my scratchpad, click undo, and then paste the altered scratchpad to the edit window in the appropriate place. This is "better" than a "naked undo" in some cases.

Use case: vandal-1 uses Visual Editor to vandalize the birth date in the Infobox of some politician by altering the year. This is a 0-byte change, but because of how VE uses Infobox template data, this shows up as an 813-byte addition, as VE aligns every field value in the long Infobox into faux "columns" by adding blanks before the equal sign.

An undo is required to both signal the vandalism and tag it as an "undo"; however, the old, unaligned infobox may not be desirable. Solution:

  1. click undo on the vandal's edit
  2. copy the Infobox from the "before" (left) window to a scratchpad
  3. restore the original birth date in the scratchpad
  4. paste the (still-aligned) Infobox from the scratchpad over the Infobox in the Edit window
  5. add edit summary and publixh.

The vandal's edit in history shows up as +813b because of all the added blanks.
The undo shows up as (0 bytes), but it is the correct action here.

(More details at Does VE munge white space?; also #VEwhitespaceMunge. ) Note: it's not my intention to conflate that VE issue with T259014; this use case is merely illustrative, and I used it because I'm familiar with it. I imagine that other use cases exist where the content saved is not what clicking the "undo" button places in the edit window, but is nevertheless the "right" thing to do.

Is the tag meant to remember the fact that someone clicked a particular button, or that the content changed in a particular way? It seems to me it does the former, when possibly it should do the latter, and the disconnect between the two opens the gate for vandalism and/or misunderstanding.

Now it's effectively the latter. There are 3 ways an edit could marked as a revert:

  • Rollback which guarantees the page is rolled back to some previous state.
  • Manual revert which is basically the same as rollback, just with a different change tag.
  • Undo, which up until this patch could be any edit. Now only edits that are a result of a 3-way merge are marked as undos, thus there's no way to cheat the system here as well.

In some cases (infrequently), I save all or part of an earlier version of a page to my scratchpad, click undo, and then paste the altered scratchpad to the edit window in the appropriate place. This is "better" than a "naked undo" in some cases.

Use case: vandal-1 uses Visual Editor (...)

Yeah, that's a valid use case and marking such an edit as "undo" may be okay. The problem is there's absolutely no control of what's pasted in the edit field, so it can be something legitimate (as you described) or something completely different. It is also possible declare that such a "false undo" undoes dozens of edits where in fact it just fixes a comma. I think the potential for abuse is just not worth it, it's better to keep the undo tag for cases where we can be sure the edit really is an undo.

An alternative approach would be to still mark these "dirty undos" with the undo tag, but add an additional flag saying "we have no idea whether it's an undo or not, we just mark it this way", but this would make the feature even more confusing, as some undo tags would apply the reverted tag and some wouldn't. I just think simpler is better in this case :)

Change 620896 had a related patch set uploaded (by Ostrzyciel; owner: Ostrzyciel):
[mediawiki/core@master] EditResult: update undo-related docs

https://gerrit.wikimedia.org/r/620896

Change 620896 merged by jenkins-bot:
[mediawiki/core@master] EditResult: update undo-related docs

https://gerrit.wikimedia.org/r/620896