
Diacritics in talk page permalinks cause comments to not be found
Closed, Resolved, Public

Description

This problem was first noticed at German Wikipedia, with a permalink involving an ö. I ran some tests at French Wikipedia, as the French language has diacritics too.

I took a pair of topics on https://fr.wikipedia.org/wiki/Wikip%C3%A9dia:Le_Bistro/25_janvier_2024. These messages were transcluded to https://fr.wikipedia.org/wiki/Wikip%C3%A9dia:Le_Bistro earlier this week. I emulated the case of a user who accessed the messages while they were displayed on the main page.

According to @matmarex:

ö is double-encoded in the API request. Probably easily fixable
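To illustrate what double encoding means here, the following sketch uses only standard JavaScript functions. It is not the DiscussionTools code, and the variable names are made up for the example; it only shows how encoding an already-encoded ö produces a value that no longer decodes back to ö in a single step.

```javascript
// One round of percent-encoding turns "ö" into its UTF-8 bytes as escapes.
const encodedOnce = encodeURIComponent('ö');
// Encoding the already-encoded text again escapes the "%" signs themselves.
const encodedTwice = encodeURIComponent(encodedOnce);

console.log(encodedOnce);  // %C3%B6
console.log(encodedTwice); // %25C3%25B6

// A receiver that decodes once gets the literal text "%C3%B6", not "ö",
// so a lookup by heading text would not match.
const receiverSees = decodeURIComponent(encodedTwice);
console.log(receiverSees === 'ö'); // false
```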

I'm looking for other cases, and I'm sure any language with diacritics will observe this quite serious issue.

Event Timeline

Change 994253 had a related patch set uploaded (by DLynch; author: DLynch):

[mediawiki/extensions/DiscussionTools@master] decodeURI fragments before sending them to discussiontoolsfindcomment

https://gerrit.wikimedia.org/r/994253
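The idea named in the patch title can be sketched roughly as follows. This is a minimal illustration under assumptions, not the merged change: `decodeFragment` is a hypothetical helper, and the fallback for malformed escapes is an added precaution that the patch may or may not include.

```javascript
// Hypothetical helper (not the actual DiscussionTools function):
// turn a URL hash into plain text before it is used as a search term.
function decodeFragment(hash) {
  // Drop the leading "#" if present.
  const fragment = hash.startsWith('#') ? hash.slice(1) : hash;
  try {
    // decodeURI turns escapes such as %C3%B6 back into "ö".
    return decodeURI(fragment);
  } catch (e) {
    // decodeURI throws URIError on malformed escapes; keep the raw text then.
    return fragment;
  }
}

console.log(decodeFragment('#Gespr%C3%A4ch_%C3%BCber_K%C3%B6ln')); // Gespräch_über_Köln
console.log(decodeFragment('#100%')); // 100%
```

Because the request layer percent-encodes parameters once on its own, passing the decoded text means the server receives a singly encoded value.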

Change 994253 merged by jenkins-bot:

[mediawiki/extensions/DiscussionTools@master] decodeURI fragments before sending them to discussiontoolsfindcomment

https://gerrit.wikimedia.org/r/994253

Change 994234 had a related patch set uploaded (by DLynch; author: DLynch):

[mediawiki/extensions/DiscussionTools@wmf/1.42.0-wmf.15] decodeURI fragments before sending them to discussiontoolsfindcomment

https://gerrit.wikimedia.org/r/994234

Change 994235 had a related patch set uploaded (by DLynch; author: DLynch):

[mediawiki/extensions/DiscussionTools@wmf/1.42.0-wmf.16] decodeURI fragments before sending them to discussiontoolsfindcomment

https://gerrit.wikimedia.org/r/994235

Change 994234 merged by jenkins-bot:

[mediawiki/extensions/DiscussionTools@wmf/1.42.0-wmf.15] decodeURI fragments before sending them to discussiontoolsfindcomment

https://gerrit.wikimedia.org/r/994234

Change 994235 merged by jenkins-bot:

[mediawiki/extensions/DiscussionTools@wmf/1.42.0-wmf.16] decodeURI fragments before sending them to discussiontoolsfindcomment

https://gerrit.wikimedia.org/r/994235

Mentioned in SAL (#wikimedia-operations) [2024-01-31T14:08:56Z] <urbanecm@deploy2002> Started scap: Backport for [[gerrit:994234|decodeURI fragments before sending them to discussiontoolsfindcomment (T356199)]], [[gerrit:994235|decodeURI fragments before sending them to discussiontoolsfindcomment (T356199)]], [[gerrit:994708|Add an exception for ConvenientDiscussions-style permalinks (T349653)]], [[gerrit:994709|Add an exception for ConvenientDiscussions-style permalinks (T349653)]

Mentioned in SAL (#wikimedia-operations) [2024-01-31T14:10:28Z] <urbanecm@deploy2002> urbanecm and kemayo and matmarex and daimona: Backport for [[gerrit:994234|decodeURI fragments before sending them to discussiontoolsfindcomment (T356199)]], [[gerrit:994235|decodeURI fragments before sending them to discussiontoolsfindcomment (T356199)]], [[gerrit:994708|Add an exception for ConvenientDiscussions-style permalinks (T349653)]], [[gerrit:994709|Add an exception for ConvenientDiscuss

Mentioned in SAL (#wikimedia-operations) [2024-01-31T14:19:27Z] <urbanecm@deploy2002> Finished scap: Backport for [[gerrit:994234|decodeURI fragments before sending them to discussiontoolsfindcomment (T356199)]], [[gerrit:994235|decodeURI fragments before sending them to discussiontoolsfindcomment (T356199)]], [[gerrit:994708|Add an exception for ConvenientDiscussions-style permalinks (T349653)]], [[gerrit:994709|Add an exception for ConvenientDiscussions-style permalinks (T349653)

For Tech News:

Talk page permalinks that included diacritics were malfunctioning. This has been fixed.

I think the Tech News entry should mention non-Latin-script languages as well, because they’ve had an even bigger problem: on Latin-script wikis like French or German, a greater or lesser share of the links were broken, while on non-Latin-script ones like Chinese or Russian, basically all of them were.
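A plausible reading of this difference is that only characters outside ASCII are percent-encoded, so a heading is affected only if it contains at least one such character. The sketch below, with a hypothetical helper `escapedShare` and made-up sample headings, measures what fraction of a heading's characters `encodeURI` would escape.

```javascript
// Hypothetical helper: fraction of characters that encodeURI would escape.
function escapedShare(heading) {
  // Array.from splits by code point, so each element is a whole character.
  const chars = Array.from(heading);
  const escaped = chars.filter((c) => encodeURI(c) !== c);
  return escaped.length / chars.length;
}

console.log(escapedShare('Bistro'));     // 0    (pure ASCII, link unaffected)
console.log(escapedShare('Köln'));       // 0.25 (one diacritic is enough to break the link)
console.log(escapedShare('Обсуждение')); // 1    (every character is escaped)
```

On a wiki whose script is entirely non-ASCII, nearly every heading falls into the last category, which matches the observation that basically all links were broken there.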

Change 995272 had a related patch set uploaded (by Esanders; author: Esanders):

[mediawiki/extensions/DiscussionTools@master] Use decodeURI for comment ID searches as well as heading searches

https://gerrit.wikimedia.org/r/995272

The above patch fixes it for certain comment permalinks too. We will try to deploy it on Monday.

Change 995272 merged by jenkins-bot:

[mediawiki/extensions/DiscussionTools@master] Use decodeURI for comment ID searches as well as heading searches

https://gerrit.wikimedia.org/r/995272

Change 997279 had a related patch set uploaded (by DLynch; author: Esanders):

[mediawiki/extensions/DiscussionTools@wmf/1.42.0-wmf.16] Use decodeURI for comment ID searches as well as heading searches

https://gerrit.wikimedia.org/r/997279

Change 997279 merged by jenkins-bot:

[mediawiki/extensions/DiscussionTools@wmf/1.42.0-wmf.16] Use decodeURI for comment ID searches as well as heading searches

https://gerrit.wikimedia.org/r/997279

Mentioned in SAL (#wikimedia-operations) [2024-02-05T21:42:29Z] <cjming@deploy2002> Started scap: Backport for [[gerrit:997279|Use decodeURI for comment ID searches as well as heading searches (T356199)]]

Mentioned in SAL (#wikimedia-operations) [2024-02-05T21:43:49Z] <cjming@deploy2002> cjming and kemayo: Backport for [[gerrit:997279|Use decodeURI for comment ID searches as well as heading searches (T356199)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)

Mentioned in SAL (#wikimedia-operations) [2024-02-05T21:55:38Z] <cjming@deploy2002> Finished scap: Backport for [[gerrit:997279|Use decodeURI for comment ID searches as well as heading searches (T356199)]] (duration: 13m 08s)