bug: linkify with entities inside anchor strings are incorrectly escaped #704

@mcleanmds

Describe the bug

Calling linkify on a string that has entities inside anchor element text results in the & character of each entity being incorrectly escaped to &amp;
e.g. &nbsp; -> &amp;nbsp;

Python and bleach versions:

  • Python Version: 3.9.5
  • Bleach Version: 6.0.0

To Reproduce

A simple test to reproduce the behavior:

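from bleach import linkify

# Entities appear both inside the anchor text and after the closing </a>.
text = r'<p><a href="/">Some&nbsp;entity&rsquo;s</a>More&nbsp;entity&rsquo;s</p>'
expected = r'<p><a href="/" rel="nofollow">Some&nbsp;entity&rsquo;s</a>More&nbsp;entity&rsquo;s</p>'
# Fails on bleach 6.0.0: the entities inside the anchor come back as &amp;nbsp; and &amp;rsquo;
assert linkify(text) == expected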

Expected behavior

>>> linkify(r'<a href="/">Some&nbsp;entity&rsquo;s</a>')
'<a href="/" rel="nofollow">Some&nbsp;entity&rsquo;s</a>'

Actual behavior

>>> linkify(r'<a href="/">Some&nbsp;entity&rsquo;s</a>')
'<a href="/" rel="nofollow">Some&amp;nbsp;entity&amp;rsquo;s</a>'
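The actual output looks like what you get when text that already contains entities is escaped a second time. Here is a minimal standard-library sketch of that symptom; it is not bleach's code, and I am only assuming that double escaping is the mechanism. `find_double_escaped` is a hypothetical helper name, meant as something a regression test could run over linkify output.

```python
import html
import re

# Matches an entity whose leading "&" has itself been escaped, e.g. "&amp;nbsp;".
DOUBLE_ESCAPED = re.compile(r"&amp;(#[0-9]+|#x[0-9a-fA-F]+|[A-Za-z][A-Za-z0-9]*);")


def find_double_escaped(markup):
    """Return every double-escaped entity found in markup (hypothetical helper)."""
    return [m.group(0) for m in DOUBLE_ESCAPED.finditer(markup)]


anchor_text = "Some&nbsp;entity&rsquo;s"

# Escaping text that already contains entities turns each "&" into "&amp;".
escaped_again = html.escape(anchor_text)

print(escaped_again)
print(find_double_escaped(escaped_again))
print(find_double_escaped(anchor_text))
```

Note that the pattern would also flag input that legitimately contains &amp;nbsp; (meaning the literal text "&nbsp;"), so it only makes sense as a check when the input is known not to contain such sequences.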

Additional context

This bug was introduced in 6.0.0 with the fix for #501 and #692.
