Affected versions
Symfony HTML Sanitizer component versions >=6.1, <6.4.41; >=7, <7.4.13; and >=8, <8.0.13 are affected by this security issue.
The issue has been fixed in Symfony 6.4.41, 7.4.13, and 8.0.13.
Description
Symfony's UrlSanitizer::parse() rejects URLs containing raw Unicode
explicit-direction BiDi formatting
characters (U+202A–U+202E, U+2066–U+2069) as a defense against
visual-spoofing of the rendered href. The check covers only the raw
UTF-8 forms of those code points: the percent-encoded forms
(%E2%80%AE for U+202E, %E2%81%A6 for U+2066, etc.) are not matched
by the deny regex, survive league/uri's parse/build cycle, and are
re-emitted unchanged in the sanitized URL. Any downstream consumer that
decodes the link before display restores the BiDi character and the
visual spoof that the original defense was meant to prevent. Such
consumers include phishing-detection filters that compare
urldecode($href) against a domain allow-list, audit-log dashboards
that show a decoded form for readability, hover-tooltip previews, and
federated/syndicated content where the decoder lives on the consuming
side.
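The gap between the raw and percent-encoded forms can be illustrated with a minimal sketch. The pattern and the containsRawBidi() helper below are hypothetical stand-ins written for this example; they are not Symfony's actual deny regex or API.

```php
<?php
// Hypothetical deny pattern covering only the raw UTF-8 forms of the
// explicit-direction BiDi characters (illustration only, not Symfony's regex).
const RAW_BIDI_PATTERN = '/[\x{202A}-\x{202E}\x{2066}-\x{2069}]/u';

// Returns true when the string contains a raw BiDi formatting character.
function containsRawBidi(string $url): bool
{
    return 1 === preg_match(RAW_BIDI_PATTERN, $url);
}

// U+202E (RIGHT-TO-LEFT OVERRIDE) as raw UTF-8 and as percent-encoding.
$raw     = "https://example.com/\u{202E}gpj.exe";
$encoded = 'https://example.com/%E2%80%AEgpj.exe';

var_dump(containsRawBidi($raw));                // bool(true): raw form is caught
var_dump(containsRawBidi($encoded));            // bool(false): encoded form is missed
var_dump(containsRawBidi(urldecode($encoded))); // bool(true): decoding restores it
```

The third call mirrors what a downstream consumer does when it decodes the sanitized link before display.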
The same UrlSanitizer::parse() carries an ASCII-only /\s/
whitespace check (no /u modifier) intended as a backstop against
malformed URLs. Without the /u modifier, PCRE's \s matches only
ASCII whitespace, so Unicode whitespace characters pass through
unchanged in both raw and percent-encoded forms: NBSP (U+00A0), the
zero-width no-break space / BOM (U+FEFF), line/paragraph separators
(U+2028, U+2029), ogham space (U+1680), the en/em quad family
(U+2000–U+200A), narrow / medium / ideographic spaces (U+202F, U+205F,
U+3000) and NEL (U+0085). In hostname positions they enable lookalike spoofs
(example<NBSP>.com); in path/query/fragment they enable allow-list
drift when a downstream consumer strips whitespace before comparison.
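The following sketch shows why a byte-oriented /\s/ check does not see these characters. The UNICODE_WS pattern is a hypothetical comparison pattern built from the code points listed above, not the pattern used in the fix, and the results assume PHP's default C locale for PCRE character tables.

```php
<?php
// The ASCII-only check described above (no /u modifier).
const ASCII_WS = '/\s/';
// Hypothetical Unicode-aware pattern, for comparison only: separators (\p{Z}),
// NEL (U+0085) and the zero-width no-break space (U+FEFF).
const UNICODE_WS = '/[\s\p{Z}\x{0085}\x{FEFF}]/u';

$samples = [
    'NBSP U+00A0'                      => "\u{00A0}",
    'NEL U+0085'                       => "\u{0085}",
    'line separator U+2028'            => "\u{2028}",
    'ideographic space U+3000'         => "\u{3000}",
    'zero-width no-break space U+FEFF' => "\u{FEFF}",
];

$results = [];
foreach ($samples as $label => $char) {
    // Place the character in the hostname, as in example<NBSP>.com.
    $url = "https://example{$char}.com/";
    $results[$label] = [
        'ascii'   => preg_match(ASCII_WS, $url),   // byte-wise match, misses multi-byte characters
        'unicode' => preg_match(UNICODE_WS, $url), // code-point match
    ];
    printf("%-34s ascii=%d unicode=%d\n", $label, $results[$label]['ascii'], $results[$label]['unicode']);
}
```

Each sample is encoded in UTF-8 as two or three bytes above 0x7F, none of which is an ASCII whitespace byte, so the first check has nothing to match.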
Resolution
UrlSanitizer::parse() now denies BiDi formatting marks together with
Unicode whitespace and the zero-width no-break space, in both the raw
input and the percent-decoded form of each parsed URL component
(user, pass, host, path, query, fragment). ASCII
space remains tolerated in path/query/fragment via the existing
percent-encoding step.
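The approach described above can be approximated as follows. This is a rough sketch, not the code of the actual patch: it uses PHP's built-in parse_url() instead of league/uri, and DENIED_CHARS, containsDeniedChar() and isUrlAllowed() are hypothetical names introduced for the example. Treating undecodable (invalid UTF-8) input as denied is a choice made for this sketch only.

```php
<?php
// Hypothetical deny class: BiDi formatting marks, the Unicode whitespace code
// points listed in the description, and U+FEFF. ASCII space is not included.
const DENIED_CHARS = '/[\x{202A}-\x{202E}\x{2066}-\x{2069}\x{0085}\x{00A0}\x{1680}'
    . '\x{2000}-\x{200A}\x{2028}\x{2029}\x{202F}\x{205F}\x{3000}\x{FEFF}]/u';

function containsDeniedChar(string $value): bool
{
    // preg_match() returns false on invalid UTF-8; this sketch treats that as denied.
    return 0 !== preg_match(DENIED_CHARS, $value);
}

function isUrlAllowed(string $url): bool
{
    // 1. Check the raw input.
    if (containsDeniedChar($url)) {
        return false;
    }

    // 2. Check the percent-decoded form of each parsed component.
    $parts = parse_url($url);
    if (false === $parts) {
        return false;
    }
    foreach (['user', 'pass', 'host', 'path', 'query', 'fragment'] as $key) {
        if (isset($parts[$key]) && containsDeniedChar(rawurldecode($parts[$key]))) {
            return false;
        }
    }

    return true;
}

var_dump(isUrlAllowed('https://example.com/a%20b?q=hello%20world')); // bool(true)
var_dump(isUrlAllowed('https://example.com/%E2%80%AEgpj.exe'));      // bool(false)
var_dump(isUrlAllowed("https://example.com/\u{202E}gpj.exe"));       // bool(false)
```

Decoding each component separately, rather than the whole URL at once, keeps the check aligned with how the parsed parts are later rebuilt into the sanitized URL.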
The patches for this issue are available for branch 6.4 (and forward-ported to 7.4, 8.0 and 8.1).
Credits
We would like to thank Scott Arciszewski for discovering the issue and Nicolas Grekas for providing the fix.