Contributed by Mathieu in #49300

Take a look at the following two domain names: "symfony.com" and "ѕymfony.com". They look similar, but they are not the same. In the second domain, the first letter is not s (the lowercase letter s in the Latin script) but ѕ (a letter called dze in the Cyrillic script).
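The difference is easy to confirm by comparing code points. The following is a minimal sketch, assuming the mbstring extension is available; the variable names are only illustrative:

```php
<?php
// Two strings that render almost identically but differ in their first character.
$latin = 'symfony.com';            // starts with U+0073 LATIN SMALL LETTER S
$cyrillic = "\u{0455}ymfony.com";  // starts with U+0455 CYRILLIC SMALL LETTER DZE

var_dump($latin === $cyrillic);                  // bool(false)

// mb_ord() returns the code point of the first character of the string.
printf("U+%04X\n", mb_ord($latin, 'UTF-8'));     // U+0073
printf("U+%04X\n", mb_ord($cyrillic, 'UTF-8'));  // U+0455

// The Cyrillic letter takes two bytes in UTF-8, so the byte lengths differ too.
var_dump(strlen($latin));                        // int(11)
var_dump(strlen($cyrillic));                     // int(12)
```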

Using different but similar-looking characters is the basis of IDN homograph attacks, a type of spoofing attack. That's why it is recommended to check user-submitted, public-facing identifiers for suspicious characters in order to prevent such attacks.

However, given that Unicode defines roughly 150,000 valid characters, this is a daunting task. For example, did you know that there are invisible characters such as zero-width spaces? And what about mixing 8 (digit eight in Latin script) and ৪ (digit four in Bengali script)? Don't forget about combining characters either, such as the "combining dot above", which is effectively invisible when placed after the character i because that letter already has a dot.
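To make these cases concrete, here is a small sketch in plain PHP, assuming the mbstring extension and PCRE with Unicode support. It only shows that such strings differ from what they appear to be; it does not detect them the way the constraint does, and the variable names are illustrative:

```php
<?php
// A zero-width space (U+200B) hidden between "ad" and "min".
$plain = 'admin';
$withZeroWidthSpace = "ad\u{200B}min";

var_dump($plain === $withZeroWidthSpace);            // bool(false)
var_dump(mb_strlen($plain, 'UTF-8'));                // int(5)
var_dump(mb_strlen($withZeroWidthSpace, 'UTF-8'));   // int(6)

// The letter i followed by U+0307 COMBINING DOT ABOVE: two code points.
$dotted = "i\u{0307}";
var_dump(mb_strlen($dotted, 'UTF-8'));               // int(2)

// Digit eight followed by U+09EA BENGALI DIGIT FOUR.
$mixedDigits = "8\u{09EA}";
var_dump(preg_match('/^[0-9]+$/', $mixedDigits));    // int(0): not ASCII digits only
var_dump(preg_match('/^\p{Nd}+$/u', $mixedDigits));  // int(1): both are Unicode decimal digits
```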

In Symfony 6.3, we're introducing a new NoSuspiciousCharacters constraint so you can validate that strings don't contain any of these problematic characters. It's based on the Spoofchecker class provided by the PHP intl extension and it works as follows:

// src/Entity/User.php
namespace App\Entity;

use Symfony\Component\Validator\Constraints as Assert;

class User
{
    #[Assert\NoSuspiciousCharacters(
        // checks for invisible characters (e.g. zero-width spaces) and for
        // digits mixed from different numbering systems (e.g. 8 and ৪)
        checks: Assert\NoSuspiciousCharacters::CHECK_INVISIBLE | Assert\NoSuspiciousCharacters::CHECK_MIXED_NUMBERS,
        restrictionLevel: Assert\NoSuspiciousCharacters::RESTRICTION_LEVEL_HIGH,
    )]
    private string $username;

    #[Assert\NoSuspiciousCharacters]
    private string $blogUrl;

    // ...
}
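
Since the constraint builds on Spoofchecker, it can help to see that class used on its own. The following is a rough sketch, not the constraint's actual implementation. It assumes the intl extension is installed, and `looksSuspicious()` is a hypothetical helper name. No expected output is given because the result depends on the ICU version intl was built against:

```php
<?php
// Hypothetical helper: runs the given Spoofchecker checks against a string.
function looksSuspicious(string $value, int $checks): bool
{
    $checker = new Spoofchecker();
    $checker->setChecks($checks);

    return $checker->isSuspicious($value);
}

// The letter a followed by the same non-spacing mark (U+0300) twice,
// a sequence that is unlikely to display distinctly.
$candidate = "a\u{0300}\u{0300}";

var_dump(looksSuspicious('admin', Spoofchecker::INVISIBLE));
var_dump(looksSuspicious($candidate, Spoofchecker::INVISIBLE));
```

In an application, the constraint remains the better entry point because it integrates with the Validator component and its violation messages.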
    

Read the NoSuspiciousCharacters constraint docs to learn more about its usage and options.

Published in #Living on the edge

Alessandro GIULIANI said on Apr 18, 2023 at 10:56

Great feature! Is it possible to enable that constraint globally? Or do we have to implement it ourselves (with some events)?

Philippe Gamache said on Apr 18, 2023 at 15:22

Alessandro,

Don't use this constraint globally, only where those characters can cause problems, such as in a domain name or username. Those characters are legitimate. Some keyboards produce them automatically. For example, a Japanese keyboard can type the same digits (0-9) with different Unicode code points, because a wider version of each digit is needed when it is mixed with kanji.

Sometimes what you need is a transliterator instead (complete or digits only). A good example would be a phone number.

New in Symfony 6.3 NoSuspiciousCharacters Constraint
