New in Symfony 3.2: Unicode routing support

Contributed by Nicolas Grekas in #19604.

In Symfony 3.2, the routing component has been improved to add support for UTF-8 characters in route paths and requirements. Thanks to the new utf8 route option, you can make Symfony match and generate routes with UTF-8 characters:

use Sensio\Bundle\FrameworkExtraBundle\Configuration\Route;

/**
 * @Route(
 *     "/category/{name}",
 *     requirements={"name": ".+"},
 *     options={"utf8": true}
 * )
 */
public function categoryAction($name)
{
    // ...
}

In this route, setting the utf8 option to true makes Symfony treat the . in the requirement as matching any UTF-8 character instead of a single byte, so the following URLs would match: /category/日本語, /category/فارسی, /category/한국어, etc. In case you are wondering, this option also lets you include and match emojis in URLs.
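The byte-versus-character distinction can be seen with plain PCRE, outside of Symfony. The sketch below assumes that the utf8 option behaves like PHP's "u" pattern modifier; the pattern and the $path value are illustrative and are not taken from the Routing component:

```php
<?php
// Assumes this file is saved as UTF-8. Plain PCRE, no Symfony involved.
$path = '/category/日'; // "日" is one character but three bytes (E6 97 A5)

// Without the "u" modifier, "." consumes exactly one byte, so a
// single-"." requirement cannot cover the three-byte character.
$byteMode = preg_match('#^/category/(.)$#', $path);

// With the "u" modifier, "." consumes one whole UTF-8 character.
$utf8Mode = preg_match('#^/category/(.)$#u', $path, $matches);

var_dump($byteMode);   // int(0)
var_dump($utf8Mode);   // int(1)
var_dump($matches[1]); // string(3) "日"
```

A requirement such as .+ would match the raw bytes in either mode, so the difference shows up mainly with single-character or counted patterns such as . or .{3}.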

In Symfony 3.2 there is no need to set the utf8 option explicitly. As soon as Symfony finds a UTF-8 character in the route path or requirements, it turns on UTF-8 support automatically:

/**
 * 'utf8' is set to 'true' automatically because of the
 * contents of the 'name' requirement:
 *
 * @Route(
 *     "/category/{name}",
 *     requirements={"name": "日本語|فارسی"}
 * )
 */
public function categoryAction($name)
{
    // ...
}

However, this automatic behavior is deprecated and will result in a LogicException in Symfony 4.0. Therefore, remember to define the utf8 option explicitly for any route that may need it.
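When auditing existing routes for this deprecation, one possible approach is to look for non-ASCII bytes in paths and requirements. The containsNonAscii() helper below is hypothetical and is not part of Symfony; it is only a rough sketch of such a check, and it does not detect Unicode escapes such as \p{Lu}:

```php
<?php
/**
 * Hypothetical helper (not Symfony's API): reports whether a route path
 * or requirement contains any byte outside the ASCII range.
 *
 * @param string $pattern
 *
 * @return bool
 */
function containsNonAscii($pattern)
{
    // Every byte of a multi-byte UTF-8 sequence lies in 0x80-0xFF.
    return 1 === preg_match('/[\x80-\xFF]/', $pattern);
}

$asciiOnly = containsNonAscii('/category/{name}');
$hasUtf8 = containsNonAscii('日本語|فارسی');

var_dump($asciiOnly); // bool(false)
var_dump($hasUtf8);   // bool(true)
```

Any route for which such a check returns true is a candidate for an explicit utf8 option.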

In addition to UTF-8 characters, the Routing component also supports all the PCRE Unicode properties, which are escape sequences that match generic character types. For example, \p{Lu} matches any uppercase letter in any language, \p{Greek} matches any character of the Greek script, \P{Han} matches any character that does not belong to the Han script, etc.
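These properties can be tried out directly with preg_match() before using them in a route requirement. The snippet below is a minimal sketch using plain PCRE with the "u" modifier; the sample characters and variable names are chosen only for illustration:

```php
<?php
// Plain PCRE with the "u" modifier; assumes the file is saved as UTF-8.
$upperMatches = preg_match('/^\p{Lu}$/u', 'É');    // uppercase letter
$lowerMatches = preg_match('/^\p{Lu}$/u', 'é');    // lowercase letter
$greekMatches = preg_match('/^\p{Greek}$/u', 'λ'); // Greek script
$hanRejected  = preg_match('/^\P{Han}$/u', '日');  // \P negates: Han fails
$latinAllowed = preg_match('/^\P{Han}$/u', 'a');   // non-Han passes

var_dump($upperMatches); // int(1)
var_dump($lowerMatches); // int(0)
var_dump($greekMatches); // int(1)
var_dump($hanRejected);  // int(0)
var_dump($latinAllowed); // int(1)
```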

Comments

Hi, thank you for this feature.

This may sound silly, but can you confirm that we can also use a UTF-8 string as a default value?

Mickaël
@Mickaël, default values can be anything, so yes you can. Note that default values are not used to match routes, so they do not specifically require the utf8 option.
Nice one! Thanks for the support of UTF-8.
