New in Symfony 5.1: URI Resolver
Warning: This post is about an unsupported Symfony version. Some of this information may be out of date. Read the most recent Symfony Docs.
Contributed by Grégoire Pineau in #35415 and #35667.
The DomCrawler component eases DOM navigation for HTML and XML documents. Most developers use it in the functional tests of their Symfony applications, but you can also use it to build a real crawler.
A common need when building a crawler is to turn the links of the HTML contents,
which are usually relative, into absolute URLs, to keep crawling the entire site.
For example, if the site URL is https://example.com/foo and the link URL is
../bar?foo=1, the absolute URL is https://example.com/bar?foo=1.
This transformation is much more complex than it looks because you have to deal
with anchors, query string parameters and all kinds of sub paths. The DomCrawler
component already contained the logic to resolve these URLs, but in Symfony 5.1
we've extracted it into a new UriResolver class so you can reuse the logic
in your applications:
use Symfony\Component\DomCrawler\UriResolver;
$absoluteUrl = UriResolver::resolve('../bar?foo=1', 'https://example.com/foo');
// $absoluteUrl = 'https://example.com/bar?foo=1'
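To show how the resolver treats the anchors, query strings and sub paths mentioned above, here is a minimal sketch that resolves several kinds of links against one base URL. The base URL and the list of links are made up for illustration, and the script assumes the component was installed with Composer in the same directory.

```php
<?php
// Assumption: symfony/dom-crawler (5.1 or later) is installed with Composer
// and this script sits next to the vendor/ directory.
require __DIR__.'/vendor/autoload.php';

use Symfony\Component\DomCrawler\UriResolver;

// Hypothetical page URL, chosen only for illustration.
$baseUri = 'https://example.com/docs/guide?page=2#intro';

// Typical href values a crawler finds on a page.
$links = [
    '#top',                 // anchor only
    '?page=3',              // query string only
    '/about',               // absolute path
    '../bar?foo=1',         // relative path
    'https://symfony.com/', // already absolute
];

// Map each raw link to the URL the resolver returns for it.
$resolved = [];
foreach ($links as $link) {
    $resolved[$link] = UriResolver::resolve($link, $baseUri);
}

print_r($resolved);
```

With this base URL, the anchor should replace only the fragment (https://example.com/docs/guide?page=2#top), the query string should replace the query and drop the fragment (https://example.com/docs/guide?page=3), the absolute path should be appended to the host (https://example.com/about), the relative path should resolve to https://example.com/bar?foo=1, and the absolute URL should come back unchanged.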