Contributed by Berat Doğan in #10699

The DomCrawler component eases DOM navigation for HTML and XML documents. Although it's most commonly used to write functional tests in Symfony2 applications, it can also be used to scrape content, as demonstrated by the Goutte project.

DomCrawler provides several methods to perform node filtering: filter(), reduce() and each(). As of Symfony 2.6, you can use another handy method called slice().

Similar to PHP's array_slice() function, the new slice($offset, $length) method returns the sequence of elements specified by the offset and length parameters. For example, consider the code needed to extract the text content of some <li> elements from a #nav-menu element:

use Symfony\Component\DomCrawler\Crawler;

$crawler = new Crawler(' ... some HTML content ... ');

// Keep only the items at positions 2 through 7; every other callback returns null.
$texts = $crawler->filter('#nav-menu li')->each(function ($node, $i) {
    if ($i >= 2 && $i <= 7) {
        return $node->text();
    }
});

In Symfony 2.6, the previous node filtering code becomes simpler and cleaner:

// Offset 2, length 6: the items at positions 2 through 7.
$texts = $crawler->filter('#nav-menu li')->slice(2, 6)->each(function ($node, $i) {
    return $node->text();
});
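
Because slice() borrows its offset and length convention from array_slice(), the relationship between an index range and a slice can be tried out on a plain PHP array, without any crawler at all. The following is only a minimal sketch: the $items array and its labels are hypothetical stand-ins for the text of ten <li> nodes, and it assumes slice() counts elements the same way array_slice() does.

```php
<?php

// Hypothetical data: ten labels standing in for the text of ten <li> nodes.
$items = [];
for ($i = 0; $i < 10; $i++) {
    $items[] = 'item' . $i;
}

// Index-based filtering, keeping positions 2 through 7 inclusive.
$byIndex = [];
foreach ($items as $i => $item) {
    if ($i >= 2 && $i <= 7) {
        $byIndex[] = $item;
    }
}

// The same selection as an offset and a length: 7 - 2 + 1 = 6 elements.
$bySlice = array_slice($items, 2, 6);

// Strict comparison: same values in the same order with the same keys.
var_dump($byIndex === $bySlice);
```

The second argument is a count of elements, not an end position, so an inclusive range from $first to $last translates to an offset of $first and a length of $last - $first + 1.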

This new Symfony 2.6 feature is just another example of how minor tweaks can make your job easier. We are strongly committed to improving each and every Symfony feature. That's why we've introduced the DX initiative. Help us continue improving Symfony by sending us your comments and ideas. Your opinion matters to us!

Published in #Living on the edge