Symfony
sponsored by SensioLabs
Menu
  • About
  • Documentation
  • Screencasts
  • Cloud
  • Certification
  • Community
  • Businesses
  • News
  • Download
  1. Home
  2. Documentation
  3. Testing
  4. The DOM Crawler
  • Documentation
  • Book
  • Reference
  • Bundles
  • Cloud
Search by Algolia

Table of Contents

  • Traversing
  • Extracting Information

The DOM Crawler

Edit this page

The DOM Crawler

A Crawler instance is returned each time you make a request with the Client. It allows you to traverse HTML or XML documents: select nodes, find links and forms, and retrieve attributes or contents.

Traversing

Like jQuery, the Crawler has methods to traverse the DOM of an HTML/XML document. For example, the following finds all input[type=submit] elements, selects the last one on the page, and then selects its immediate parent element:

1
2
3
4
5
$newCrawler = $crawler->filter('input[type=submit]')
    ->last()
    ->parents()
    ->first()
;

Many other methods are also available:

filter('h1.title')
Nodes that match the CSS selector.
filterXpath('h1')
Nodes that match the XPath expression.
eq(1)
Node for the specified index.
first()
First node.
last()
Last node.
siblings()
Siblings.
nextAll()
All following siblings.
previousAll()
All preceding siblings.
parents()
Returns the parent nodes.
children()
Returns children nodes.
reduce($lambda)
Nodes for which the callable does not return false.

Since each of these methods returns a new Crawler instance, you can narrow down your node selection by chaining the method calls:

1
2
3
4
5
6
7
8
9
$crawler
    ->filter('h1')
    ->reduce(function ($node, $i) {
        if (!$node->attr('class')) {
            return false;
        }
    })
    ->first()
;

Tip

Use the count() function to get the number of nodes stored in a Crawler: count($crawler)

Extracting Information

The Crawler can extract information from the nodes:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
// returns the attribute value for the first node
$crawler->attr('class');

// returns the node value for the first node
$crawler->text();

// returns the default text if the node does not exist
$crawler->text('Default text content');

// pass TRUE as the second argument of text() to remove all extra white spaces, including
// the internal ones (e.g. "  foo\n  bar    baz \n " is returned as "foo bar baz")
$crawler->text(null, true);

// extracts an array of attributes for all nodes
// (_text returns the node value)
// returns an array for each element in crawler,
// each with the value and href
$info = $crawler->extract(['_text', 'href']);

// executes a lambda for each node and return an array of results
$data = $crawler->each(function ($node, $i) {
    return $node->attr('href');
});
This work, including the code samples, is licensed under a Creative Commons BY-SA 3.0 license.
We stand with Ukraine.
Version:

Symfony 5.4 is backed by

Online exam, become Symfony certified today

Online exam, become Symfony certified today

Check Code Performance in Dev, Test, Staging & Production

Check Code Performance in Dev, Test, Staging & Production

↓ Our footer now uses the colors of the Ukrainian flag because Symfony stands with the people of Ukraine.

Avatar of Julien Moulin, a Symfony contributor

Thanks Julien Moulin (@lizjulien) for being a Symfony contributor

1 commit • 7 lines changed

View all contributors that help us make Symfony

Become a Symfony contributor

Be an active part of the community and contribute ideas, code and bug fixes. Both experts and newcomers are welcome.

Learn how to contribute

Symfony™ is a trademark of Symfony SAS. All rights reserved.

  • What is Symfony?
    • Symfony at a Glance
    • Symfony Components
    • Case Studies
    • Symfony Releases
    • Security Policy
    • Logo & Screenshots
    • Trademark & Licenses
    • symfony1 Legacy
  • Learn Symfony
    • Symfony Docs
    • Symfony Book
    • Reference
    • Bundles
    • Best Practices
    • Training
    • eLearning Platform
    • Certification
  • Screencasts
    • Learn Symfony
    • Learn PHP
    • Learn JavaScript
    • Learn Drupal
    • Learn RESTful APIs
  • Community
    • SymfonyConnect
    • Support
    • How to be Involved
    • Code of Conduct
    • Events & Meetups
    • Projects using Symfony
    • Downloads Stats
    • Contributors
    • Backers
  • Blog
    • Events & Meetups
    • A week of symfony
    • Case studies
    • Cloud
    • Community
    • Conferences
    • Diversity
    • Documentation
    • Living on the edge
    • Releases
    • Security Advisories
    • SymfonyInsight
    • Twig
    • SensioLabs
  • Services
    • SensioLabs services
    • Train developers
    • Manage your project quality
    • Improve your project performance
    • Host Symfony projects
    Deployed on
Follow Symfony
Search by Algolia