When crawling an XML document with the DomCrawler component, you might retrieve documents that use more than one namespace:
Note
The DomCrawler component is used by the Symfony HTTP client, but also by some Mink drivers (Mink is the browser abstraction commonly used with Behat).
<?xml version="1.0" encoding="UTF-8"?>
<entry
    xmlns="http://www.w3.org/2005/Atom"
    xmlns:media="http://search.yahoo.com/mrss/"
    xmlns:yt="http://gdata.youtube.com/schemas/2007">
    <id>tag:youtube.com,2008:video:kgZRZmEc9j4</id>
    <yt:accessControl action="comment" permission="allowed"/>
    <yt:accessControl action="videoRespond" permission="moderated"/>
    <media:group>
        <media:title type="plain">Chordates - CrashCourse Biology #24</media:title>
        <yt:aspectRatio>widescreen</yt:aspectRatio>
    </media:group>
</entry>
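For contrast, here is a minimal sketch of what handling these namespaces by hand can look like with PHP's bundled DOM extension (not DomCrawler). XPath 1.0 treats unprefixed names as belonging to no namespace, so the Atom default namespace has to be bound to a prefix before it can be queried. The prefix `atom` and the variable names are arbitrary choices made for this illustration, and the document is a shortened copy of the one above.

```php
<?php
// Sketch with PHP's DOM extension; the prefixes are bound explicitly.
$xml = <<<'XML'
<?xml version="1.0" encoding="UTF-8"?>
<entry
    xmlns="http://www.w3.org/2005/Atom"
    xmlns:media="http://search.yahoo.com/mrss/"
    xmlns:yt="http://gdata.youtube.com/schemas/2007">
    <id>tag:youtube.com,2008:video:kgZRZmEc9j4</id>
    <media:group>
        <media:title type="plain">Chordates - CrashCourse Biology #24</media:title>
        <yt:aspectRatio>widescreen</yt:aspectRatio>
    </media:group>
</entry>
XML;

$document = new DOMDocument();
$document->loadXML($xml);

$xpath = new DOMXPath($document);

// The default namespace has no prefix in the document, so give it one
// ("atom" is an arbitrary name chosen for this example).
$xpath->registerNamespace('atom', 'http://www.w3.org/2005/Atom');
// Bind the remaining prefixes explicitly as well.
$xpath->registerNamespace('media', 'http://search.yahoo.com/mrss/');
$xpath->registerNamespace('yt', 'http://gdata.youtube.com/schemas/2007');

$nodes = $xpath->query('//atom:entry/media:group//yt:aspectRatio');
$aspectRatio = $nodes->item(0)->textContent;

echo $aspectRatio, PHP_EOL; // prints "widescreen"
```

Every prefix used in the expression is tied to a namespace URI before the query runs; this is the bookkeeping that automatic discovery and registration take care of.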
As of Symfony 2.4, you don't need to care about namespaces, as they are auto-discovered and auto-registered:
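// Filter with an XPath expression; "default" is the prefix given to the default namespace.
$crawler = $crawler->filterXPath('//default:entry/media:group//yt:aspectRatio');

// Alternatively, filter with a CSS selector; the HTML extension must be disabled first.
\Symfony\Component\CssSelector\CssSelector::disableHtmlExtension();
$crawler = $crawler->filter('default|entry media|group yt|aspectRatio');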
Notice that the default namespace is registered under the prefix default (configurable) and that you must explicitly disable the HTML extension of the CssSelector component when filtering an XML document with a CSS selector.
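As a rough sketch of the "configurable" part, the snippet below changes the prefix of the default namespace and registers an extra alias by hand. It assumes symfony/dom-crawler (2.4 or later) has been installed with Composer and that `vendor/autoload.php` sits next to the script. To the best of my knowledge `setDefaultNamespacePrefix()` and `registerNamespace()` are the Crawler methods for this, but check the component's documentation for your version. The prefixes `atom` and `m` are arbitrary names picked for the example, and the document is a shortened copy of the one above.

```php
<?php
// Assumption: symfony/dom-crawler is installed and autoloadable from here.
require __DIR__.'/vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;

$xml = <<<'XML'
<?xml version="1.0" encoding="UTF-8"?>
<entry
    xmlns="http://www.w3.org/2005/Atom"
    xmlns:media="http://search.yahoo.com/mrss/"
    xmlns:yt="http://gdata.youtube.com/schemas/2007">
    <media:group>
        <media:title type="plain">Chordates - CrashCourse Biology #24</media:title>
        <yt:aspectRatio>widescreen</yt:aspectRatio>
    </media:group>
</entry>
XML;

$crawler = new Crawler();
$crawler->addXmlContent($xml);

// Use "atom" instead of "default" as the prefix of the default namespace.
$crawler->setDefaultNamespacePrefix('atom');

// Bind a prefix of our own choosing to the Media RSS namespace URI.
$crawler->registerNamespace('m', 'http://search.yahoo.com/mrss/');

// "yt" is not registered here; it is left to automatic discovery.
$aspectRatio = $crawler
    ->filterXPath('//atom:entry/m:group//yt:aspectRatio')
    ->text();

echo $aspectRatio, PHP_EOL;
```

Explicit registration is mainly useful when the prefix you want to write in the expression differs from the one declared in the document, as with `m` here.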
Typo in post title - s/DowCrawler/DomCrawler/g
There is an issue here: DomCrawler is not used by Behat drivers (there are no Behat drivers) but by some Mink drivers.