SimpleXML and CDATA

Suppose I am trying to read an XML feed with the following structure:


<item>
<title>News Title</title>
<link>http://www.newslink.com/newstory1</link>
<description>
<![CDATA[
<p>
<a href="http://www.newslink.com/newstory1">
<img src="http://www.imageurl.com" />
</a>
News story description
</p>
]]>
</description>
<media:content url="http://www.imageurl.com" />
</item>

The following is the PHP code used to read the XML file.


$xml = simplexml_load_file($xml_url);
$namespaces = $xml->getNamespaces(true); // get namespaces

foreach ($xml->channel->item as $item) {

  echo "Link: ". $item->link ."<br>";
  echo "Title: ". $item->title ."<br>";
  echo "Description: ". $item->description ."<br>";

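  // Read the thumbnail URL from <media:content>, but only if the feed declares the media namespace
  $thumbnail = "";
  if (isset($namespaces['media'])) {
    $media = $item->children($namespaces['media']);
    if (isset($media->content)) {
      $thumbnail = (string) $media->content->attributes()->url;
    }
  }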

  if ($thumbnail != "") {
    echo "Media: ". $thumbnail  ."<br>";
  }

}

In the description tag, how do I go about just retrieving the text section (“News story description”) from within CDATA? I am not after the image tag or the link, which I can get from the media:content tag.

I had a quick look, and it appears that XPath is one way to do it (http://stackoverflow.com/questions/568315/how-do-i-retrieve-element-text-inside-cdata-markup-via-xpath/568394#568394), but I am unclear on how to use XPath in PHP.

Thought about this some more.

The text section of the CDATA in the description is surrounded by HTML.

Rather than using a regular expression or XPath to extract just the text, I can pass the description through strip_tags(), which removes the HTML tags and leaves only the text.

Easy!
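
For anyone who wants to see it spelled out, here is a minimal sketch of the strip_tags() approach. It assumes an RSS feed shaped like the item in the question. The inline sample feed and the description_text() helper are hypothetical and only there for illustration; with a real feed, simplexml_load_file($xml_url) would take the place of simplexml_load_string().

```php
<?php
// Hypothetical sample feed, modelled on the item in the question.
$feed = <<<XML
<rss version="2.0">
  <channel>
    <item>
      <title>News Title</title>
      <link>http://www.newslink.com/newstory1</link>
      <description><![CDATA[
        <p>
          <a href="http://www.newslink.com/newstory1">
            <img src="http://www.imageurl.com" />
          </a>
          News story description
        </p>
      ]]></description>
    </item>
  </channel>
</rss>
XML;

// Hypothetical helper: return only the text of a description element.
function description_text(SimpleXMLElement $description): string
{
    // Casting the element to a string yields the CDATA content as plain text.
    $html = (string) $description;

    // strip_tags() removes the HTML tags; trim() drops the whitespace they leave behind.
    return trim(strip_tags($html));
}

$xml = simplexml_load_string($feed);

foreach ($xml->channel->item as $item) {
    echo "Description: " . description_text($item->description) . "<br>";
}
```

Note that strip_tags() only removes tags. If the description contains HTML entities such as &amp;amp;, they stay encoded unless the result is also passed through html_entity_decode().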