Extracting data from an XML using PHP

I need help extracting data from an XML document using PHP.

I have the following simple page, which is passed a URI of this form: http://www.google.com/calendar/feeds/rosminicollege%40gmail.com/public/full/agp7tsrj91d4gclr777v7bc46s

<?php
if (isset($_GET['uri'])) {
	$uri=$_GET['uri'];
	
	$xml=simplexml_load_file($uri);
	
	print_r($xml);	
	
	$title=$xml->title;
	$content=$xml->content;
	
} else {
	// No URI supplied: send the visitor back to the listing page and stop.
	header('Location: http://www.rosmini.school.nz/facebook');
	exit;
}
?>
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Event Details</title>
</head>

<body>
</body>
</html>

I have no problem getting details like title and content, but what I can’t get is the start and end times, which are buried in the fetched XML in <gd:when endTime='2011-10-26' startTime='2011-10-25'/>

I want to be able to do something like

$start=.........
$end=.......

But when I look at the output of simplexml_load_file($uri), I can’t see these values anywhere.

What I prefer to do is use a little hack: convert the XML payload to an array with the json_encode/json_decode trick.

print_r(json_decode(json_encode(simplexml_load_string($xmlstring)), true));
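
One caveat worth checking before relying on this trick: as far as I can tell, the JSON round trip only carries over elements in the document's default (unprefixed) namespace, so prefixed elements such as gd:when are silently dropped. A minimal sketch, using a trimmed, hand-written stand-in for the feed entry (the $xmlstring contents and the $asArray name are assumptions for illustration):

```php
<?php
// Trimmed stand-in for a calendar entry; only the parts needed for the demo.
$xmlstring = <<<'XML'
<?xml version='1.0' encoding='UTF-8'?>
<entry xmlns='http://www.w3.org/2005/Atom'
       xmlns:gd='http://schemas.google.com/g/2005'>
  <title type='text'>Term four commences</title>
  <gd:when endTime='2011-10-26' startTime='2011-10-25'/>
</entry>
XML;

$asArray = json_decode(json_encode(simplexml_load_string($xmlstring)), true);

// Unprefixed children such as <title> are present in the resulting array.
var_dump(array_key_exists('title', $asArray));

// The prefixed <gd:when> element is not carried over under a 'when' key.
var_dump(array_key_exists('when', $asArray));
```

So the trick is handy for a quick look at the plain elements, but it is probably not the way to reach the gd: data.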

Thanks for that. I now have:

<?php
if (isset($_GET['uri'])) {
	$uri=$_GET['uri'];
	
	$xml=simplexml_load_file($uri);
	
	print_r(json_decode(json_encode(simplexml_load_string($uri)), true));
	
} ?>
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Event Details</title>
</head>

<body>
</body>
</html>

I just wanted to see what it generates, but I get:

Warning: simplexml_load_string() [function.simplexml-load-string]: Entity: line 1: parser error : Start tag expected, '<' not found in /home/rosminis/public_html/facebook/test.php on line 7

Warning: simplexml_load_string() [function.simplexml-load-string]: http://www.google.com/calendar/feeds/rosminicollege@gmail.com/public/full/agp7ts in /home/rosminis/public_html/facebook/test.php on line 7

Warning: simplexml_load_string() [function.simplexml-load-string]: ^ in /home/rosminis/public_html/facebook/test.php on line 7
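
These warnings are what you would expect when a URL is handed to simplexml_load_string(): that function parses its argument as XML text, whereas simplexml_load_file() takes a path or URL and reads the document from there. A small sketch of the difference, using an inline string in place of the real feed (the sample XML and the variable names are made up for illustration):

```php
<?php
// simplexml_load_string() parses the text it is given; it does not fetch URLs.
$xmlText = '<entry><title>Term four commences</title></entry>';
$entry = simplexml_load_string($xmlText);
echo $entry->title, "\n";

// Collect parser errors internally instead of printing warnings.
libxml_use_internal_errors(true);

// Here the parser sees the literal text "http://..." as the document.
// That is not XML, so the call fails and returns false.
$notXml = simplexml_load_string(
    'http://www.google.com/calendar/feeds/rosminicollege%40gmail.com/public/full/agp7tsrj91d4gclr777v7bc46s'
);
var_dump($notXml === false);

// To parse a remote document, either let simplexml_load_file() read it,
// or fetch the body yourself and pass the text to simplexml_load_string():
//   $entry = simplexml_load_file($uri);
//   $entry = simplexml_load_string(file_get_contents($uri));
```

In the page above, that would mean passing $xml (already loaded by simplexml_load_file) to json_encode() rather than calling simplexml_load_string($uri).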

I’m unfamiliar with Google Calendar feeds, but the link doesn’t give an XML feed. It gives:

<html><head><meta http-equiv="content-type" content="text/html;charset=UTF-8">
<title>Error</title>
<style type="text/css">body {font-family: arial,sans-serif}</style></head>
<body text="#000000" bgcolor="#ffffff"><table border="0" cellpadding="2" cellspacing="0" width="100%"><tr><td rowspan="3" width="1%" nowrap><b><font face="times" size="10"><font color="#0039b6">G</font> <font color="#c41200">o</font> <font color="#f3c518">o</font> <font color="#0039b6">g</font> <font color="#30a72f">l</font> <font color="#c41200">e</font></font>&nbsp;&nbsp;</b></td>
<td>&nbsp;</td></tr>
<tr><td bgcolor="#3366cc"><font face="arial,sans-serif" color="#ffffff"><b>Error</b></font></td></tr>
<tr><td>&nbsp;</td></tr></table>
<blockquote>Cannot access the calendar you requested</blockquote>
<p></p>
<div style="background:#3366cc; width:1px; height:4px"></div></body></html>

Do you need to be logged in? Maybe the URL is wrong?

Hi, the google feed is

http://www.google.com/calendar/feeds/rosminicollege%40gmail.com/public/full/agp7tsrj91d4gclr777v7bc46s

Clicking on one of the event links on the left of http://www.rosmini.school.nz/facebook/ will open the following code

<?php
if (isset($_GET['uri'])) {
	$uri=$_GET['uri'];
	
	$xml=simplexml_load_file($uri);
	
	print_r(json_decode(json_encode(simplexml_load_string($uri)), true));
	
	print_r($xml);
	
} ?>
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Event Details</title>
</head>
<body>
</body>
</html>

That will show you what is output by:

$xml=simplexml_load_file($uri); and print_r(json_decode(json_encode(simplexml_load_string($uri)), true));

Do us a favor and give us the result of printing $uri, please.

print $uri;
exit;

Making progress, that link gives

<?xml version='1.0' encoding='UTF-8'?>
<entry xmlns='http://www.w3.org/2005/Atom'
		xmlns:gCal='http://schemas.google.com/gCal/2005'
		xmlns:gd='http://schemas.google.com/g/2005'>
<id>http://www.google.com/calendar/feeds/rosminicollege%40gmail.com/public/full/agp7tsrj91d4gclr777v7bc46s</id>
<published>2011-10-18T03:42:18.000Z</published>
<updated>2011-10-18T03:43:08.000Z</updated>
<category scheme='http://schemas.google.com/g/2005#kind'
			term='http://schemas.google.com/g/2005#event'/>
<title type='text'>Term four commences</title>
<content type='text'/>
<link rel='alternate' type='text/html' href='http://www.google.com/calendar/event?eid=YWdwN3Rzcmo5MWQ0Z2Nscjc3N3Y3YmM0NnMgcm9zbWluaWNvbGxlZ2VAbQ' title='alternate'/>
<link rel='self' type='application/atom+xml' href='http://www.google.com/calendar/feeds/rosminicollege%40gmail.com/public/full/agp7tsrj91d4gclr777v7bc46s'/>
<author>
	<name>Rosmini College</name>
	<email>rosminicollege@gmail.com</email>
</author>
<gd:comments>
	<gd:feedLink href='http://www.google.com/calendar/feeds/rosminicollege%40gmail.com/public/full/agp7tsrj91d4gclr777v7bc46s/comments'/>
</gd:comments>
<gd:eventStatus value='http://schemas.google.com/g/2005#event.confirmed'/>
<gd:where valueString=''/>
<gd:who email='rosminicollege@gmail.com' rel='http://schemas.google.com/g/2005#event.organizer' valueString='Rosmini College'/>
<gd:when endTime='2011-10-26' startTime='2011-10-25'/>
<gd:transparency value='http://schemas.google.com/g/2005#event.transparent'/>
<gCal:anyoneCanAddSelf value='false'/>
<gCal:guestsCanInviteOthers value='true'/>
<gCal:guestsCanModify value='false'/>
<gCal:guestsCanSeeGuests value='true'/>
<gCal:sequence value='1'/>
<gCal:uid value='agp7tsrj91d4gclr777v7bc46s@google.com'/>
</entry>

But what you want involves namespaces, which is a little trickier.
I’ll try when I get a chance, but until then maybe this will help enough for you to figure it out: http://www.sitepoint.com/simplexml-and-namespaces/

If you haven’t looked at the SitePoint article previously linked to yet please do.

$sxe = new SimpleXMLElement($xmlstr);
$namespaces = $sxe->getDocNamespaces();
/*
array
  '' => string 'http://www.w3.org/2005/Atom' (length=27)
  'gCal' => string 'http://schemas.google.com/gCal/2005' (length=35)
  'gd' => string 'http://schemas.google.com/g/2005' (length=32)
*/
foreach ($namespaces as $localpart => $ns_uri) {
	if ($localpart != "") {
		$ns_obj = $sxe->children($ns_uri);
		foreach ($ns_obj as $ns_tag) {
			print_r($localpart . ":" . $ns_tag->getName() . "<br/>");
			$ns_obj_attrs = $ns_tag->attributes();
			foreach ($ns_obj_attrs as $attr_name => $attr_val) {
				print_r("  " . $attr_name . "  " . $attr_val . "<br/>");
			}
			echo "<br/>";
		}
	}
}

I’m certain this example code can be improved. A lot depends on what you need to get at.
If you know the prefixes (the "localpart"s) and tag names beforehand, it should be a bit simpler, as you could do away with the getDocNamespaces() stuff.
If you think the prefixes might change, you could use registerXPathNamespace().
If you need deeper nested tags, you can add more loops or write functions to handle the iteration.
You can add conditional if tests to get only what you need.
And of course you’ll want to do something other than print_r().
Anyway, hopefully it’s helpful.
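
To make the "known beforehand" case concrete: if the gd namespace and the when tag are fixed, the start and end times can be read directly, without looping over every namespace. A rough sketch, assuming the entry looks like the one posted earlier (trimmed here to keep it short; the variable names are my own), showing both the children() route and the registerXPathNamespace() route:

```php
<?php
$xmlstr = <<<'XML'
<?xml version='1.0' encoding='UTF-8'?>
<entry xmlns='http://www.w3.org/2005/Atom'
       xmlns:gd='http://schemas.google.com/g/2005'>
  <title type='text'>Term four commences</title>
  <gd:when endTime='2011-10-26' startTime='2011-10-25'/>
</entry>
XML;

$sxe = new SimpleXMLElement($xmlstr);

// Route 1: select children by namespace URI, then read the attributes.
// The attributes themselves carry no prefix, so plain attributes() finds them.
$gd   = $sxe->children('http://schemas.google.com/g/2005');
$when = $gd->when->attributes();
$start = (string) $when['startTime'];
$end   = (string) $when['endTime'];
echo "children(): $start to $end\n";

// Route 2: XPath with a locally registered prefix. The prefix 'g' is chosen
// here and bound to the namespace URI, so it does not matter which prefix
// the feed itself uses.
$sxe->registerXPathNamespace('g', 'http://schemas.google.com/g/2005');
$nodes  = $sxe->xpath('//g:when');
$xStart = null;
$xEnd   = null;
if (!empty($nodes)) {
    $xStart = (string) $nodes[0]['startTime'];
    $xEnd   = (string) $nodes[0]['endTime'];
    echo "xpath(): $xStart to $xEnd\n";
}
```

With the real feed you would build $sxe from simplexml_load_file($uri) instead of the inline string. Casting to string matters, because the attribute values are otherwise SimpleXMLElement objects rather than plain text.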