So far, we have looked at XML, Web Services and AJAX, and have connected to our web service from a JavaScript-based AJAX application. In this topic we are looking at how we can connect to a remote web service (provided by someone else) from our own website. A real-world example of this might be an airline booking site such as Expedia contacting web services provided by different airlines, parsing the XML returned and integrating the data in its own website.
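There are two questions we need to ask here: how do we connect to the remote web service from PHP, and how do we parse the XML that it sends back? For the first, we can use cURL: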
$connection = curl_init();
curl_setopt($connection, CURLOPT_URL, "http://remoteserver/script.php");
curl_setopt($connection, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($connection, CURLOPT_HEADER, 0);
$response = curl_exec($connection);
This code makes a connection to a given remote server (here, http://remoteserver/script.php) and the response sent back is stored in $response. If the remote script sends back XML, $response will contain XML. If the remote script sends back an HTML page, $response will contain HTML. This is standard code that can be copied and pasted every time you want to make a remote connection; just change the URL.
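The cURL code above does not check whether the request actually succeeded. As a minimal sketch, the same calls could be wrapped in a function that checks the result; note that the function name fetchRemote and the five-second timeout are my own assumptions for illustration, not part of the original code.

```php
<?php
// Hypothetical helper (the name fetchRemote is an assumption): fetch a URL
// with cURL and throw an exception if the request fails.
function fetchRemote($url)
{
    $connection = curl_init();
    curl_setopt($connection, CURLOPT_URL, $url);
    curl_setopt($connection, CURLOPT_RETURNTRANSFER, 1); // return the body as a string
    curl_setopt($connection, CURLOPT_HEADER, 0);         // leave HTTP headers out of the result
    curl_setopt($connection, CURLOPT_CONNECTTIMEOUT, 5); // assumed limit, in seconds

    $response = curl_exec($connection);
    if ($response === false) {
        // curl_exec() returns false on failure; curl_error() describes why.
        throw new RuntimeException("Request failed: " . curl_error($connection));
    }
    return $response;
}
```

It could then be called with something like $response = fetchRemote("http://remoteserver/script.php"); inside a try/catch block, so that a failed connection can be reported rather than silently producing an empty page.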
Once we have obtained the XML returned from the server and stored it in $response, the next thing we need to do is parse (interpret) it. In PHP, one can use the DOM, just as for JavaScript. One disadvantage of the DOM is that it loads the whole XML document into memory, so if a lot of data is sent back from the server, this may not be the best approach. An alternative is SAX - the Simple API for XML. With SAX, the XML document is read a small piece at a time, and code is written to process each piece (an opening tag, a closing tag or some text) as it comes in. This means that SAX is more complex to program than DOM, but significantly more memory-efficient.
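To give a flavour of the DOM approach in PHP, here is a minimal sketch using PHP's DOMDocument class. The XML in $response is made-up sample data standing in for whatever the remote server would send back.

```php
<?php
// Illustrative XML; in practice this would be the $response obtained via cURL.
$response = "<students>" .
    "<student><name>Mark Gill</name><username>9gillm69</username></student>" .
    "<student><name>Steve Mills</name><username>1mills63</username></student>" .
    "</students>";

// Load the whole document into memory as a tree of nodes.
$doc = new DOMDocument();
$doc->loadXML($response);

// Collect the text of every <name> element in the document.
$names = array();
foreach ($doc->getElementsByTagName("name") as $node) {
    $names[] = $node->textContent;
}
echo implode(", ", $names) . "\n";
```

Because the whole tree is held in memory, we can pick out any elements we like in any order, which is what makes the DOM convenient but memory-hungry.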
Here, we will be considering SimpleXML and SAX. SimpleXML is a simple XML parsing library which is part of PHP 5. It is similar in approach to DOM - i.e. the whole of the document is loaded into memory - but offers a simpler programming interface. However, it is PHP-specific, while the DOM is a general programming interface which can be used within a range of programming languages, and SAX-like approaches are also used by a wide range of languages. When doing the exercises, I would recommend that you use SAX if you believe your programming ability is intermediate or strong, but SimpleXML if you find programming difficult. This is because SAX is more memory-efficient but trickier to program, whereas SimpleXML is quite easy to program but is less memory-efficient and PHP-specific.
Imagine we have this XML below, stored in a file called students.xml:
<students>
    <student>
        <name> Mark Gill </name>
        <username>9gillm69</username>
        <phone>07111 111111</phone>
    </student>
    <student>
        <name> Steve Mills </name>
        <username>1mills63</username>
        <phone>07222 222222</phone>
    </student>
    <student>
        <name> Rob Price </name>
        <username>5pricr67</username>
        <phone>07333 333333</phone>
    </student>
</students>
We could parse the XML as follows:
<?php
$xml = simplexml_load_file("students.xml");
for ($index = 0; $index < count($xml->student); $index++)
{
    echo $xml->student[$index]->name . "<br />";
    echo $xml->student[$index]->username . "<br />";
    echo $xml->student[$index]->phone . "<br />";
}
?>
Note how we read in the XML file with simplexml_load_file. We reference individual tags with the -> symbol, e.g. $xml->student is a collection of all the student tags within the XML. We then use a "for" loop to loop through each student in the XML as follows:
$xml->student[$index]->name references the text within the name tag within the student tag at position $index in the array.
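When the XML comes from a remote server rather than a file, it will be in a string (such as $response from the cURL code), so simplexml_load_string can be used instead of simplexml_load_file. The sketch below assumes some made-up sample XML in $response, and uses a "foreach" loop as an alternative to the indexed "for" loop.

```php
<?php
// Stand-in for the XML a remote service might return (illustrative data).
$response = "<students>" .
    "<student><name> Mark Gill </name><username>9gillm69</username></student>" .
    "<student><name> Rob Price </name><username>5pricr67</username></student>" .
    "</students>";

// simplexml_load_string() returns false if the XML cannot be parsed.
$xml = simplexml_load_string($response);
if ($xml === false) {
    die("Could not parse the XML");
}

$lines = array();
foreach ($xml->student as $student) {
    // Cast to string to get the element's text; trim() removes stray spaces.
    $lines[] = trim((string) $student->name) . " (" . $student->username . ")";
}
echo implode("<br />", $lines);
```

The "foreach" version avoids having to manage $index ourselves; which style to use is largely a matter of taste.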
Please note, this is a more advanced topic, intended only for those of you who are comfortable with programming. Please look at SimpleXML, instead, if you are less so.
SAX (Simple API for XML - this is different to SimpleXML) is a Java library for parsing XML; however, similar approaches are adopted for other languages. The key feature of SAX-like approaches is that they are event-driven parsers.
What do we mean by that? The XML is read sequentially, a small piece at a time, rather than being loaded into memory as a whole. As it is read, events occur, such as encountering an opening tag, encountering a closing tag, and encountering the text between the opening and closing tags. The idea is to write a piece of code to react to each of these events occurring. So we could write one piece of code to react to encountering an opening tag, another to react to encountering a closing tag, and yet another to react to encountering the text in between an opening and closing tag.
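To make the idea of events more concrete, the following sketch simply records each event in the order in which the parser reports it. The handler names (traceOpen, traceClose, traceText) and the one-student XML are my own, chosen purely for illustration.

```php
<?php
$events = array();

// Each handler just notes which kind of event occurred.
function traceOpen($parser, $tag, $attributes)
{
    global $events;
    $events[] = "open " . $tag;
}

function traceClose($parser, $tag)
{
    global $events;
    $events[] = "close " . $tag;
}

function traceText($parser, $characters)
{
    global $events;
    $events[] = "text " . $characters;
}

$parser = xml_parser_create();
xml_set_element_handler($parser, "traceOpen", "traceClose");
xml_set_character_data_handler($parser, "traceText");
xml_parse($parser, "<student><name>Rob</name></student>", true);

echo implode("\n", $events) . "\n";
```

The trace should show the opening of the student tag, then the opening of the name tag, then the text, and finally the two closing tags in reverse order, with the tag names reported in upper case.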
One of the things that the code to react to opening and closing tags must do is determine which tag we are within, as we are likely to want to do different things with the data depending on what tag we are within.
This is illustrated by the diagram below.

<?php
// Set up our variables.
$data = array();
$currentTag = null;

// Initialise the XML to parse. In a real application, this would be read
// from the web or a file.
$xml = "<students>" .
    "<student><name>Rob Stevenson</name>" .
    "<course>Computer Network Management</course></student>" .
    "<student><name>Jamie Bailey</name>" .
    "<course>Computer Studies</course></student>" .
    "</students>";

// Parse the XML.
$parser = xml_parser_create();
xml_set_element_handler($parser, "foundAnOpeningTag", "foundAClosingTag");
xml_set_character_data_handler($parser, "foundSomeText");
xml_parse($parser, $xml);
xml_parser_free($parser);

// Print out the data read in.
for ($count = 0; $count < count($data["name"]); $count++)
{
    echo "Name " . $data["name"][$count] . " Course " . $data["course"][$count] . "<br/>";
}

// Function to handle opening tags.
function foundAnOpeningTag($parser, $tag, $attributes)
{
    global $currentTag;
    $currentTag = strtolower($tag);
}

// Function to handle closing tags.
function foundAClosingTag($parser, $tag)
{
    global $currentTag;
    $currentTag = null;
}

// Function to handle characters within tags.
function foundSomeText($parser, $characters)
{
    global $data, $currentTag;
    $data[$currentTag][] = $characters;
}
?>
This is one possible way of using SAX to extract data from XML. In this example, we extract all the data; however, we could alter the code to extract only selected data.
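As a sketch of how only selected data might be extracted, the version below keeps a list of the tags we are interested in and ignores the text inside any other tag. The $wanted array is my own addition; the rest follows the example code.

```php
<?php
$data = array();
$currentTag = null;
$wanted = array("name");   // assumption: only <name> elements are of interest

$xml = "<students>" .
    "<student><name>Rob Stevenson</name>" .
    "<course>Computer Network Management</course></student>" .
    "<student><name>Jamie Bailey</name>" .
    "<course>Computer Studies</course></student>" .
    "</students>";

function foundAnOpeningTag($parser, $tag, $attributes)
{
    global $currentTag;
    $currentTag = strtolower($tag);
}

function foundAClosingTag($parser, $tag)
{
    global $currentTag;
    $currentTag = null;
}

function foundSomeText($parser, $characters)
{
    global $data, $currentTag, $wanted;
    // Store the text only if the current tag is one we asked for.
    if (in_array($currentTag, $wanted, true)) {
        $data[$currentTag][] = $characters;
    }
}

$parser = xml_parser_create();
xml_set_element_handler($parser, "foundAnOpeningTag", "foundAClosingTag");
xml_set_character_data_handler($parser, "foundSomeText");
xml_parse($parser, $xml, true);

print_r($data);
```

Here $data ends up with a "name" sub-array only; the course text is read by the parser but never stored.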
First we set up an array ($data) to store all the data read in from the XML. We also set up a variable $currentTag to represent the current tag we are inside, and set it initially to null to indicate that we are currently not inside any tag.
<?php
$data = array();
$currentTag = null;
Then we initialise our XML and store it in the variable $xml. This XML would normally be read in from the web, using cURL (see above), or from a file. However here, for illustration purposes, I have just stored the XML in a variable, $xml.
$xml ="<students>".
"<student><name>Rob Stevenson</name>" .
"<course>Computer Network Management</course></student>".
"<student><name>Jamie Bailey</name>".
"<course>Computer Studies</course></student>".
"</students>";
Next we initialise the parser.
$parser = xml_parser_create();
We then tell the parser the names of the functions which handle opening and closing tags, and the text between tags, so that when the parser encounters a tag or some text, it knows where in the code to jump to:
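xml_set_element_handler($parser, "foundAnOpeningTag", "foundAClosingTag");
xml_set_character_data_handler($parser, "foundSomeText");
The code above is saying that foundAnOpeningTag handles opening tags, foundAClosingTag handles closing tags, and foundSomeText handles the text between tags.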
We then start the parsing with xml_parse. This process will call the three functions foundAnOpeningTag, foundAClosingTag and foundSomeText whenever it encounters an opening tag, a closing tag and some text between tags respectively. These functions are explained in more detail below.
xml_parse($parser, $xml);
xml_parser_free($parser);
When we get to this point, our XML will have been parsed and our variable $data will contain the data. So we can loop through the data array and write the data out; I will explain how this works below, after the three event-handling functions have been considered.
Now I'll explain the actual event handling code. Firstly here is foundAnOpeningTag, the function which runs when an opening tag is encountered.
function foundAnOpeningTag($parser, $tag, $attributes)
{
    global $currentTag;
    $currentTag = strtolower($tag);
}
How does this work? The variable $tag will contain the name of the tag just encountered.
So we store it inside the global variable $currentTag, converting it to lower case
first (by default, the PHP SAX-style parser converts tag names to upper case, even if
they are lower case in the XML; this behaviour is known as case folding). We will use
the variable $currentTag in the character handling function, below.
Note also the global declaration of $currentTag. This is because the variable is declared outside of the function, in the "global" area of the script. In order for the function to use it, we have to declare it as global.
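As an alternative to calling strtolower, the upper-case conversion can be switched off using the XML_OPTION_CASE_FOLDING parser option, so that tag names arrive exactly as they are written in the XML. The sketch below is a small stand-alone illustration; the handler names recordTag and ignoreClose are my own.

```php
<?php
$seen = array();

// Record each tag name exactly as the parser reports it.
function recordTag($parser, $tag, $attributes)
{
    global $seen;
    $seen[] = $tag;
}

// Closing tags are not needed for this illustration.
function ignoreClose($parser, $tag)
{
}

$parser = xml_parser_create();
// Turn off case folding so tag names are not converted to upper case.
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
xml_set_element_handler($parser, "recordTag", "ignoreClose");
xml_parse($parser, "<student><name>Rob</name></student>", true);

echo implode(", ", $seen) . "\n";
```

With case folding off, the tag names are reported as student and name rather than STUDENT and NAME. Either approach is fine for XML whose tags are all lower case; turning the option off is safer if the XML uses mixed-case tag names.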
function foundAClosingTag($parser, $tag)
{
    global $currentTag;
    $currentTag = null;
}
foundAClosingTag sets the $currentTag variable to null, to indicate that
we are no longer inside that tag.
function foundSomeText($parser, $characters)
{
    global $data, $currentTag;
    $data[$currentTag][] = $characters;
}
Remember from foundAnOpeningTag(), above, that the global variable
$currentTag contains the current tag we are inside. Our aim is to add the
text that we have found to the appropriate array. The variable $data will contain
all the data parsed from the XML. $data is an associative array of arrays.
(An associative array is an array which can be indexed using non-numerical
indices, such as strings).
So $data["name"] will be an array of all the values contained within
the <name> tags, and $data["course"] will be an array of all the values
contained within the <course> tags.
So, we add the text ($characters) to the appropriate array, using the value of
$currentTag. This is done with the line:
$data[$currentTag][] = $characters;
The [] after $data[$currentTag] means "add a new element on to the end of the array", so in this case we will add $characters on to the end of the appropriate sub-array of $data. That's our complete parsing code!
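One caveat is worth knowing about. The example XML has no spaces or line breaks between its tags, but real XML usually does, and the parser reports that whitespace as text too. In addition, the parser is allowed to deliver a single piece of text in more than one call to the text handler. The following is a sketch of one way to cope with both; the $buffer variable is my own addition and this is only one possible approach.

```php
<?php
$data = array();
$currentTag = null;
$buffer = "";   // text collected so far for the current tag (my own addition)

// XML laid out with line breaks and indentation, as is common in practice.
$xml = "<students>\n" .
    "  <student>\n" .
    "    <name>Rob Stevenson</name>\n" .
    "    <course>Computer Network Management</course>\n" .
    "  </student>\n" .
    "  <student>\n" .
    "    <name>Jamie Bailey</name>\n" .
    "    <course>Computer Studies</course>\n" .
    "  </student>\n" .
    "</students>";

function foundAnOpeningTag($parser, $tag, $attributes)
{
    global $currentTag, $buffer;
    $currentTag = strtolower($tag);
    $buffer = "";   // start collecting afresh for the new tag
}

function foundSomeText($parser, $characters)
{
    global $currentTag, $buffer;
    // Ignore text that is not inside a tag we have just opened.
    if ($currentTag !== null) {
        $buffer .= $characters;
    }
}

function foundAClosingTag($parser, $tag)
{
    global $data, $currentTag, $buffer;
    // Store the collected text once, when the tag we were inside is closed.
    if ($currentTag !== null) {
        $data[$currentTag][] = trim($buffer);
    }
    $currentTag = null;
}

$parser = xml_parser_create();
xml_set_element_handler($parser, "foundAnOpeningTag", "foundAClosingTag");
xml_set_character_data_handler($parser, "foundSomeText");
xml_parse($parser, $xml, true);

print_r($data);
```

The design choice here is to store the text when the closing tag is met rather than as soon as some text arrives, so that each tag contributes exactly one entry to its sub-array.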
Now you understand how the parsing works, we can return to the code which actually prints out the results. This is here:
for ($count = 0; $count < count($data["name"]); $count++)
{
    echo "Name " . $data["name"][$count] . " Course " . $data["course"][$count] . "<br/>";
}
As already discussed, $data is an array of arrays, each sub-array containing
one aspect of the data: for example, $data["name"] is an array of all the names and
$data["course"] is an array of all the courses extracted from the XML. So we loop from 0
to the number of elements in one of the sub-arrays ($data["name"] has been picked, but it
doesn't matter which, as in this example every student has both a name and a course, so
the sub-arrays all have the same length) and display each member of each sub-array in turn.