Chroder
Introduction
This post shows how to parse XML into workable data using PHP4. We will work with an XML file that describes the layout of a book: the book's details and its chapters. Each chapter records the page it starts on and a short description. Creating XML documents is beyond the scope of this post, but W3Schools is a good place to start.
The XML Document
Create a new XML file (that is, a file with a .xml extension) with the following contents. We will use this file throughout the post.
Code:
<?xml version="1.0" encoding="ISO-8859-1" ?>
<!DOCTYPE book [
    <!ELEMENT book (chapter+)>
    <!ELEMENT chapter (desc+)>
    <!ELEMENT desc (#PCDATA)>
    <!ATTLIST book name CDATA "Unknown">
    <!ATTLIST book author CDATA "Unknown">
    <!ATTLIST book isbn CDATA "Unknown">
    <!ATTLIST chapter name CDATA "Unknown">
    <!ATTLIST chapter page CDATA "Unknown">
]>
<book name="The WindowsXP OS" author="Me" isbn="1234567890">
    <chapter name="Introduction" page="1">
        <desc>Book introduction</desc>
    </chapter>
    <chapter name="Who Should Read" page="3">
        <desc>About who the book is aimed to</desc>
    </chapter>
    <chapter name="Getting It" page="5">
        <desc>How to get the WindowsXP CD</desc>
    </chapter>
    <chapter name="Home or Pro" page="6">
        <desc>Which one is best for you?</desc>
    </chapter>
    <chapter name="Installing" page="7">
        <desc>The installation process</desc>
    </chapter>
    <chapter name="Making Users" page="10">
        <desc>How to add users to your system</desc>
    </chapter>
    <chapter name="Viruses" page="13">
        <desc>All about viruses and how to avoid them</desc>
    </chapter>
    <chapter name="Conclusion" page="15">
        <desc>Good-byes and further reading</desc>
    </chapter>
    <chapter name="Terms" page="16">
        <desc>Terms that you might not understand</desc>
    </chapter>
    <chapter name="Index" page="19">
        <desc>Index</desc>
    </chapter>
</book>
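Before moving on, it can be handy to check that an XML string is at least well-formed. Here is a minimal sketch of one way to do that with the same parser functions used later in this post. The function name describeXmlError and the two sample strings are made up for illustration. Note that PHP's XML parser extension does not validate, so the DTD rules above are not enforced; this only catches syntax problems such as mismatched tags.

```php
<?php
// Hypothetical helper (not part of the tutorial's classes).
// Returns an empty string when $xml is well-formed, otherwise a message
// with the parser's error text and the line where parsing stopped.
function describeXmlError($xml)
{
    $parser = xml_parser_create();

    // The third argument tells the parser this is the final chunk of input.
    if (xml_parse($parser, $xml, true)) {
        return '';
    }

    return xml_error_string(xml_get_error_code($parser))
        . ' on line ' . xml_get_current_line_number($parser);
}

// Cut-down samples in the shape of the book document above.
$good = '<book name="Demo"><chapter name="Intro" page="1"><desc>Hi</desc></chapter></book>';
$bad  = '<book name="Demo"><chapter name="Intro" page="1"><desc>Hi</chapter></book>';

echo describeXmlError($good) === '' ? "good sample: well-formed\n" : "good sample: broken\n";
echo 'bad sample: ', describeXmlError($bad), "\n";
```

The exact wording of the error message depends on the XML library your PHP build uses, so treat it as a hint for debugging rather than something to match against.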
The BookParser Class
Whenever I parse XML, I like to create a custom class. You could write a base class and extend it for different types of XML documents, but for this post we'll create just a simple BookParser. Here's the code; we'll talk about it afterwards.
PHP:
class BookParser
{
    var $book;
    var $xml;
    var $chapter;
    var $tag;

    //==========================================================================
    // BookParser
    // ----------
    // Constructor, set up parser
    //==========================================================================
    function BookParser()
    {
        $this->xml = xml_parser_create();
        xml_set_object($this->xml, $this);
        xml_set_element_handler($this->xml, 'elementStart', 'elementEnd');
        xml_set_character_data_handler($this->xml, 'characterData');
        xml_parser_set_option($this->xml, XML_OPTION_CASE_FOLDING, false);
    }

    //==========================================================================
    // parse
    // -----
    // Parse the document
    //==========================================================================
    function parse($path)
    {
        $fh = fopen($path, 'r') or die('Cannot open file `' . $path . '`');
        while($data = fread($fh, 4096))
        {
            xml_parse($this->xml, $data, feof($fh)) or die('Cannot parse file `' . $path . '`');
        }
        @fclose($fh);
    }

    //==========================================================================
    // elementStart
    // ------------
    // Handle opening tag
    //==========================================================================
    function elementStart($parser, $tag, $attr)
    {
        if($tag == 'book')
        {
            $this->book = array('name' => $attr['name'], 'author' => $attr['author'], 'isbn' => $attr['isbn']);
            $this->display();
        }
        elseif(empty($this->chapter) && $tag == 'chapter')
            $this->chapter = new ItemChapter($attr['name'], $attr['page']);
        else
            $this->tag = $tag;
    }

    //==========================================================================
    // elementEnd
    // ----------
    // Handle closing tag
    //==========================================================================
    function elementEnd($parser, $tag)
    {
        if($tag == 'chapter')
        {
            $this->chapter->display();
            unset($this->chapter);
        }
        $this->tag = '';
    }

    //==========================================================================
    // characterData
    // -------------
    // Handle element character data
    //==========================================================================
    function characterData($parser, $data)
    {
        if(!empty($this->chapter) && $this->tag == 'desc')
            $this->chapter->setDesc($data);
    }

    //==========================================================================
    // display
    // -------
    // Display book information
    //==========================================================================
    function display()
    {
        echo "<h1>{$this->book['name']}</h1>\n";
        echo "By <strong>{$this->book['author']}</strong> <small><em>(ISBN: {$this->book['isbn']})</em></small><br /><br /><br />\n\n\n";
    }
}
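If the order in which the three handlers fire is hard to picture, here is a small stand-alone sketch of the same event-driven idea. It is not the BookParser class: it uses plain functions instead of methods, an inline XML string instead of a file, and the names sketchStart, sketchEnd, sketchText and the $GLOBALS keys are all made up for this illustration. One thing it does differently on purpose is append character data instead of assigning it, because the parser may hand over the text of a single element in several pieces (for example around an entity such as &amp;).

```php
<?php
// Illustrative state, kept in $GLOBALS so the handler functions can reach it.
$GLOBALS['chapters'] = array(); // finished chapters
$GLOBALS['current']  = null;    // chapter currently being read
$GLOBALS['inDesc']   = false;   // true while inside a <desc> element

// Called for every opening tag, with its attributes as an array.
function sketchStart($parser, $tag, $attr)
{
    if ($tag == 'chapter') {
        $GLOBALS['current'] = array(
            'name' => $attr['name'],
            'page' => $attr['page'],
            'desc' => ''
        );
    } elseif ($tag == 'desc') {
        $GLOBALS['inDesc'] = true;
    }
}

// Called for every closing tag.
function sketchEnd($parser, $tag)
{
    if ($tag == 'desc') {
        $GLOBALS['inDesc'] = false;
    } elseif ($tag == 'chapter') {
        $GLOBALS['chapters'][] = $GLOBALS['current'];
        $GLOBALS['current'] = null;
    }
}

// Called for text between tags; may fire more than once per element.
function sketchText($parser, $data)
{
    if ($GLOBALS['inDesc'] && $GLOBALS['current'] !== null) {
        $GLOBALS['current']['desc'] .= $data; // append, do not overwrite
    }
}

$xml = '<book name="Demo">'
     . '<chapter name="Intro" page="1"><desc>Tips &amp; tricks</desc></chapter>'
     . '<chapter name="Index" page="9"><desc>Index</desc></chapter>'
     . '</book>';

$parser = xml_parser_create();
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0); // keep tag names as written
xml_set_element_handler($parser, 'sketchStart', 'sketchEnd');
xml_set_character_data_handler($parser, 'sketchText');
xml_parse($parser, $xml, true) or die('Cannot parse sample XML');

foreach ($GLOBALS['chapters'] as $c) {
    echo $c['page'], ': ', $c['name'], ' - ', $c['desc'], "\n";
}
```

The start handler opens a chapter, the text handler fills in its description, and the end handler files it away. BookParser follows the same pattern, with the state held in object properties instead of globals.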
Continued ...