Introduction

If you’re a PHP developer, or aspiring to be one, at some point you’ll need to parse XML. Two of the most common scenarios are consuming REST APIs that return XML and screen scraping (a type of data mining). Today we’ll cover the basics of XML parsing using PHP’s DomDocument class.

Let’s get started.

The Setup

First, we need to create a new directory containing two empty files: index.php and book.xml.

After those files are created we’re going to open our empty book.xml file and paste the following data into it:

We’re going to be using the small XML file above to test our parsing abilities in PHP. That’s it for the setup. Feel free to use your own XML file if you please.
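For readers who need sample data to follow along, one option is to generate a stand-in book.xml with a short script. The sketch below is an assumption, not the original listing: the catalog root, the author child element, the inline DTD, the three sample titles, and the helper file name make_book_xml.php are all illustrative. The only structure taken from this article is that each book element carries an id attribute holding the title.

```php
<?php
// make_book_xml.php (hypothetical helper, not part of the original setup)

// Stand-in data: a <catalog> root (assumed name) holding <book> elements
// whose id attribute carries the title. The inline DTD gives the parser
// something to validate against.
$xml = <<<'XML'
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE catalog [
  <!ELEMENT catalog (book+)>
  <!ELEMENT book (author)>
  <!ATTLIST book id CDATA #REQUIRED>
  <!ELEMENT author (#PCDATA)>
]>
<catalog>
  <book id="Dune">
    <author>Frank Herbert</author>
  </book>
  <book id="Neuromancer">
    <author>William Gibson</author>
  </book>
  <book id="Hyperion">
    <author>Dan Simmons</author>
  </book>
</catalog>
XML;

// Write the file next to this script.
// file_put_contents() returns the number of bytes written, or false on failure.
$bytes = file_put_contents(__DIR__ . '/book.xml', $xml);

echo $bytes === false
    ? "Could not write book.xml\n"
    : "Wrote book.xml ($bytes bytes)\n";
```

Pasting the XML between the two XML markers straight into book.xml by hand works just as well; the script only saves the copy step.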

The DomDocument Object

Since PHP 5, the bundled DOM extension has provided a class called DOMDocument (PHP class names are case-insensitive, so DomDocument works too). It’s designed to help us parse XML and HTML very easily.

Let’s start by initializing an instance of DomDocument and calling our first method within index.php:
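A minimal sketch of that first step might look like the following. It assumes book.xml sits in the directory the script is run from; the $loaded variable is an addition for illustration, used here only to keep the result of the call.

```php
<?php
// index.php

// Create an empty DOM document.
$dom = new DOMDocument();

// Ask the parser to validate against the file's DTD while loading.
$dom->validateOnParse = true;

// Read book.xml (path relative to the current working directory).
// load() returns false if the file is missing or is not well-formed XML.
$loaded = $dom->load('book.xml');
```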

Above we have initialized an instance of DomDocument and set its validateOnParse property to true. This asks the parser to validate the document against its DTD as it loads. Well-formedness is checked regardless of this setting, so malformed XML produces an error either way, and a file that declares no DTD will typically trigger a validation warning. In the last line of code we’ve written, we’re calling the load method on our object instance and passing in the parameter ‘book.xml’. This reads our XML file into the object so that we can work with the data further by calling more methods.
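To see what a parse error looks like without filling the page with PHP warnings, one approach is to let libxml collect its errors and inspect them afterwards. This is a sketch rather than part of the original walkthrough: the broken XML string is made up for the demonstration, and loadXML is used in place of load so that the example does not depend on a file.

```php
<?php
// Collect libxml errors in memory instead of emitting PHP warnings.
libxml_use_internal_errors(true);

$dom = new DOMDocument();

// Deliberately broken input: the <book> tag is never closed.
$ok = $dom->loadXML('<catalog><book id="Dune"></catalog>');

// Grab the collected errors, then reset libxml's error buffer.
$errors = libxml_get_errors();
libxml_clear_errors();

if (!$ok) {
    foreach ($errors as $error) {
        // Each LibXMLError object carries a message and a line number.
        echo 'Line ' . $error->line . ': ' . trim($error->message) . PHP_EOL;
    }
}
```

The same pattern applies to load with a file name: check the return value first, then read the collected errors if it is false.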

Let’s take a look:
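One way to write this step is sketched below as a complete index.php. It assumes book.xml is in the working directory and that each book element has an id attribute; printing one id per line with PHP_EOL is a choice made for this sketch.

```php
<?php
// index.php

$dom = new DOMDocument();
$dom->validateOnParse = true;
$dom->load('book.xml');

// Find every <book> element in the document, however deeply nested.
$books = $dom->getElementsByTagName('book');

// Print each book's id attribute on its own line.
foreach ($books as $book) {
    echo $book->getAttribute('id') . PHP_EOL;
}
```

From the directory that holds both files, this can be run on the command line with php index.php.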

Let’s walk through the code above. First, we’re calling the getElementsByTagName method on our object instance of DomDocument. This method lets us locate elements within our XML data by tag name, based on the parameter we’ve passed into it. Since we’ve passed in the parameter book, our method will find all of the book tags within the XML data.

Next, we create a foreach loop to cycle through each book tag and echo out the value of its id attribute. In this specific file, the id attribute holds the book’s title, so that’s valuable information to turn into a list.

When we run our code, we’ll see the book titles echoed out one after the other. This code works on any number of books in the file, as long as the tag structure stays the same.
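To illustrate that point, the loop can be wrapped in a small function and fed documents with different numbers of books. The function name bookIds and the inline XML strings are hypothetical, chosen only for this sketch.

```php
<?php
// Hypothetical helper: collect the id attribute of every <book>
// element found in an XML string.
function bookIds($xml)
{
    $dom = new DOMDocument();
    $dom->loadXML($xml);

    $ids = array();
    foreach ($dom->getElementsByTagName('book') as $book) {
        $ids[] = $book->getAttribute('id');
    }

    return $ids;
}

// The same code handles one book or several.
$one   = bookIds('<catalog><book id="Dune"/></catalog>');
$three = bookIds('<catalog><book id="Dune"/><book id="Hyperion"/><book id="Neuromancer"/></catalog>');

echo count($one) . ' book: ' . implode(', ', $one) . PHP_EOL;
echo count($three) . ' books: ' . implode(', ', $three) . PHP_EOL;
```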

Now that you’ve learned the basics of parsing XML in PHP with DomDocument, you should try a few parsing projects of your own. Remember, practice makes perfect.