Suppose you have an XML file like the one below and you want to “extract” data from it:

<?xml version="1.0"?>
<company name="Great Company, Inc">
	<address>Pearl Plaza Great Kuningan, Jakarta Indonesia</address>
	<employee>
		<firstname>Gardiary</firstname>
		<lastname>Rukhiat</lastname>
		<nickname>gardiary</nickname>
		<salary>3400000</salary>
	</employee>
	<employee>
		<firstname>Zinedine</firstname>
		<lastname>Zidane</lastname>
		<nickname>zidane</nickname>
		<salary>2300000</salary>
	</employee>
	<employee>
		<firstname>Justin</firstname>
		<lastname>Bieber</lastname>
		<nickname>jb</nickname>
		<salary>5000000</salary>
	</employee>
</company>

We are going to parse the XML file with a SAX parser. I’m using the “org.xml.sax.*” packages that ship with Java, so we don’t need any other library. With a SAX parser we basically need to write a handler object that extends the org.xml.sax.helpers.DefaultHandler class and overrides a few of its methods. I’m only going to override three of them: startElement(), characters() and endElement().

A SAX parser reads the XML from the start to the end of the document. Every time it finds an event (e.g. a tag, text, etc.) it calls our overriding methods, and for a simple element it does so in this order: startElement() –> characters() –> endElement(). When the parser finds an opening tag (“<some_tag>“) it calls startElement(), then it calls characters() so we can get the text inside the tag (“<some_tag>Some text“), and when it finds the closing tag (“<some_tag>Some text</some_tag>“) it calls endElement(). Keep in mind that characters() is also called for the whitespace between tags, and that the parser is allowed to deliver one piece of text in several calls. What you do in each of these methods is up to you.
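To see the order of the calls for yourself, here is a minimal sketch that records every callback in a list. The class name EventOrderDemo and the method trace() are made up for this illustration and are not part of the sample code. It parses a small XML string instead of a file and skips whitespace-only text.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class (not part of the sample code):
// it records the order in which the SAX callbacks are called.
class EventOrderDemo {
    static List<String> trace(String xml) throws Exception {
        final List<String> events = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attributes) {
                events.add("start:" + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // Only the slice [start, start + length) of ch belongs to this event
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) {
                    events.add("text:" + text);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                events.add("end:" + qName);
            }
        };
        // Parse from an in-memory string so the demo needs no file
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return events;
    }

    public static void main(String[] args) throws Exception {
        for (String event : trace("<employee><nickname>jb</nickname></employee>")) {
            System.out.println(event);
        }
    }
}
```

For nested tags the start events of the outer and inner element come first, then the text, then the end events in reverse order.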

In the sample code, which you can get from 4shared and ziddu, I provide two types of handler object: a simple one that only prints every piece of text as the methods are called, and another one that works like an XML-to-object scenario.

Simple DefaultHandler

package com.sample.sax;

import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class CompanySimpleHandler extends DefaultHandler {
    // Tag names are compared with equalsIgnoreCase(), so the case does not matter
    private final String COMPANY_TAG = "COMPANY";
    private final String ADDRESS_TAG = "ADDRESS";
    private final String EMPLOYEE_TAG = "EMPLOYEE";
    private final String FIRSTNAME_TAG = "FIRSTNAME";
    private final String LASTNAME_TAG = "LASTNAME";
    private final String NICKNAME_TAG = "NICKNAME";
    private final String SALARY_TAG = "SALARY";
    private final String NAME_ATTRIBUTE = "name";

    // Flags that tell characters() which element we are currently inside
    private boolean bCompany = false;
    private boolean bAddress = false;
    private boolean bEmployee = false;
    private boolean bFirstname = false;
    private boolean bLastname = false;
    private boolean bNickname = false;
    private boolean bSalary = false;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
        System.out.print("[" + qName + "]");

        if(qName.equalsIgnoreCase(COMPANY_TAG)) {
            bCompany = true;
            System.out.println("[name]" + attributes.getValue(NAME_ATTRIBUTE) + "[/name]");
        }

        if(qName.equalsIgnoreCase(ADDRESS_TAG)) {
            bAddress = true;
        }

        if(qName.equalsIgnoreCase(EMPLOYEE_TAG)) {
            bEmployee = true;
        }

        if(qName.equalsIgnoreCase(FIRSTNAME_TAG)) {
            bFirstname = true;
        }

        if(qName.equalsIgnoreCase(LASTNAME_TAG)) {
            bLastname = true;
        }

        if(qName.equalsIgnoreCase(NICKNAME_TAG)) {
            bNickname = true;
        }

        if(qName.equalsIgnoreCase(SALARY_TAG)) {
            bSalary = true;
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        System.out.println("[/" + qName + "]");

        if(qName.equalsIgnoreCase(ADDRESS_TAG)) {
            bAddress = false;
        }

        if(qName.equalsIgnoreCase(SALARY_TAG)) {
            bSalary = false;
        }

        if(qName.equalsIgnoreCase(NICKNAME_TAG)) {
            bNickname = false;
        }

        if(qName.equalsIgnoreCase(LASTNAME_TAG)) {
            bLastname = false;
        }

        if(qName.equalsIgnoreCase(FIRSTNAME_TAG)) {
            bFirstname = false;
        }

        if(qName.equalsIgnoreCase(EMPLOYEE_TAG)) {
            bEmployee = false;
        }

        if(qName.equalsIgnoreCase(COMPANY_TAG)) {
            bCompany = false;
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        // Print the text only when we are inside one of the leaf elements
        if(bAddress) {
            System.out.print(new String(ch, start, length));
        }

        if(bFirstname) {
            System.out.print(new String(ch, start, length));
        }

        if(bLastname) {
            System.out.print(new String(ch, start, length));
        }

        if(bNickname) {
            System.out.print(new String(ch, start, length));
        }

        if(bSalary) {
            System.out.print(new String(ch, start, length));
        }
    }
}

How to use:

package com.sample.sax;

import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

public class CompanySimpleMain {
    public static void main(String args[]) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            SAXParser saxParser = factory.newSAXParser();

            CompanySimpleHandler handler = new CompanySimpleHandler();

            saxParser.parse("company.xml", handler);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

Output:

[company][name]Great Company, Inc[/name]
[address]Pearl Plaza Great Kuningan, Jakarta Indonesia[/address]
[employee][firstname]Gardiary[/firstname]
[lastname]Rukhiat[/lastname]
[nickname]gardiary[/nickname]
[salary]3400000[/salary]
[/employee]
[employee][firstname]Zinedine[/firstname]
[lastname]Zidane[/lastname]
[nickname]zidane[/nickname]
[salary]2300000[/salary]
[/employee]
[employee][firstname]Justin[/firstname]
[lastname]Bieber[/lastname]
[nickname]jb[/nickname]
[salary]5000000[/salary]
[/employee]
[/company]

XML-to-Object Scenario
Company class:

package com.sample.sax.model;

import java.util.ArrayList;
import java.util.List;

public class Company {
    private String name;
    private String address;
    private List<Employee> employees = new ArrayList<Employee>();

    // add getters and setters for name, address and employees here

    public void addEmployee(Employee employee) {
        this.employees.add(employee);
    }
}

Employee class:

package com.sample.sax.model;

public class Employee {
    private String firstname;
    private String lastname;
    private String nickname;
    private Long salary;

    // add getters and setters for firstname, lastname, nickname and salary here
}

Handler:

package com.sample.sax;

import com.sample.sax.model.Company;
import com.sample.sax.model.Employee;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class CompanyHandler extends DefaultHandler {
    private final String COMPANY_TAG = "COMPANY";
    private final String ADDRESS_TAG = "ADDRESS";
    private final String EMPLOYEE_TAG = "EMPLOYEE";
    private final String FIRSTNAME_TAG = "FIRSTNAME";
    private final String LASTNAME_TAG = "LASTNAME";
    private final String NICKNAME_TAG = "NICKNAME";
    private final String SALARY_TAG = "SALARY";
    private final String NAME_ATTRIBUTE = "name";

    private Company company = null;
    private Employee employee = null;

    private boolean bCompany = false;
    private boolean bName = false;
    private boolean bAddress = false;
    private boolean bEmployee = false;
    private boolean bFirstname = false;
    private boolean bLastname = false;
    private boolean bNickname = false;
    private boolean bSalary = false;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
        if(qName.equalsIgnoreCase(COMPANY_TAG)) {
            bCompany = true;
            company = new Company();
            company.setName(attributes.getValue(NAME_ATTRIBUTE));   // getValue() returns null if the attribute is missing
        }

        if(qName.equalsIgnoreCase(ADDRESS_TAG)) {
            if(!bCompany || company==null) {
                throw new SAXException("Company is null");
            }
            bAddress = true;
        }

        if(qName.equalsIgnoreCase(EMPLOYEE_TAG)) {
            if(!bCompany || company==null) {
                throw new SAXException("Company is null");
            }
            bEmployee = true;
            employee = new Employee();
        }

        if(qName.equalsIgnoreCase(FIRSTNAME_TAG)) {
            if(!bEmployee || employee == null) {
                throw new SAXException("Employee is null");
            }
            bFirstname = true;
        }

        if(qName.equalsIgnoreCase(LASTNAME_TAG)) {
            if(!bEmployee || employee == null) {
                throw new SAXException("Employee is null");
            }
            bLastname = true;
        }

        if(qName.equalsIgnoreCase(NICKNAME_TAG)) {
            if(!bEmployee || employee == null) {
                throw new SAXException("Employee is null");
            }
            bNickname = true;
        }

        if(qName.equalsIgnoreCase(SALARY_TAG)) {
            if(!bEmployee || employee == null) {
                throw new SAXException("Employee is null");
            }
            bSalary = true;
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        if(qName.equalsIgnoreCase(ADDRESS_TAG)) {
            bAddress = false;
        }

        if(qName.equalsIgnoreCase(SALARY_TAG)) {
            bSalary = false;
        }

        if(qName.equalsIgnoreCase(NICKNAME_TAG)) {
            bNickname = false;
        }

        if(qName.equalsIgnoreCase(LASTNAME_TAG)) {
            bLastname = false;
        }

        if(qName.equalsIgnoreCase(FIRSTNAME_TAG)) {
            bFirstname = false;
        }

        if(qName.equalsIgnoreCase(EMPLOYEE_TAG)) {
            company.addEmployee(employee);
            bEmployee = false;
        }

        if(qName.equalsIgnoreCase(COMPANY_TAG)) {
            bCompany = false;
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        String address;
        String firstname;
        String lastname;
        String nickname;
        Long salary;

        if(bAddress) {
            address = new String(ch, start, length);
            company.setAddress(address);
        }

        if(bFirstname) {
            firstname = new String(ch, start, length);
            employee.setFirstname(firstname);
        }

        if(bLastname) {
            lastname = new String(ch, start, length);
            employee.setLastname(lastname);
        }

        if(bNickname) {
            nickname = new String(ch, start, length);
            employee.setNickname(nickname);
        }

        if(bSalary) {
            salary = Long.parseLong(new String(ch, start, length));
            employee.setSalary(salary);
        }
    }

    public Company getCompany() {
        return company;
    }
}
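One thing to watch out for in the handler above: it sets the value directly inside characters(). That works for the small file in this post, but a SAX parser is allowed to deliver the text of one element in several characters() calls, and many parsers do so around entities or character references. A common way to handle this is to collect the text in a StringBuilder and read it in endElement(). Below is a minimal sketch of that idea. The class BufferingHandler and its methods are hypothetical and not part of the sample code, and to keep it short it stores the values in a map instead of the Company and Employee objects.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical variant, not the handler from the sample code: it buffers text
// in a StringBuilder and reads the value only when the closing tag arrives.
class BufferingHandler extends DefaultHandler {
    private final StringBuilder text = new StringBuilder();
    private final Map<String, String> values = new LinkedHashMap<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        text.setLength(0); // a new element begins, drop the text collected so far
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length); // may be called several times for one text node
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        String value = text.toString().trim();
        if (!value.isEmpty()) {
            values.put(qName, value); // the complete text of the element
        }
        text.setLength(0);
    }

    Map<String, String> getValues() {
        return values;
    }

    static Map<String, String> parse(String xml) throws Exception {
        BufferingHandler handler = new BufferingHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.getValues();
    }

    public static void main(String[] args) throws Exception {
        // &#101; is the character reference for "e"
        Map<String, String> values = parse(
                "<employee><firstname>Zin&#101;dine</firstname><salary>2300000</salary></employee>");
        System.out.println(values.get("firstname"));
        System.out.println(Long.parseLong(values.get("salary")));
    }
}
```

In CompanyHandler the same idea would mean appending in characters() and calling employee.setFirstname(...), Long.parseLong(...) and so on in endElement(), once the whole text is available.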

How to use:

package com.sample.sax;

import com.sample.sax.model.Company;
import com.sample.sax.model.Employee;

import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

public class CompanyMain {
    public static void main(String args[]) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            SAXParser saxParser = factory.newSAXParser();

            CompanyHandler handler = new CompanyHandler();

            saxParser.parse("company.xml", handler);

            Company company = handler.getCompany();
            System.out.println("Company Name : " + company.getName());
            System.out.println("Company Address : " + company.getAddress());
            System.out.println("Employees : ");
            for(int i = 0; i < company.getEmployees().size(); i++) {
                Employee employee = company.getEmployees().get(i);
                System.out.println((i+1) + "\tName : " + employee.getFirstname() + " " + employee.getLastname());
                System.out.println("\tNickname : " + employee.getNickname());
                System.out.println("\tSalary : " + employee.getSalary());
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

Output:

Company Name : Great Company, Inc
Company Address : Pearl Plaza Great Kuningan, Jakarta Indonesia
Employees :
1	Name : Gardiary Rukhiat
	Nickname : gardiary
	Salary : 3400000
2	Name : Zinedine Zidane
	Nickname : zidane
	Salary : 2300000
3	Name : Justin Bieber
	Nickname : jb
	Salary : 5000000

Other implementations can look very different from this sample code. You can get the sample code from here and here.

4 Comments

  1. Thank you, this was helpful!

  2. This is a really good code to understand how the sax parser works and very tidy coding.
    Thank you very much for the post.

  3. This is very useful. Can you help: if we have French characters in the XML, how do we read and write those characters?

