XML is a versatile format for storing and transporting structured information. In Java it is widely used for configuration files, data interchange, and similar tasks. To work with XML documents effectively, Java provides a set of XML parsers that read XML content and expose it to the program for reading and editing. Any Java developer who works with XML should know these parsers.
The main approaches to XML processing in Java are:
- DOM (Document Object Model)
- SAX (Simple API for XML)
- StAX (Streaming API for XML)
- JAXB (Java Architecture for XML Binding)
Each serves different needs, from simple data extraction to complex document manipulation.
This article introduces these parsers and describes their key features and use cases.
Example XML File
Below is the XML file used by the Java programs in this article:
example.xml
<?xml version="1.0" encoding="UTF-8"?>
<Test>
<case id="1">
<domain>Java</domain>
<count>39</count>
</case>
<case id="2">
<domain>C/C++</domain>
<count>45</count>
</case>
</Test>
Types of XML Parsers
1. DOM (Document Object Model) Parser
Overview
The DOM parser reads the entire XML document and builds an in-memory tree representation, which can then be traversed and manipulated through the standard DOM APIs.
Features
- Tree View: Represents the XML document as a tree of nodes.
- Random Access: Any node can be accessed and modified at any time.
- Rich API: Provides methods for traversing, manipulating, and querying the document.
Use Cases
- Complex XML Documents: Useful for documents whose nodes are accessed and changed frequently.
- In-Memory Operations: Ideal for applications that need to load the entire XML structure into memory and manipulate it.
Pros and Cons
- Pros: Easy to use, with robust navigation and modification abilities.
- Cons: Memory intensive and inefficient for large documents.
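The random-access and modification features listed above can be sketched roughly as follows. This is only an illustration under stated assumptions: the XML is an inline copy of example.xml so the code runs without a file, and the class and helper names (DomRandomAccessSketch, countForCase, setCountForCase) are hypothetical, not part of any library.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class DomRandomAccessSketch {
    // Inline copy of example.xml (assumption: used so the sketch needs no file on disk).
    static final String XML =
        "<Test>"
        + "<case id=\"1\"><domain>Java</domain><count>39</count></case>"
        + "<case id=\"2\"><domain>C/C++</domain><count>45</count></case>"
        + "</Test>";

    // Parses the XML string into an in-memory DOM tree.
    static Document parse(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Hypothetical helper: finds the <case> element with the given id attribute, or null.
    static Element findCase(Document doc, String id) {
        NodeList cases = doc.getElementsByTagName("case");
        for (int i = 0; i < cases.getLength(); i++) {
            Element c = (Element) cases.item(i);
            if (id.equals(c.getAttribute("id"))) {
                return c;
            }
        }
        return null;
    }

    // Hypothetical helper: reads the <count> text of the matching <case>, or null if absent.
    static String countForCase(Document doc, String id) {
        Element c = findCase(doc, id);
        return c == null ? null : c.getElementsByTagName("count").item(0).getTextContent();
    }

    // Hypothetical helper: overwrites the <count> text of the matching <case> in the tree.
    static void setCountForCase(Document doc, String id, String newCount) {
        Element c = findCase(doc, id);
        if (c != null) {
            c.getElementsByTagName("count").item(0).setTextContent(newCount);
        }
    }

    public static void main(String[] args) {
        Document doc = parse(XML);
        System.out.println(countForCase(doc, "2")); // prints 45
        setCountForCase(doc, "2", "50");            // modify the tree in place
        System.out.println(countForCase(doc, "2")); // prints 50
    }
}
```

Because the whole tree stays in memory, the second lookup sees the modified value without re-reading the document.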
Example
Java
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class DomParserExample {
    public static void main(String[] args) {
        try {
            // Build the in-memory DOM tree from example.xml
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document document = builder.parse("example.xml");

            // Select every <domain> element in the document
            NodeList nodeList = document.getElementsByTagName("domain");
            for (int i = 0; i < nodeList.getLength(); i++) {
                Node node = nodeList.item(i);
                System.out.println(node.getTextContent());
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
Output:
Java
C/C++
2. SAX (Simple API for XML) Parser
Overview
The SAX parser is event-driven and reads the document serially. Unlike the DOM parser, it does not load the entire document into memory; instead, it reads the document sequentially and generates events, such as the start and end of elements, which custom event handlers can act upon.
Features
- Event-Driven: Parses the document and raises events for elements and their content as they are encountered.
- Low Memory Usage: Processes the document as a stream, so the whole document never has to be held in memory.
- Fast Performance: Quick for large documents due to sequential access.
Use Cases
- Large XML Documents: Suitable for large documents where only some parts need processing.
- Streaming Requirements: Ideal for applications that work with XML data in a streaming fashion.
Pros and Cons
- Pros: Low memory footprint, fast processing.
- Cons: Harder to implement because the handler must track its own state; no random access to elements.
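Since a SAX handler has no random access, it usually has to carry state between events. The sketch below shows one way to do that, assuming an inline copy of example.xml; the class names (SaxCollectSketch, CountHandler) and the helper countsByDomain are hypothetical. It buffers text in a StringBuilder because the parser is allowed to deliver one text node through several characters() calls.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxCollectSketch {
    // Inline copy of example.xml (assumption: used so the sketch needs no file on disk).
    static final String XML =
        "<Test>"
        + "<case id=\"1\"><domain>Java</domain><count>39</count></case>"
        + "<case id=\"2\"><domain>C/C++</domain><count>45</count></case>"
        + "</Test>";

    // Hypothetical handler that maps each <domain> value to its <count> value.
    static class CountHandler extends DefaultHandler {
        final Map<String, Integer> counts = new LinkedHashMap<>();
        private final StringBuilder text = new StringBuilder();
        private String currentDomain;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attrs) {
            text.setLength(0); // start collecting text for the new element
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length); // may be called more than once per text node
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("domain".equals(qName)) {
                currentDomain = text.toString().trim();
            } else if ("count".equals(qName)) {
                counts.put(currentDomain, Integer.parseInt(text.toString().trim()));
            }
            text.setLength(0);
        }
    }

    // Runs the SAX parser over the XML string and returns the collected map.
    static Map<String, Integer> countsByDomain(String xml) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            CountHandler handler = new CountHandler();
            parser.parse(new InputSource(new StringReader(xml)), handler);
            return handler.counts;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(countsByDomain(XML)); // prints {Java=39, C/C++=45}
    }
}
```

Only the small map and the current text buffer are kept in memory; the document itself is never stored.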
Example
Java
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class SaxParserExample {
    public static void main(String[] args) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            SAXParser saxParser = factory.newSAXParser();
            // The parser calls back into MyHandler as it reads example.xml
            saxParser.parse("example.xml", new MyHandler());
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

class MyHandler extends DefaultHandler {
    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        System.out.println("Start Element: " + qName);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        System.out.println("End Element: " + qName);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Also fires for the whitespace between elements
        System.out.println("Characters: " + new String(ch, start, length));
    }
}
Output:
Start Element: Test
Characters:
Start Element: case
Characters:
Start Element: domain
Characters: Java
End Element: domain
Characters:
Start Element: count
Characters: 39
End Element: count
Characters:
End Element: case
Characters:
Start Element: case
Characters:
Start Element: domain
Characters: C/C++
End Element: domain
Characters:
Start Element: count
Characters: 45
End Element: count
Characters:
End Element: case
Characters:
End Element: Test
3. StAX (Streaming API for XML) Parser
Overview
StAX uses a pull-parsing model. The application pulls events from the parser, such as the start and end of elements, only when it needs them, which gives it direct control over the parsing process.
Features
- Pull-Based: The application drives parsing by pulling the next event from the parser when it is ready for it.
- Low Memory Usage: Streams the document rather than building a tree, so it needs far less memory than DOM.
- Read and Write Support: Offers APIs for both reading (XMLStreamReader) and writing (XMLStreamWriter) XML. Reading is forward-only; the parser cannot move backward through the document.
Use Cases
- Balance of Memory and Ease of Use: Used in applications that need low memory consumption without the callback style of SAX.
- Complex Processing Logic: Ideal when the processing logic is complex, since the application decides when to read the next event.
Pros and Cons
- Pros: Well-balanced in memory usage and control; flexible.
- Cons: Forward-only, so earlier content cannot be revisited; requires more hand-written code than DOM or JAXB.
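The control that pull parsing gives can be sketched as follows: the loop asks for events only while it still needs them and returns as soon as the answer is found. The sketch assumes an inline copy of example.xml, and the names StaxPullSketch and countForDomain are hypothetical.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class StaxPullSketch {
    // Inline copy of example.xml (assumption: used so the sketch needs no file on disk).
    static final String XML =
        "<Test>"
        + "<case id=\"1\"><domain>Java</domain><count>39</count></case>"
        + "<case id=\"2\"><domain>C/C++</domain><count>45</count></case>"
        + "</Test>";

    // Hypothetical helper: returns the <count> that follows the first matching <domain>,
    // or -1 if no <case> has that domain. Parsing stops as soon as the value is found.
    static int countForDomain(String xml, String domain) {
        XMLStreamReader reader = null;
        try {
            reader = XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            boolean matched = false;
            while (reader.hasNext()) {
                if (reader.next() != XMLStreamConstants.START_ELEMENT) {
                    continue; // only element starts matter here
                }
                String name = reader.getLocalName();
                if ("domain".equals(name)) {
                    // getElementText() reads the text and moves to the matching end tag
                    matched = domain.equals(reader.getElementText().trim());
                } else if ("count".equals(name) && matched) {
                    return Integer.parseInt(reader.getElementText().trim());
                }
            }
            return -1;
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not parse XML", e);
        } finally {
            if (reader != null) {
                try {
                    reader.close();
                } catch (XMLStreamException ignored) {
                    // nothing useful to do if closing fails
                }
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(countForDomain(XML, "C/C++")); // prints 45
        System.out.println(countForDomain(XML, "Go"));    // prints -1
    }
}
```

With SAX the same early exit would require throwing an exception out of the handler; here it is an ordinary return from the loop.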
Example
Java
import java.io.FileInputStream;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

public class StaxParserExample {
    public static void main(String[] args) {
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            // Passing a byte stream lets the parser honor the encoding declared in the XML
            XMLStreamReader reader = factory.createXMLStreamReader(new FileInputStream("example.xml"));
            // Pull events one at a time until the document ends
            while (reader.hasNext()) {
                int event = reader.next();
                switch (event) {
                    case XMLStreamConstants.START_ELEMENT:
                        System.out.println("Start Element: " + reader.getLocalName());
                        break;
                    case XMLStreamConstants.END_ELEMENT:
                        System.out.println("End Element: " + reader.getLocalName());
                        break;
                    case XMLStreamConstants.CHARACTERS:
                        if (reader.hasText()) {
                            System.out.println("Characters: " + reader.getText().trim());
                        }
                        break;
                }
            }
            reader.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
Output:
Start Element: Test
Characters:
Start Element: case
Characters:
Start Element: domain
Characters: Java
End Element: domain
Characters:
Start Element: count
Characters: 39
End Element: count
Characters:
End Element: case
Characters:
Start Element: case
Characters:
Start Element: domain
Characters: C/C++
End Element: domain
Characters:
Start Element: count
Characters: 45
End Element: count
Characters:
End Element: case
Characters:
End Element: Test
4. JAXB (Java Architecture for XML Binding)
Overview
JAXB lets Java developers map Java objects to XML representations (marshalling) and convert XML back into Java objects (unmarshalling), which greatly simplifies moving data between the two.
Features
- Object-XML Mapping: Converts Java objects to XML and XML back to Java objects.
- Annotations: Annotations map Java classes and their members to XML elements and attributes.
- Binding: Handles the binding between Java objects and XML automatically.
Use Cases
- Data Binding: Ideal for applications that frequently convert between Java objects and XML.
- Configuration Files: Useful for applications that keep their configuration in XML.
Pros and Cons
- Pros: Simplifies object-XML conversion and keeps boilerplate code small.
- Cons: Less control over XML parsing compared to other methods.
Example
Java
// Note: javax.xml.bind was removed from the JDK in Java 11. On newer JDKs, add a JAXB
// dependency to the project (the Jakarta version uses the jakarta.xml.bind package).
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.bind.JAXBContext;
import javax.xml.bind.Marshaller;
import javax.xml.bind.Unmarshaller;
import javax.xml.bind.annotation.XmlRootElement;

public class JaxbExample {
    public static void main(String[] args) {
        try {
            JAXBContext context = JAXBContext.newInstance(Person.class);

            // Marshalling - Convert Java object to XML
            Person person = new Person("John", 30);
            StringWriter writer = new StringWriter();
            Marshaller marshaller = context.createMarshaller();
            // Pretty-print the XML with line breaks and indentation
            marshaller.setProperty(Marshaller.JAXB_FORMATTED_OUTPUT, true);
            marshaller.marshal(person, writer);
            System.out.println("XML Output:");
            System.out.println(writer.toString());

            // Unmarshalling - Convert XML to Java object
            StringReader reader = new StringReader(writer.toString());
            Unmarshaller unmarshaller = context.createUnmarshaller();
            Person unmarshalledPerson = (Person) unmarshaller.unmarshal(reader);
            System.out.println("Java Object:");
            System.out.println(unmarshalledPerson);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

// @XmlRootElement is needed so JAXB can marshal Person as the root <person> element
@XmlRootElement
class Person {
    private String name;
    private int age;

    // Default constructor is required for JAXB
    public Person() {}

    public Person(String name, int age) {
        this.name = name;
        this.age = age;
    }

    // JAXB binds each public getter/setter pair to an XML element by default
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    public int getAge() { return age; }
    public void setAge(int age) { this.age = age; }

    @Override
    public String toString() {
        return "Person{name='" + name + "', age=" + age + '}';
    }
}
Output:
XML Output:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<person>
    <age>30</age>
    <name>John</name>
</person>
Java Object:
Person{name='John', age=30}
Conclusion
Java's strength in handling XML lies in its rich set of tools for parsing and manipulating documents. The DOM parser suits work on an XML document held in memory; the SAX parser fits low-memory, high-performance environments; and the StAX parser strikes a balance between the two while keeping you in control of the parsing process. JAXB simplifies object-XML mapping, which makes it a good fit for applications that need frequent data binding. Choosing a parser means weighing the application's needs against factors such as document size, available memory, and the complexity of the XML processing involved. Knowing how these parsers differ will leave you better prepared to manage XML in your Java applications.