Use of XPath in Java: Part 1

XPath is a query language for traversing XML documents. It defines a syntax for addressing parts of an XML document and uses path expressions to navigate to them.

XPath expressions are of great help when we want to operate on a specific node in an XML document. I will create an XML file and then a Java class that reads the file and uses XPath expressions to navigate through its elements. The following procedure walks through some of these operations.

1. I have created an XML file named file.xml and placed it on my E drive. The content of file.xml is:

  <?xml version="1.0" encoding="UTF-8" standalone="no"?>
  <Technologies>
     <Node0>.Net
        <Node1>Abstract
        <Node2>Class
        <Node3>Oops</Node3></Node2></Node1>
        <Node1>CLR</Node1>
     </Node0>
     <Node0> Java</Node0>
     <Node0>PHP</Node0>
     <Node0>C#</Node0>
  </Technologies>

2. Now we will create a Java class file to read this XML file and then do various XPath operations on this file. The code to read the XML file is:

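  import java.io.File;
  import javax.xml.parsers.DocumentBuilder;
  import javax.xml.parsers.DocumentBuilderFactory;
  import org.w3c.dom.Document;

  // Parse the XML file into a DOM Document
  File f = new File("E:\\file.xml");
  DocumentBuilderFactory docFactory = DocumentBuilderFactory.newInstance();
  DocumentBuilder docBuilder = docFactory.newDocumentBuilder();
  Document doc = docBuilder.parse(f);
  // Merge adjacent text nodes so the tree is in a consistent form
  doc.getDocumentElement().normalize();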
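
For readers who do not have a file at E:\file.xml, the same parsing steps can be tried on an in-memory string. The following is a minimal sketch rather than the article's exact code: the class name ParseXmlDemo, the parse helper, and the shortened SAMPLE_XML are assumptions made for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class ParseXmlDemo {
    // Shortened stand-in for file.xml, inlined so no file on disk is needed (assumption).
    static final String SAMPLE_XML =
        "<Technologies><Node0>.Net<Node1>CLR</Node1></Node0><Node0>Java</Node0></Technologies>";

    // Parses an XML string into a DOM Document.
    // Checked exceptions are wrapped to keep the sketch short.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            doc.getDocumentElement().normalize();
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse(SAMPLE_XML);
        // Print the name of the document (root) element.
        System.out.println(doc.getDocumentElement().getNodeName());
    }
}
```

Running main should print the name of the root element, Technologies. Reading from a file instead only requires passing a File to DocumentBuilder.parse, as in the listing above.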

3. Now we will create an instance of XPath and then will use expressions to navigate through the XML.

  XPath xpath = XPathFactory.newInstance().newXPath();
  1. The following expression gets the root element of the XML file.
    NodeList name = (NodeList) xpath.compile("/*").evaluate(doc, XPathConstants.NODESET);
    This expression returns a NodeList, and we print the element name of the root node using a for loop. The code is as follows:
    for (int i = 0; i < name.getLength(); i++)
    {
       System.out.println(name.item(i).getNodeName());
    }

    The output after running this code is:

    Technologies
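
The root-element query above can also be run as a complete program. This is a minimal sketch under assumptions: the class name RootElementDemo, the rootNames helper, and the inlined SAMPLE_XML (a shortened stand-in for file.xml) are not from the original article.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class RootElementDemo {
    // Shortened stand-in for file.xml (assumption).
    static final String SAMPLE_XML =
        "<Technologies><Node0>.Net</Node0><Node0>Java</Node0></Technologies>";

    // Returns the element names matched by "/*", i.e. the root element.
    static List<String> rootNames(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.compile("/*")
                    .evaluate(doc, XPathConstants.NODESET);
            List<String> names = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                names.add(nodes.item(i).getNodeName());
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(rootNames(SAMPLE_XML)); // prints [Technologies]
    }
}
```

A well-formed XML document has exactly one root element, so the returned list holds a single name.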

  2. The following expression gets the text of all Node0 elements.
    NodeList name = (NodeList) xpath.compile("//Node0/text()").evaluate(doc, XPathConstants.NODESET);
    This expression returns a NodeList containing the text nodes that are direct children of each Node0 element, and we print their text content using a for loop. Whitespace-only text between child elements is also selected, so the output can contain blank lines. The code is as follows:
    for (int i = 0; i < name.getLength(); i++)
    {
       System.out.println(name.item(i).getTextContent());
    }

    The output after running this code is:

    .Net

    Java

    PHP

    C#
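
Because "//Node0/text()" also matches the whitespace between child elements, it can help to trim each value and skip the empty ones. The following is a minimal sketch of that idea, not part of the original article: the class name Node0TextDemo, the node0Texts helper, and the inlined SAMPLE_XML are assumptions for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class Node0TextDemo {
    // Same structure as file.xml, including line breaks between elements (assumption).
    static final String SAMPLE_XML =
          "<Technologies>\n"
        + "  <Node0>.Net\n"
        + "    <Node1>Abstract<Node2>Class<Node3>Oops</Node3></Node2></Node1>\n"
        + "    <Node1>CLR</Node1>\n"
        + "  </Node0>\n"
        + "  <Node0> Java</Node0>\n"
        + "  <Node0>PHP</Node0>\n"
        + "  <Node0>C#</Node0>\n"
        + "</Technologies>";

    // Returns the trimmed, non-empty direct text of every Node0 element.
    static List<String> node0Texts(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            doc.getDocumentElement().normalize();
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.compile("//Node0/text()")
                    .evaluate(doc, XPathConstants.NODESET);
            List<String> texts = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                String text = nodes.item(i).getTextContent().trim();
                if (!text.isEmpty()) { // skip whitespace-only text nodes
                    texts.add(text);
                }
            }
            return texts;
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(node0Texts(SAMPLE_XML)); // prints [.Net, Java, PHP, C#]
    }
}
```

Trimming here is a presentation choice. If leading or trailing whitespace is significant in your data, keep the raw text content instead.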

  3. The following expression gets all the child elements of each Node0 element.
    NodeList name = (NodeList) xpath.compile("//Node0/*").evaluate(doc, XPathConstants.NODESET);
    This expression returns a NodeList, and we print the text content of each node using a for loop. getTextContent() includes the text of nested elements, so the first Node1 also prints the text of Node2 and Node3. The code is as follows:
    for (int i = 0; i < name.getLength(); i++)
    {
       System.out.println(name.item(i).getTextContent());
    }

    The output after running this code is:

    Abstract
    Class
    Oops
    CLR
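
To make it clearer why Class and Oops appear in the output even though "//Node0/*" matches only the two Node1 elements, the sketch below runs the same expression on a compact copy of the document with no whitespace between tags. The class name ChildTextDemo, the childTexts helper, and SAMPLE_XML are assumptions introduced for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ChildTextDemo {
    // Compact stand-in for file.xml with no whitespace between tags (assumption).
    static final String SAMPLE_XML =
          "<Technologies><Node0>.Net"
        + "<Node1>Abstract<Node2>Class<Node3>Oops</Node3></Node2></Node1>"
        + "<Node1>CLR</Node1></Node0><Node0>Java</Node0></Technologies>";

    // Returns getTextContent() for every element child of a Node0 element.
    static List<String> childTexts(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.compile("//Node0/*")
                    .evaluate(doc, XPathConstants.NODESET);
            List<String> texts = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                // getTextContent() concatenates the text of all descendants.
                texts.add(nodes.item(i).getTextContent());
            }
            return texts;
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(childTexts(SAMPLE_XML)); // prints [AbstractClassOops, CLR]
    }
}
```

Only two nodes are returned. The first one carries the text of its nested Node2 and Node3 elements, which in file.xml are separated by line breaks and therefore print on separate lines.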

  4. The following expression gets the element names of all the children of each Node0 element.
    NodeList name = (NodeList) xpath.compile("//Node0/*").evaluate(doc, XPathConstants.NODESET);
    This expression returns a NodeList, and we print the element name of each child using a for loop. The code is as follows:
    for (int i = 0; i < name.getLength(); i++)
    {
       System.out.println(name.item(i).getNodeName());
    }

    The output after running this code is:

    Node1
    Node1
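
Only Node1 is listed because "/*" selects direct children, not deeper descendants such as Node2 and Node3. As a hedged illustration, the sketch below contrasts it with "//Node0//*", which uses the standard XPath descendant shorthand. The class name ChildNamesDemo, the names helper, and SAMPLE_XML are assumptions, and the second expression is not used in the original article.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ChildNamesDemo {
    // Compact stand-in for file.xml (assumption).
    static final String SAMPLE_XML =
          "<Technologies><Node0>.Net"
        + "<Node1>Abstract<Node2>Class<Node3>Oops</Node3></Node2></Node1>"
        + "<Node1>CLR</Node1></Node0><Node0>Java</Node0></Technologies>";

    // Returns the element names of all nodes matched by the given expression.
    static List<String> names(String xml, String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.compile(expression)
                    .evaluate(doc, XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getNodeName());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        // Direct element children of Node0 only.
        System.out.println(names(SAMPLE_XML, "//Node0/*"));  // prints [Node1, Node1]
        // All element descendants of Node0, at any depth.
        System.out.println(names(SAMPLE_XML, "//Node0//*")); // prints [Node1, Node2, Node3, Node1]
    }
}
```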

I will be coming up with XPath operators and some more expressions in future articles. I hope this article was useful to you.
