An Introductory Concept of XML


XML

XML stands for Extensible Markup Language. XML enables a new generation of Web-based applications for viewing and manipulating data.
The ADO.NET and XML APIs of the .NET Framework together provide a unified way to work with XML in the Microsoft .NET Framework. XML is a simplified subset of SGML; it is not an extension of HTML. Unlike HTML, which describes how data is displayed, XML stores and exchanges data. In HTML we can work only with a limited, predefined set of tags, while in XML we can define our own tags.

 

The following example shows how we can write an XML document.

The first line of an XML file is the XML declaration: <?xml version="1.0"?> This line states the XML version of the document. It does not execute anything; it tells the parser how to read the rest of the file.

<?xml version="1.0"?>
<bookstore>
    <book>
        <title>The Autobiography of Benjamin Franklin</title>
        <author>
            <first-name>Benjamin</first-name>
            <last-name>Franklin</last-name>
        </author>
        <price>8.99</price>
    </book>
    <book>
        <title>The Confidence Man</title>
        <author>
            <first-name>Herman</first-name>
            <last-name>Melville</last-name>
        </author>
        <price>11.99</price>
    </book>
</bookstore>


When we view this document in a browser, the output looks like this:

<?xml version="1.0"?>
<bookstore>
    <book>
        <title>The Autobiography of Benjamin Franklin</title>
        <author>
            <first-name>Benjamin</first-name>
            <last-name>Franklin</last-name>
        </author>
        <price>8.99</price>
    </book>
    <book>
        <title>The Confidence Man</title>
        <author>
            <first-name>Herman</first-name>
            <last-name>Melville</last-name>
        </author>
        <price>11.99</price>
    </book>
</bookstore>

 

The browser recognizes the XML and colors it appropriately. In the example above, <bookstore> is the root node. Every XML document must have exactly one root node: the content starts with the root's start tag and ends with the root's end tag; otherwise the XML parser reports an error.
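As a rough illustration of the root-node rule, the sketch below parses the bookstore document with Python's standard xml.etree.ElementTree module. Python and the helper name is_well_formed are assumptions made for this example only; the article itself targets the .NET classes.

```python
import xml.etree.ElementTree as ET

# The bookstore document from the example above.
BOOKSTORE_XML = """<?xml version="1.0"?>
<bookstore>
    <book>
        <title>The Autobiography of Benjamin Franklin</title>
        <author>
            <first-name>Benjamin</first-name>
            <last-name>Franklin</last-name>
        </author>
        <price>8.99</price>
    </book>
    <book>
        <title>The Confidence Man</title>
        <author>
            <first-name>Herman</first-name>
            <last-name>Melville</last-name>
        </author>
        <price>11.99</price>
    </book>
</bookstore>
"""


def is_well_formed(text):
    """Hypothetical helper: True if the parser accepts the text."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# fromstring returns the root element of the document.
root = ET.fromstring(BOOKSTORE_XML)
titles = [book.findtext("title") for book in root.findall("book")]

# A document with two top-level elements has no single root node.
two_roots = "<bookstore></bookstore><bookstore></bookstore>"

print(root.tag)
print(titles)
print(is_well_formed(two_roots))
```

The parser accepts the first document and hands back <bookstore> as the root, while the text with two top-level elements is rejected.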

Important Characteristics of XML

XML is case sensitive: <BOOKS> and <books> are two different tags. Every element must have a closing tag, and elements must be properly nested; otherwise the document is not well formed and the parser rejects it.

For example:

<b> <i>Bold and Italic </b></i>

This is not well formed, because <i> is opened inside <b> but closed after it. The well-formed nesting is:

<b>

  <i>Bold and Italic</i>

</b>
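The two rules above (case sensitivity and proper nesting) can be checked with any XML parser. Here is a minimal sketch using Python's standard xml.etree.ElementTree; the function name check is made up for this illustration.

```python
import xml.etree.ElementTree as ET


def check(text):
    """Hypothetical helper: report whether the parser accepts the text."""
    try:
        ET.fromstring(text)
        return "well formed"
    except ET.ParseError as error:
        return f"error: {error}"


bad_nesting = "<b><i>Bold and Italic</b></i>"    # <i> closed after <b>
good_nesting = "<b><i>Bold and Italic</i></b>"   # inner tag closed first
case_mismatch = "<books></BOOKS>"                # tags differ only in case

for sample in (bad_nesting, good_nesting, case_mismatch):
    print(sample, "->", check(sample))
```

Only the second sample is accepted; the other two stop the parser with a mismatched-tag error.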

 

XML Parser

 

An XML parser is a program that sits between an XML document and the application that uses the document. The job of a parser is to make sure the document is well formed and, when validation is enabled, that it meets the defined structure and constraints. We can define validation rules and constraints in a Document Type Definition (DTD).

XML Namespaces

System.Xml is the overall namespace for the XML classes that provide standards-based support for processing XML. It also covers support for XML Path Language (XPath) queries and Extensible Stylesheet Language Transformations (XSLT).

The .NET XML stack is partitioned over five important namespaces.

System.Xml, which provides standards-based support for processing XML.

System.Xml.XPath, which contains the XPath parser and evaluation engine.

System.Xml.Xsl, which provides support for XSLT transforms.

System.Xml.Schema, which contains the classes that provide standards-based support for schemas created with the XML Schema definition language (XSD).

System.Xml.Serialization, which contains classes that are used to serialize objects into XML format documents or streams.

These namespaces are packaged inside the System.Xml.dll assembly.

Because users can define an XML document's element names themselves, it is likely that different developers will use the same name for different things. XML namespaces allow developers to qualify their names and avoid conflicts with element names chosen by other developers. With the help of a URI, a namespace ensures the uniqueness of XML element and attribute names.

The namespace an element belongs to is determined by the URI of the xmlns declaration that is in scope for it. The following example shows an XML document with a default namespace.

 

<?xml version="1.0"?>
<book xmlns="http://wwww.c-sharpcorner.com/Images">
   <title>The Autobiography of Benjamin Franklin</title>
   <author>
       <first-name>Benjamin</first-name>
       <last-name>Franklin</last-name>
   </author>
   <price>8.99</price>
</book>
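To see what the namespace declaration changes, the sketch below parses a shortened version of the document with Python's standard xml.etree.ElementTree, which exposes a qualified name as {URI}local-name. The prefix n used in the lookup is an arbitrary choice for this example and does not appear in the document.

```python
import xml.etree.ElementTree as ET

# Namespace URI copied from the example above.
NS = "http://wwww.c-sharpcorner.com/Images"

doc = (
    f'<book xmlns="{NS}">'
    "<title>The Autobiography of Benjamin Franklin</title>"
    "</book>"
)
book = ET.fromstring(doc)

# The element name now includes the namespace URI.
print(book.tag)

# A lookup by the bare name finds nothing, because the child element
# is called {URI}title, not title.
unqualified = book.find("title")

# A lookup that maps a prefix to the same URI succeeds.
qualified = book.find("n:title", {"n": NS})
print(unqualified, qualified.text)
```

Two documents that both use a <title> element but declare different namespace URIs therefore produce distinct element names, which is how the conflict is avoided.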

 

DTD (Document Type Definition)

 

A Document Type Definition (DTD) defines a document's structure with a list of legal elements and attributes. We can declare a DTD inline or as a link to an external file, and we can use DTDs to validate XML documents. This example shows a DTD:

<!ELEMENT Two (#PCDATA)>

<!ELEMENT One (Two)>

<!ATTLIST One c CDATA #REQUIRED>

This DTD defines a data format. The following XML is valid because the tag <Two> is inside the tag <One>, and <One> carries the required attribute c:

<One c="Attrib">

  <Two> Text</Two>

</One>

An XML schema describes the relationship between a document's elements and attributes. XML schemas describe the rules for elements and attributes, and these rules can be applied to any XML document. If an XML document references a schema and does not meet its criteria, a validating XML parser reports an error during parsing.

A schema starts with an <xsd:schema> tag and ends with an </xsd:schema> tag. All schema items have the prefix xsd. The declaration xmlns:xsd="http://www.w3.org/2001/XMLSchema" binds that prefix to the W3C XML Schema namespace, which indicates that the prefixed items should be interpreted according to the W3C schema definition.

A schema can also declare a target namespace, a URI that identifies the namespace of the elements the schema defines; the example below does not declare one. The following example shows a schema:

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <xsd:element name="bookstore" type="bookstoreType"/>

  <xsd:complexType name="bookstoreType">
    <xsd:sequence maxOccurs="unbounded">
      <xsd:element name="book" type="bookType"/>
    </xsd:sequence>
  </xsd:complexType>

  <xsd:complexType name="bookType">
    <xsd:sequence>
      <xsd:element name="title" type="xsd:string"/>
      <xsd:element name="author" type="authorName"/>
      <xsd:element name="price" type="xsd:decimal"/>
    </xsd:sequence>
    <xsd:attribute name="genre" type="xsd:string"/>
  </xsd:complexType>

  <xsd:complexType name="authorName">
    <xsd:sequence>
      <xsd:element name="first-name" type="xsd:string"/>
      <xsd:element name="last-name" type="xsd:string"/>
    </xsd:sequence>
  </xsd:complexType>

</xsd:schema>

 

Extensible Hypertext Markup Language (XHTML)

 

Extensible Hypertext Markup Language (XHTML) is a reformulation of HTML as an XML application. XHTML combines XML and HTML: it uses the elements of HTML 4.01 and the rules of XML to provide a more consistent, well-formed and organized language.

 

An XML Document and Its Items

 

An XML document is a set of elements in a well-formed standard format. A document is valid if it has a DTD associated with it and if it complies with the DTD.

An XML document has the following parts: 

  1. Prolog
  2. DOCTYPE declaration
  3. Start and end tags
  4. Comments
  5. Character and entity references
  6. Empty Elements
  7. Processing Instructions
  8. CDATA section
  9. Attributes
  10. White Spaces  

Prolog

 

The prolog part of a document appears before the root tag. The prolog information applies to the entire document. It can contain the XML declaration (with the version and character encoding), stylesheet references, comments and processing instructions. Let us see the prolog in the following example.

 

<?xml version="1.0"?>

<?xml-stylesheet type="text/xsl" href="books.xsl"?>

<!DOCTYPE StudentRecord SYSTEM "mydtd.dtd">

<!-- my comments -->

 

DOCTYPE Declaration

 

With the help of a DOCTYPE declaration, we can name the root element of our document and reference a DTD stored in an external file. A DOCTYPE declaration can name just the root element, or the root element together with a DTD. In a validating environment, a DOCTYPE declaration is required. In a DOCTYPE reference, we can even use a URI. See the following examples:

<!DOCTYPE rootElement>

or

<!DOCTYPE rootElement SYSTEM "URIreference">

or

<!DOCTYPE StudentRecord SYSTEM "mydtd.dtd">

 

Start and End Tags

Start and end tags are the heart of the XML language. If we open an element with a start tag such as <book> in our XML file, we must close it with the matching end tag </book>.

 

Comments

Using comments in code is a good programming practice. Comments help us understand our code. We can write a comment like this:

<!--     My Comment Here -->

XML parsers ignore the comments.

 

CDATA Section

 

What if we want to use the < and > characters in our XML file, but not as part of a tag? We cannot write them directly, because the XML parser will interpret them as the start and end of a tag. A CDATA section provides a solution: inside it we can use XML markup characters in our documents and have the XML parser treat them as plain text. The following example shows how to use CDATA.

<![CDATA[I want to use < and > characters]]>

The parser will treat those characters as data. Another good example of CDATA is:

<![CDATA[<Title>This is the title of a page</Title>]]>

In this case, the parser will treat <Title> and </Title> as data, not as markup tags.
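The effect of a CDATA section can be observed from the parser's side. The following sketch uses Python's standard xml.etree.ElementTree; the wrapper element names note and page are invented for the example, since a CDATA section has to live inside some element.

```python
import xml.etree.ElementTree as ET

# Markup characters inside CDATA arrive as ordinary text.
with_cdata = ET.fromstring(
    "<note><![CDATA[I want to use < and > characters]]></note>"
)

# The same text written with entity references instead of CDATA.
escaped = ET.fromstring(
    "<note>I want to use &lt; and &gt; characters</note>"
)

# A whole tag inside CDATA is text, not a child element.
markup_as_text = ET.fromstring(
    "<page><![CDATA[<Title>This is the title of a page</Title>]]></page>"
)

# Without CDATA or escaping, a bare < in text is an error.
try:
    ET.fromstring("<note>I want to use < and > characters</note>")
    raw_accepted = True
except ET.ParseError:
    raw_accepted = False

print(with_cdata.text)
print(markup_as_text.text, len(markup_as_text))
print(raw_accepted)
```

CDATA and the &lt; / &gt; entity references give the application the same text; CDATA is mainly convenient when a block contains many markup characters.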

 

Empty Elements

 

An empty element has no content between its start and end tags. It can be written as a start tag followed immediately by the end tag, or in the short form that starts with < and ends with />. For example:

<Name></Name>

<IMG SRC="img.jpg"/>

<tagname/>

All of these are examples of empty elements. The <IMG> element specifies an inline image, and the SRC attribute specifies the image's location. The image can be in any format, though browsers generally support only GIF, JPEG and PNG images.
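A small sketch, again using Python's standard xml.etree.ElementTree as an assumed stand-in for any XML parser, shows that the two ways of writing an empty element are equivalent, and that the HTML habit of leaving a tag such as IMG unclosed is not accepted in XML.

```python
import xml.etree.ElementTree as ET

# Start tag followed directly by the end tag.
pair = ET.fromstring("<Name></Name>")

# Short form of the same empty element.
self_closing = ET.fromstring("<Name/>")

# An empty element can still carry attributes.
image = ET.fromstring('<IMG SRC="img.jpg"/>')

# HTML-style unclosed tag: not well formed as XML.
try:
    ET.fromstring('<IMG SRC="img.jpg">')
    unclosed_accepted = True
except ET.ParseError:
    unclosed_accepted = False

# Both forms have no text and no child elements.
print(pair.text, len(pair))
print(self_closing.text, len(self_closing))
print(image.get("SRC"), unclosed_accepted)
```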

 

Processing Instructions

 

Processing instructions (PIs) play a vital role in XML processing. A PI carries instructions that are read by the parser and passed on to other programs. The XML declaration uses the same syntax:

<?xml version="1.0"?>

All PIs start with <? and end with ?>
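To show how a program receives a processing instruction, the sketch below uses Python's standard xml.dom.minidom, which keeps PIs as nodes of the document. The stylesheet name books.xsl is taken from the prolog example; the empty <bookstore/> root is only there to make the snippet a complete document.

```python
from xml.dom import minidom

doc = minidom.parseString(
    '<?xml version="1.0"?>'
    '<?xml-stylesheet type="text/xsl" href="books.xsl"?>'
    "<bookstore/>"
)

# Collect the processing-instruction nodes at the top of the document.
pis = [
    node
    for node in doc.childNodes
    if node.nodeType == node.PROCESSING_INSTRUCTION_NODE
]

# A PI is split into a target (the name after <?) and its data.
pi = pis[0]
print(len(pis))
print(pi.target)
print(pi.data)
```

Only the xml-stylesheet instruction shows up as a PI node here; the XML declaration is consumed by the parser itself.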

 

Attributes

 

Attributes let you add extra information to an element without creating another element. An attribute is a name and value pair. Both the name and the value must be present. The attribute value must be enclosed in quotes (double or single); otherwise the parser reports an error. Let us look at the attributes of a table tag in the following example:

 <table border ="1" width="43%">

  <tr>

     <td width ="50%"> Row1, Column1</td>

     <td width ="50%"> Row1, Column2</td>

   </tr>

   <tr>

     <td width ="50%"> Row2, Column1</td>

     <td width ="50%"> Row2, Column2</td>

   </tr>

</table>
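The attribute rules can be tried out with a short sketch based on Python's standard xml.etree.ElementTree. The table is a shortened version of the one above, and the helper name parses is made up for the example.

```python
import xml.etree.ElementTree as ET

table = ET.fromstring(
    """<table border ="1" width="43%">
  <tr>
    <td width ='50%'>Row1, Column1</td>
  </tr>
</table>"""
)

# Attributes are exposed as name/value pairs.
print(table.attrib)

# Single quotes are accepted as well as double quotes.
first_cell = table.find("tr/td")
print(first_cell.get("width"))


def parses(text):
    """Hypothetical helper: True if the parser accepts the text."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# An unquoted value and a name without a value are both errors.
unquoted_ok = parses("<table border=1/>")
valueless_ok = parses("<table border/>")
print(unquoted_ok, valueless_ok)
```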

 

White Spaces

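XML preserves white space in element content, while the parser normalizes white space inside attribute values. A browser that displays the raw XML therefore shows the white space in our document. The XML parser reports all white space found in the document content to the application. However, white space is not allowed before the XML declaration. If white space appears before the declaration, the parser no longer recognizes it as the XML declaration; it reads it as a processing instruction with the reserved name xml and reports an error.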
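These white space rules can be observed with a short sketch, written here with Python's standard xml.etree.ElementTree as an assumed example parser. The element names title and a are placeholders.

```python
import xml.etree.ElementTree as ET

# White space in element content is passed to the application unchanged.
padded = ET.fromstring("<title>  The Confidence Man  </title>")
print(repr(padded.text))

# Inside an attribute value, a line break is normalized to a space.
attr = ET.fromstring('<a v="one\ntwo"/>')
print(repr(attr.get("v")))

# A single space before the XML declaration makes the document invalid.
try:
    ET.fromstring(' <?xml version="1.0"?><a/>')
    leading_space_ok = True
except ET.ParseError:
    leading_space_ok = False
print(leading_space_ok)
```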
