Reading XML Files
The following are ways to read and navigate the content of an XML file:
Using XmlDocument: You can load the document using the XmlDocument class mentioned
earlier. This holds all the XML data in memory once you call Load() to retrieve it from a file or
stream. It also allows you to modify that data and save it back to the file later. The XmlDocument
class implements the full XML DOM.
Using XPathNavigator: You can load the document into an XPathDocument and move through it with
an XPathNavigator (both are located in the System.Xml.XPath namespace). Like the XmlDocument,
this approach holds the entire XML document in memory. However, it offers a slightly faster,
more streamlined model than the XML DOM, along with enhanced searching features. Unlike the
XmlDocument, a navigator created over an XPathDocument is read-only, so it doesn’t provide the
ability to make changes and save them.
Using XmlTextReader: You can read the document one node at a time using the XmlTextReader
class. This is the least expensive approach in terms of server resources, but it forces you to examine the data sequentially from start to finish.
The following sections demonstrate each of these approaches to loading the video list XML document.
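As a quick preview of the three approaches listed above, the following sketch counts the Title elements in a small document three different ways. The sample XML, the class name XmlReadingPreview, and the method names are assumptions made for illustration; the actual video list file may be structured differently.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.XPath;

public static class XmlReadingPreview
{
    // Hypothetical sample standing in for the video list file. The element
    // and attribute names are assumptions, not the real document.
    public const string SampleXml =
        "<?xml version=\"1.0\"?>" +
        "<VideoList>" +
        "<Video ID=\"1\"><Title>First</Title></Video>" +
        "<Video ID=\"2\"><Title>Second</Title></Video>" +
        "</VideoList>";

    // 1. XmlDocument: the whole tree is held in memory and can be modified.
    public static int CountWithXmlDocument(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);
        return doc.GetElementsByTagName("Title").Count;
    }

    // 2. XPathNavigator over an XPathDocument: in memory, read-only, queried with XPath.
    public static int CountWithXPathNavigator(string xml)
    {
        XPathDocument doc = new XPathDocument(new StringReader(xml));
        XPathNavigator nav = doc.CreateNavigator();
        return nav.Select("/VideoList/Video/Title").Count;
    }

    // 3. XmlTextReader: forward-only, one node at a time, nothing cached.
    public static int CountWithXmlTextReader(string xml)
    {
        int count = 0;
        using (XmlTextReader reader = new XmlTextReader(new StringReader(xml)))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "Title")
                    count++;
            }
        }
        return count;
    }

    public static void Main()
    {
        Console.WriteLine(CountWithXmlDocument(SampleXml));
        Console.WriteLine(CountWithXPathNavigator(SampleXml));
        Console.WriteLine(CountWithXmlTextReader(SampleXml));
    }
}
```

All three methods report the same count for the sample document. The difference lies in what each keeps in memory and whether you can move backward or make changes.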
Using the XML DOM
The XmlDocument stores information as a tree of nodes. A node is the basic ingredient of an
XML file and can be an element, an attribute, a comment, or a value in an element. A separate
XmlNode object represents each node, and nodes are grouped together in collections.
You can retrieve the first level of nodes through the XmlDocument.ChildNodes property. In this
example, that property provides access to the XML declaration, a comment, and the root <VideoList>
element. The <VideoList> element contains other child nodes, and these nodes contain still more
nodes and the actual values. To drill down through all the layers of the tree, you need to use
recursive logic, as shown in this example.
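To see what the first level of the tree looks like before any recursion takes place, consider this minimal sketch. It assumes a tiny hypothetical document supplied as a string (the real file's content may differ), and the class name TopLevelNodes is made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class TopLevelNodes
{
    // Lists the node type and name of each node directly under the document.
    public static List<string> Describe(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);

        List<string> result = new List<string>();
        foreach (XmlNode node in doc.ChildNodes)
            result.Add(node.NodeType + ": " + node.Name);
        return result;
    }

    public static void Main()
    {
        // Hypothetical sample: declaration, comment, and root element.
        string xml = "<?xml version=\"1.0\"?><!-- sample --><VideoList><Video /></VideoList>";
        foreach (string line in Describe(xml))
            Console.WriteLine(line);
        // Prints:
        // XmlDeclaration: xml
        // Comment: #comment
        // Element: VideoList
    }
}
```

If you need only the root element, the XmlDocument.DocumentElement property returns it directly, without going through the ChildNodes collection.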
When the example page loads, it creates an XmlDocument object and calls the Load() method,
which retrieves the XML data from the file. It then calls a recursive function in the page class named
GetChildNodesDescr(). GetChildNodesDescr() takes an XmlNodeList object and the index of the
nesting level as input. It then returns a string with the content for those nodes and all their
child nodes and attributes.
private void Page_Load(object sender, System.EventArgs e)
{
    string xmlFile = Server.MapPath("VideoList.xml");

    // Load the XML file in an XmlDocument.
    XmlDocument doc = new XmlDocument();
    doc.Load(xmlFile);

    // Write the description text.
    XmlText.Text = GetChildNodesDescr(doc.ChildNodes, 0);
}
When the Page.Load event handler calls GetChildNodesDescr(), it passes an XmlNodeList
object that represents the first level of nodes. (The XmlNodeList contains a collection of XmlNode
objects, one for each node.) The code also passes 0 as the second argument of
GetChildNodesDescr() to indicate that this is the first level of the structure. The string returned
by the GetChildNodesDescr() method is then shown on the page using a Literal control.
The interesting part is the GetChildNodesDescr() method. It first creates a string with three
spaces for each indentation level that it will later use as a prefix for each line added to the final
HTML text.
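private string GetChildNodesDescr(XmlNodeList nodeList, int level)
{
    // Build the prefix: three non-breaking spaces per nesting level
    // (ordinary spaces would collapse when the browser renders the HTML).
    string indent = "";
    for (int i = 0; i < level; i++)
        indent += "&nbsp;&nbsp;&nbsp;";
    ...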
Next, the GetChildNodesDescr() method cycles through all the child nodes of the XmlNodeList.
For the first call, these nodes include the XML declaration, the comment, and the <VideoList> element.
An XmlNode object exposes properties such as NodeType, which identifies the type of item
(for example, Comment, Element, Attribute, CDATA, Text, or EndElement), as well as Name and
Value. The code checks for node types that are relevant in this example and adds that
information to the string, as shown here:
    ...
    StringBuilder str = new StringBuilder("");
    foreach (XmlNode node in nodeList)
    {
        switch (node.NodeType)
        {
            case XmlNodeType.XmlDeclaration:
                str.Append("XML Declaration: <b>");
                str.Append(node.Name);
                str.Append(" ");
                str.Append(node.Value);
                str.Append("</b><br />");
                break;
            case XmlNodeType.Element:
                str.Append(indent);
                str.Append("Element: <b>");
                str.Append(node.Name);
                str.Append("</b><br />");
                break;
            case XmlNodeType.Text:
                str.Append(indent);
                str.Append(" - Value: <b>");
                str.Append(node.Value);
                str.Append("</b><br />");
                break;
            case XmlNodeType.Comment:
                str.Append(indent);
                str.Append("Comment: <b>");
                str.Append(node.Value);
                str.Append("</b><br />");
                break;
        }
        ...
Note that not all types of nodes have a name or a value. For example, for an element such as
Title, the Name is Title, but the Value is null, because the text is stored in the element’s
child Text node.
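The relationship between an element and the Text node that carries its content can be checked with a short sketch like the following. The one-element document and the helper name ReadElementText are assumptions made for illustration.

```csharp
using System;
using System.Xml;

public static class NameAndValue
{
    // Returns the text held by the element's first child node, or null if
    // the first child is missing or is not a Text node.
    public static string ReadElementText(XmlNode element)
    {
        XmlNode child = element.FirstChild;
        return (child != null && child.NodeType == XmlNodeType.Text) ? child.Value : null;
    }

    public static void Main()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml("<Title>Sample</Title>");   // hypothetical content

        XmlNode element = doc.DocumentElement;
        XmlNode text = element.FirstChild;

        Console.WriteLine(element.Name);            // Title
        Console.WriteLine(element.Value == null);   // True
        Console.WriteLine(text.NodeType);           // Text
        Console.WriteLine(text.Value);              // Sample
        Console.WriteLine(ReadElementText(element)); // Sample
    }
}
```

When you only need the combined text of an element and don't care about the individual nodes, the InnerText property is usually the more convenient choice.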
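Next, the code checks whether the current node can have attributes by testing that its Attributes
collection is not null (the collection is null for node types that cannot carry attributes, such
as comments and text). If the collection exists, the attributes are processed with a nested
foreach loop: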
        ...
        if (node.Attributes != null)
        {
            foreach (XmlAttribute attrib in node.Attributes)
            {
                str.Append(indent);
                str.Append(" - Attribute: <b>");
                str.Append(attrib.Name);
                str.Append("</b> Value: <b>");
                str.Append(attrib.Value);
                str.Append("</b><br />");
            }
        }
        ...
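The null test matters because only element nodes expose an attribute collection. The following sketch illustrates the difference, using a hypothetical two-node document and a made-up helper named DescribeAttributes.

```csharp
using System;
using System.Text;
using System.Xml;

public static class AttributeCheck
{
    // Returns "name=value;" pairs for an element, or a marker string for
    // node types that have no attribute collection at all.
    public static string DescribeAttributes(XmlNode node)
    {
        if (node.Attributes == null)
            return "(no attribute collection)";

        StringBuilder sb = new StringBuilder();
        foreach (XmlAttribute attrib in node.Attributes)
            sb.Append(attrib.Name).Append('=').Append(attrib.Value).Append(';');
        return sb.ToString();
    }

    public static void Main()
    {
        XmlDocument doc = new XmlDocument();
        // Hypothetical content: a comment followed by the root element.
        doc.LoadXml("<!-- note --><Video ID=\"1\" Category=\"Drama\" />");

        Console.WriteLine(DescribeAttributes(doc.FirstChild));      // (no attribute collection)
        Console.WriteLine(DescribeAttributes(doc.DocumentElement)); // ID=1;Category=Drama;
    }
}
```

An element without attributes still returns an empty (non-null) collection, so the foreach loop simply runs zero times in that case.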
Lastly, if the node has child nodes (according to its HasChildNodes property), the code recursively
calls the GetChildNodesDescr function, passing to it the current node’s ChildNodes collection
and the current indent level plus 1, as shown here:
        ...
        if (node.HasChildNodes)
            str.Append(GetChildNodesDescr(node.ChildNodes, level + 1));
    }
    return str.ToString();
}
Once every node in the list has been processed, the foreach block ends, and the method returns
the content of the StringBuilder object.
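Because the method is presented in fragments, it can be useful to see the pieces assembled. The following is a sketch of the same recursive walk adapted to run as a console program: it emits plain text instead of HTML, uses ordinary spaces for indentation, and reads a small hypothetical document from a string rather than from VideoList.xml. The class and method names are assumptions for this example.

```csharp
using System;
using System.Text;
using System.Xml;

public static class NodeTreeDump
{
    public static string Describe(XmlNodeList nodeList, int level)
    {
        // Three spaces per nesting level.
        string indent = new string(' ', level * 3);
        StringBuilder str = new StringBuilder();

        foreach (XmlNode node in nodeList)
        {
            switch (node.NodeType)
            {
                case XmlNodeType.XmlDeclaration:
                    str.AppendLine("XML Declaration: " + node.Name + " " + node.Value);
                    break;
                case XmlNodeType.Element:
                    str.AppendLine(indent + "Element: " + node.Name);
                    break;
                case XmlNodeType.Text:
                    str.AppendLine(indent + " - Value: " + node.Value);
                    break;
                case XmlNodeType.Comment:
                    str.AppendLine(indent + "Comment: " + node.Value);
                    break;
            }

            // Only element nodes have a non-null attribute collection.
            if (node.Attributes != null)
            {
                foreach (XmlAttribute attrib in node.Attributes)
                    str.AppendLine(indent + " - Attribute: " + attrib.Name +
                                   " Value: " + attrib.Value);
            }

            // Recurse into the children, one level deeper.
            if (node.HasChildNodes)
                str.Append(Describe(node.ChildNodes, level + 1));
        }
        return str.ToString();
    }

    public static void Main()
    {
        XmlDocument doc = new XmlDocument();
        // Hypothetical sample; the real video list may differ.
        doc.LoadXml("<?xml version=\"1.0\"?><!-- sample -->" +
                    "<VideoList><Video ID=\"1\"><Title>First</Title></Video></VideoList>");
        Console.Write(Describe(doc.ChildNodes, 0));
    }
}
```

In the page version, the node names and values are written straight into HTML, so encoding them first (for example, with Server.HtmlEncode()) is worth considering if the XML might contain markup characters.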