Those beginning with XML files are often perplexed by the way information is organized in them. All those tags raise basic questions such as "But why?", which furrow their foreheads, but they rarely voice those questions, possibly for fear of embarrassing themselves.

I've found it easy to explain the whys of XML files using an architectural example. So without further digressions, let us state the architectural problem that is being addressed:

Statement of the problem

Let's say that you and I have been assigned the task of placing a humongous number of loose sheets of paper (with information written on them) methodically into a warehouse -- which is nothing but a gigantic, empty aircraft hangar. To ease our task, we have been given access to an unlimited number of cardboard boxes of varying sizes and an unending supply of marker pens. It can also be assumed that we know the categories to which these sheets of paper could belong.

Step 1: An obvious question

Now, the first question -- an obvious one: What is the easiest way to put those sheets into the warehouse?

An obvious answer: Let us just dump them into the warehouse without using any cardboard boxes whatsoever. Let us represent that using the XML shown below:

 <warehouse>
   sheet1_a...
   accSheet1...
   sheet1_b...
   sheet1_c...
   accSheet2...
   accSheet3...
   accSheetn... 
   sheet1_n...
 </warehouse>

An XML file is a plain text file that can be edited using any text editor. Now we both know that a text file normally has no way of distinguishing conceptual containers from the contents of those containers. We'll soon see how neatly XML solves that problem.

In the above example, as we have dumped all the sheets directly into the warehouse, the only container we are talking about is the warehouse itself. You (as a human) may perceive some common pattern, but for the time being you have resisted organizing the sheets any further than just putting them on the floor of the warehouse.
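To see the "one container, everything on the floor" idea from a program's point of view, here is a minimal sketch that assumes Python's standard-library xml.etree.ElementTree parser. The variable names are illustrative, and the sheet names are the placeholders from the example above (shortened to three sheets).

```python
import xml.etree.ElementTree as ET

# A shortened copy of the flat warehouse; the sheet names are placeholders.
flat_xml = """<warehouse>
  sheet1_a...
  accSheet1...
  sheet1_b...
</warehouse>"""

root = ET.fromstring(flat_xml)

# The only container (element) is the warehouse itself.
container_name = root.tag

# No cardboard boxes yet, so the warehouse has no child elements.
box_count = len(root)

# Everything on the floor is plain text directly inside the one element.
sheets_on_floor = root.text.split()

print(container_name, box_count, sheets_on_floor)
```

The parser sees exactly one element and a run of text inside it; it has no way of knowing that some of those sheets belong together.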

XML has a neat way to demarcate the container. You give a name to each container and put that name (in the above example the name is 'warehouse') within angle brackets thus: <warehouse>. In XML, what we have been calling a container is conceptually known as an XML element, and its physical manifestation (i.e. the angle brackets et al.) is known as a tag. (Some XML texts erroneously mix up the term element with tag. The word element stands for the idea. The word tag is the manifestation of that idea in the text file.)

But wait a second; that still does not explain where the container ends within the text file, so that anyone who reads the file can distinguish containers from their contents. To do that, XML uses this convention: if <warehouse> starts a container, then </warehouse> ends it, i.e. the ending tag has the same name as the starting tag, but the name is preceded by a forward slash.
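The start-tag/end-tag convention is something a parser actually enforces. The sketch below, again assuming Python's standard-library xml.etree.ElementTree, checks three tiny documents; the helper name is_well_formed and the tag name hangar are made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# The container is opened and closed with matching names.
closed = "<warehouse>sheet1_a...</warehouse>"
# The container is opened but never closed.
unclosed = "<warehouse>sheet1_a..."
# The end tag carries a different name from the start tag.
mismatched = "<warehouse>sheet1_a...</hangar>"

for sample in (closed, unclosed, mismatched):
    print(sample, "->", is_well_formed(sample))
```

Only the first document is accepted; the other two leave the parser unable to tell where the warehouse ends.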

Step 2: The second obvious question

But that still does not explain how we got those sheets organized, does it? Merely dumping the sheets directly into the warehouse is only marginally better than letting them lie loose here, there and everywhere. The only thing that has happened so far is that we've collected those sheets into the warehouse.

At this point, it must be obvious to you that we need to use one or more of those cardboard boxes to categorize the sheets. So out comes a cardboard box, out comes a marker, and there you go scrawling an identifiable name on the box; and then in go the relevant sheets of paper. Phew... It looks like a lot of work, but it really isn't, and it has made the sheets a bit more organized. Granted: we may not have been able to put ALL the sheets into boxes -- some may still be lying outside the box (albeit still within the warehouse).

So how does that space-usage metaphor translate into XML?

Here it is now:

 <warehouse>
   sheet1_a...
   sheet1_b...
   sheet1_c...
   <accinfo>
     accSheet1...
     accSheet2...
     accSheet3...
     accSheetn...
   </accinfo>
   sheet1_n...
 </warehouse>

You may notice that sheet1_a, sheet1_b and sheet1_c are still lying on the floor of the warehouse. But now accSheet1, accSheet2, accSheet3, etc. have gone into their assigned container (i.e. the cardboard box), which has been tagged as "accinfo"; so I presume those acc... sheets have something to do with accounts.
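Once a box exists, a program can tell the boxed sheets apart from the ones on the floor. This is a minimal sketch assuming Python's standard-library xml.etree.ElementTree; the variable names are illustrative and the document is a shortened copy of the example above. One detail is specific to ElementTree rather than to XML itself: loose text before the first child is stored on the parent's text attribute, while loose text after a child is stored on that child's tail attribute.

```python
import xml.etree.ElementTree as ET

# A shortened copy of the warehouse with one labelled box.
nested_xml = """<warehouse>
  sheet1_a...
  sheet1_b...
  <accinfo>
    accSheet1...
    accSheet2...
  </accinfo>
  sheet1_n...
</warehouse>"""

root = ET.fromstring(nested_xml)

# Names of the boxes sitting directly inside the warehouse.
boxes = [child.tag for child in root]

# Sheets that went into the "accinfo" box.
box = root.find("accinfo")
in_box = box.text.split()

# Sheets still on the floor: text before the box plus text after it.
on_floor = root.text.split() + box.tail.split()

print("boxes:", boxes)
print("in accinfo:", in_box)
print("on the floor:", on_floor)
```

The same sheets are present as before, but the file now records which of them belong together.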

One of the beautiful concepts in XML is that it gives the XML author the capability to describe the logical structure of the thingamajig that is being arranged, whatever it may be. This is markedly different from HTML, which often intermingles the structure of the data with the visual aspects of its presentation. If you don't understand all that now, don't worry. It will become easy shortly.