The XML SiteXML schema and XSL templates for a web site

Defining a minimal schema and templates for generating a web site

Now that we have everything we need to get started, let's define the XML Schema and XSL templates that we can use to generate a minimal web site. First of all, we need to decide on the layout of the website pages on disk.

Hierarchical vs. flat website page disk layout

There are two main choices when it comes to the disk layout of pages in a website:

  • The first style is a flat model, where all the pages are placed in the website root (with different names, of course). There is only one index file and the internal website hierarchy is accomplished through navigation only.
  • The second style is a hierarchical one, where pages are grouped in directories according to various criteria like topic. Each folder has an index which serves as the main portal of that section.

We will use the flat file system model, mostly because it is simpler to implement and, for a small site like ours, it has no significant disadvantages compared with the hierarchical model.

The XML Schema

The schema is implemented in the example.xsd file. We will define the types top-down, starting from the web site type.

The web site schema

This covers the web site type, the web page type and the root element declaration.

The web site type

At the very least, a website is a collection of one or more pages. Later we will add constructs that allow us to define a website structure for navigation and connections between web pages. For now, just add the following to the example.xsd XML Schema file:

  <xs:complexType name="websiteType">
    <xs:sequence>
      <xs:element name="page" type="pageType" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>

The web page type

The only thing we require in a minimal web page is a unique identifier. This ID will be used to name the generated HTML file and to specify the target of a link to that web page. The generated HTML files require unique names because they are all placed in the root directory. This is a consequence of the flat (rather than hierarchical) website file system model we selected.

  <xs:complexType name="pageType">
    <xs:attribute name="id" type="xs:ID" use="required"/>
  </xs:complexType>

The web site element

Finally, we need to declare the root element in the XML file, the web site:

  <xs:element name="website" type="websiteType"/>
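
Put together, the three fragments above might sit in example.xsd roughly as follows. This is a minimal sketch rather than the actual article file: the xs:schema wrapper and the comments are assumptions, and no targetNamespace is declared because the instance document further down refers to the schema through xsi:noNamespaceSchemaLocation.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch of a complete example.xsd; the wrapper element is assumed -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Root element of the instance document -->
  <xs:element name="website" type="websiteType"/>

  <!-- A web site is a sequence of one or more pages -->
  <xs:complexType name="websiteType">
    <xs:sequence>
      <xs:element name="page" type="pageType" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>

  <!-- A page carries only a required, document-unique ID -->
  <xs:complexType name="pageType">
    <xs:attribute name="id" type="xs:ID" use="required"/>
  </xs:complexType>

</xs:schema>
```

Since minOccurs defaults to 1, the sequence already enforces the "one or more pages" rule without any extra attribute.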

The web site XML content

Now that we have a schema, we can add some content to the XML file, example.xml.

<website xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   xsi:noNamespaceSchemaLocation="example.xsd">
  <page id="index"/>
  <page id="about"/>
</website>

We just defined a website with two pages: an index page and an about page.
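
To see what the schema buys us, consider a hypothetical document (not part of the article files) that reuses a page ID. Because the id attribute is declared as xs:ID, its values must be unique within the document, so a validating parser should reject something like this:

```xml
<!-- Hypothetical invalid example: the second page reuses an ID -->
<website xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   xsi:noNamespaceSchemaLocation="example.xsd">
  <page id="index"/>
  <page id="index"/>  <!-- duplicate xs:ID value: validation error -->
</website>
```

This is the property the flat layout relies on: two pages can never map to the same HTML file name. Note that xs:ID values must also be valid XML names, so an ID cannot start with a digit or contain spaces.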

The XSLT processing rules

In the example.xsl XSLT stylesheet we need to define the output method for the processor and a page template that matches each page element in our XML website.

The output method

We're outputting HTML 4.01 web pages in the ISO-8859-1 encoding. XHTML is also an option, and this declaration is the only place we would have to change in order to switch. This is the nice thing about keeping your content in XML: it can easily be output in any desired web format.

  <xsl:output method="html" version="4.01" encoding="ISO-8859-1" indent="yes"
    doctype-public="-//W3C//DTD HTML 4.01//EN"
    doctype-system="http://www.w3.org/TR/html4/strict.dtd"/>

Note: We are indenting the output to make the generated HTML files easier to read and debug.
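
As an illustration of the XHTML option, an output declaration for XHTML 1.0 Strict could look roughly like the sketch below. It assumes an XSLT 2.0 processor, since the xhtml output method was introduced in XSLT 2.0, and it is not taken from the article files.

```xml
<!-- Sketch: XHTML 1.0 Strict instead of HTML 4.01 (requires XSLT 2.0) -->
<xsl:output method="xhtml" encoding="ISO-8859-1" indent="yes"
  doctype-public="-//W3C//DTD XHTML 1.0 Strict//EN"
  doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"/>
```

For the result to be proper XHTML, the literal html, head and body elements in the templates would also need to be in the XHTML namespace, for example by declaring xmlns="http://www.w3.org/1999/xhtml" on the stylesheet element.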

The page template

 
  <xsl:template match="page">
    <xsl:result-document href="{@id}.html">
      <html>
        <head/>
        <body/>
      </html>
    </xsl:result-document>
  </xsl:template>

This template matches each page element and redirects its output to an HTML file named after the page ID. Note that xsl:result-document is an XSLT 2.0 instruction, so an XSLT 2.0 processor is required.
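
For orientation, the output declaration and the page template could be combined into a complete example.xsl along these lines. This is a sketch under assumptions: the xsl:stylesheet wrapper with version="2.0" is not shown in the article, and the real file may contain more than this.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch of a complete example.xsl; the wrapper element is assumed -->
<xsl:stylesheet version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Serialization settings, also used by xsl:result-document below -->
  <xsl:output method="html" version="4.01" encoding="ISO-8859-1" indent="yes"
    doctype-public="-//W3C//DTD HTML 4.01//EN"
    doctype-system="http://www.w3.org/TR/html4/strict.dtd"/>

  <!-- One HTML file per page element, named after its id attribute -->
  <xsl:template match="page">
    <xsl:result-document href="{@id}.html">
      <html>
        <head/>
        <body/>
      </html>
    </xsl:result-document>
  </xsl:template>

</xsl:stylesheet>
```

No template for the website element is needed at this stage: the built-in template rules walk from the document root down to the page elements, where our template takes over. Because xsl:result-document has no format attribute here, it picks up the unnamed xsl:output declaration.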

Running the XSLT processor

Of course, we have an Ant target defined as the default just for that in the build.xml file. So all we have to do is type:

ant

at the command line. The end result should be two empty HTML pages:

<!DOCTYPE html
  PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
   <head>
      <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
   </head>
   <body></body>
</html>

generated in two files, index.html and about.html.

Adding a clean target to the build file

Now that we are creating files, we ought to be able to remove them as well. This should be done in a clean target of our Ant build file, build.xml:

  <target name="clean" description="Clean build.">
    <delete>
      <fileset dir="www" includes="*.html"/>
    </delete>
  </target>

Download: article files.

Read on: XML Schema and XSL templates for web articles.

First Posted: November 27th, 2005 - Sunday.
Last Updated: December 14th, 2005 - Wednesday.