Defining the schema and templates for generating web articles
Last time we defined an XML Schema and a set of XSL templates to generate a web site. We implemented
generic, minimal web page support. This time we will specialize it
for a very common subclass of web pages: articles.
Note: We consider an article to be a web page that deals mostly with text content.
The article model
We will start by identifying the elements that an article has
and that we want to represent in XML: titles and article sections. Article
sections can in turn contain titles, content (paragraphs, lists, etc.)
and other sections.
The XML Schema
So we'll define a section element which can
contain a title, followed by content blocks and/or other
sections. For now, we will implement only one content block: a plain
text paragraph.
The section type
A section can have:
- A title
- One or more text paragraphs
- One or more sub-sections
All of these are optional. The intent is that at least a title
or a paragraph is present, although the type below does not enforce this, since every element in it has minOccurs="0".
<xs:complexType name="sectionType">
  <xs:sequence>
    <xs:element name="title" type="xs:string" minOccurs="0"/>
    <xs:element name="paragraph" type="xs:string" minOccurs="0"
                maxOccurs="unbounded"/>
    <xs:element name="section" type="sectionType" minOccurs="0"
                maxOccurs="unbounded"/>
  </xs:sequence>
</xs:complexType>
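The section type above makes every child optional, so an empty section would still validate even though we want at least a title or a paragraph. One possible way to express that rule in the schema itself is sketched here. The name strictSectionType is hypothetical and is not used anywhere else in this article, and the xs prefix is assumed to be bound to the usual XML Schema namespace.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Hypothetical stricter variant of sectionType (illustration only) -->
  <xs:complexType name="strictSectionType">
    <xs:sequence>
      <xs:choice>
        <!-- Either a title, optionally followed by paragraphs ... -->
        <xs:sequence>
          <xs:element name="title" type="xs:string"/>
          <xs:element name="paragraph" type="xs:string" minOccurs="0"
                      maxOccurs="unbounded"/>
        </xs:sequence>
        <!-- ... or no title, but then at least one paragraph -->
        <xs:element name="paragraph" type="xs:string"
                    maxOccurs="unbounded"/>
      </xs:choice>
      <!-- Sub-sections stay optional and follow the same rule recursively -->
      <xs:element name="section" type="strictSectionType" minOccurs="0"
                  maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
```

The two branches of the choice start with different elements (title or paragraph), so a validator can always tell which branch it is in. The price is a more verbose type, which is probably why a looser definition is good enough for a first version.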
The article type
We will define the article type as an
extension of the page type. The article itself is just
a sequence of sections, with at least one section required per article.
<xs:complexType name="articleType">
  <xs:complexContent>
    <xs:extension base="pageType">
      <xs:sequence>
        <xs:element name="section" type="sectionType" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:extension>
  </xs:complexContent>
</xs:complexType>
The website type
The website type, previously defined
to accept only pages, must now be changed to accept both pages and
articles, in any order.
<xs:complexType name="websiteType">
  <xs:sequence>
    <xs:choice maxOccurs="unbounded">
      <xs:element name="page" type="pageType"/>
      <xs:element name="article" type="articleType"/>
    </xs:choice>
  </xs:sequence>
</xs:complexType>
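To make "in any order" concrete, here is a small sketch of a document that the repeated choice would accept, with an article sitting between two pages. The root element name website, the id values and the placeholder page bodies are assumptions made for illustration; the real page content is whatever pageType from the previous article requires.

```xml
<!-- Sketch only: root element name and page bodies are assumed -->
<website>
  <page id="index">
    <!-- page content as defined in the previous article -->
  </page>
  <article id="first-article">
    <section>
      <title>An article between two pages</title>
    </section>
  </article>
  <page id="contact">
    <!-- page content as defined in the previous article -->
  </page>
</website>
```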
The Website XML content
Let's write some test content that conforms to the schema
defined above.
<article id="article">
  <section>
    <title>Section 1</title>
    <paragraph>Paragraph</paragraph>
    <section>
      <title>Sub-Section 1.1</title>
      <section>
        <title>Sub-Section 1.1.1</title>
        <paragraph>Text</paragraph>
      </section>
    </section>
  </section>
  <section>
    <paragraph>Section 2 Paragraph</paragraph>
  </section>
</article>
The XSLT processing rules
In the first stage we will ignore the recursive nature of the
embedded sections, obtaining a flat view of titles and paragraphs.
The article template
We will extend the page template to match both pages and
articles. The template body will stay pretty much the same, except
that we will apply templates inside the HTML body tag.
<xsl:template match="page | article">
  <xsl:result-document href="{@id}.html">
    <html>
      <head/>
      <body>
        <xsl:apply-templates/>
      </body>
    </html>
  </xsl:result-document>
</xsl:template>
The paragraph template
This rule matches each paragraph element and
wraps its content in HTML p tags.
<xsl:template match="paragraph">
  <p>
    <xsl:apply-templates/>
  </p>
</xsl:template>
The title template
This template matches each title element and
wraps its content in HTML heading tags. We just use the h1 tag for now;
in the future we will use the section level to decide which heading tag to
output.
<xsl:template match="title">
  <h1>
    <xsl:apply-templates/>
  </h1>
</xsl:template>
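As a preview of what "using the section level" could look like, here is one possible sketch. It is not necessarily the approach the follow-up article takes. It assumes an XSLT 2.0 processor (already needed for xsl:result-document) and counts the section ancestors of a title to pick the heading tag.

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Sketch only: pick h1..h6 from the nesting depth of the section -->
  <xsl:template match="section/title">
    <!-- 1 for a top-level section, 2 for a sub-section, and so on -->
    <xsl:variable name="depth" select="count(ancestor::section)"/>
    <!-- HTML only defines h1 to h6, so deeper levels are capped at h6 -->
    <xsl:element name="h{min(($depth, 6))}">
      <xsl:apply-templates/>
    </xsl:element>
  </xsl:template>
</xsl:stylesheet>
```

With the test content from this article, this would give h1 for "Section 1", h2 for "Sub-Section 1.1" and h3 for "Sub-Section 1.1.1". The match pattern is narrowed to section/title so that a title outside any section never produces an invalid h0 tag.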
The generated HTML
Nothing fancy at this stage, just the titles and text
paragraphs.
<!DOCTYPE html
PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
</head>
<body>
<h1>Section 1</h1>
<p>Paragraph</p>
<h1>Sub-Section 1.1</h1>
<h1>Sub-Section 1.1.1</h1>
<p>Text</p>
<p>Section 2 Paragraph</p>
</body>
</html>
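The DOCTYPE and the meta tag in this output are not written by any of the templates shown here. They are what an XSLT 2.0 processor would typically add when serializing with the html output method. The declaration below is a guess at an output definition that would produce them, not a copy of the stylesheet from the previous article.

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Assumed unnamed output definition; xsl:result-document falls back
       to it when no format attribute is given -->
  <xsl:output method="html"
              encoding="ISO-8859-1"
              doctype-public="-//W3C//DTD HTML 4.01//EN"
              doctype-system="http://www.w3.org/TR/html4/strict.dtd"/>
</xsl:stylesheet>
```

With the html method, the serializer adds the Content-Type meta element to an existing head element by default, which is why the empty head in the page and article template is enough.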
Download: article files.
Read on: Processing the titles of recursive page sections.
First Posted: December 14th, 2005 - Wednesday. Last Updated: December 15th, 2005 - Thursday.