Generating a Table of Contents from DocBook with XSLT
Demo in class on April 1st, 2015.
- Printout
- First step : <sect1> only
- 2nd step : the whole 3-level TOC
- HTML
- XML
- XSL-FO
-
We start with exactly the same goal as the one we achieved with XPath in XP-TOC:
write a stylesheet MakeTOC.xsl such that the command
$ xsltproc MakeTOC.xsl IntroDocBookX.xml
prints
1. Principle
    1.1. What is DocBook ?
        1.1.1. Standards
        1.1.2. References
    1.2. Main features
2. Stylesheets
    2.1. Availability
    2.2. Usage through customization
    2.3. Use a CSS to easily customize HTML output
    2.4. The basic technique is overriding imported rules
-
Intended execution
$ xsltproc makeTOC-1.xsl IntroDocBookX.xml
Table of Contents
1. Principle
2. Stylesheets
$
Stylesheet makeTOC-1.xsl
<?xml version='1.0' ?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0"
xmlns:bk="http://docbook.org/ns/docbook"
>
<xsl:output method="text"/>
<xsl:template match="/">
Table of Contents
<xsl:apply-templates select="//bk:sect1" />
</xsl:template>
<xsl:template match="bk:sect1">
<xsl:value-of select="bk:title/text()"
/><xsl:text>&#10;</xsl:text>
</xsl:template>
</xsl:stylesheet>
Explanatory notes (in addition to the main Introduction)
- We want a printout, i.e. pure text output, hence the choice of
<xsl:output method="text"/>
(as opposed to method="html" or method="xml", which we shall see later).
- Note the use of
<xsl:apply-templates select="...." />
The select attribute is necessary to prevent the default rules from applying
to the many nodes of our document that we do not want to take into account.
- The bizarre construct
<xsl:text>&#10;</xsl:text>
is indeed the best way to specify a line-feed !
(Remember, '\n' = ASCII #10...)
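To see what the select attribute saves us from, recall that the built-in rules of XSLT 1.0 recurse through every element and copy every text node to the output. The following variant is only a sketch, not one of the course files: it uses a bare <xsl:apply-templates /> and silences the built-in rule for text nodes instead. On a document where every <bk:sect1> is reachable from the root, it should print the same first-level TOC as makeTOC-1.xsl.

```xml
<?xml version='1.0' ?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0"
xmlns:bk="http://docbook.org/ns/docbook"
>
<xsl:output method="text"/>
<xsl:template match="/">
Table of Contents
<xsl:apply-templates />
</xsl:template>
<!-- No select above: the built-in element rule walks the whole tree.
     This empty rule overrides the built-in text rule, which would
     otherwise copy every text node of the document to the output. -->
<xsl:template match="text()" />
<xsl:template match="bk:sect1">
<xsl:value-of select="bk:title/text()"
/><xsl:text>&#10;</xsl:text>
</xsl:template>
</xsl:stylesheet>
```

Selecting //bk:sect1 explicitly, as the course stylesheet does, is shorter and states the intent directly.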
Note that the actual execution gives a result that is slightly
different from the expected one :
$ xsltproc makeTOC-1.xsl IntroDocBookX.xml

Table of Contents
1. Principle
2. Stylesheets
$
with an initial linefeed that was not asked for...
Howzat?
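A likely explanation, offered as a sketch rather than as the answer given in class: in the rule for "/", the text node that follows <xsl:template match="/"> begins with the line break preceding "Table of Contents". Since that node is not whitespace-only, it is copied to the output verbatim, line break included. Wrapping the heading in <xsl:text> should avoid this, because the whitespace-only text nodes that remain around it in the stylesheet are stripped.

```xml
<?xml version='1.0' ?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0"
xmlns:bk="http://docbook.org/ns/docbook"
>
<xsl:output method="text"/>
<xsl:template match="/">
<!-- The heading and its line-feed are now the only text emitted here;
     the line breaks around this element are whitespace-only nodes
     and are dropped by the XSLT processor. -->
<xsl:text>Table of Contents&#10;</xsl:text>
<xsl:apply-templates select="//bk:sect1" />
</xsl:template>
<xsl:template match="bk:sect1">
<xsl:value-of select="bk:title/text()"
/><xsl:text>&#10;</xsl:text>
</xsl:template>
</xsl:stylesheet>
```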
-
Stylesheet makeTOC-2.xsl
<?xml version='1.0' ?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0"
xmlns:bk="http://docbook.org/ns/docbook"
>
<xsl:output method="text"/>
<xsl:template match="/">
Table of Contents
<xsl:apply-templates select="//bk:sect1" />
</xsl:template>
<xsl:template match="bk:sect1">
<xsl:value-of select="bk:title/text()"
/><xsl:text>&#10;</xsl:text>
<xsl:apply-templates select="bk:sect2" />
</xsl:template>
<xsl:template match="bk:sect2">
<xsl:text>&#9;</xsl:text><xsl:value-of select="bk:title/text()"
/><xsl:text>&#10;</xsl:text>
<xsl:apply-templates select="bk:sect3" />
</xsl:template>
<xsl:template match="bk:sect3">
<xsl:text>&#9;</xsl:text><xsl:text>&#9;</xsl:text><xsl:value-of select="bk:title/text()"
/><xsl:text>&#10;</xsl:text>
</xsl:template>
</xsl:stylesheet>
Explanatory notes :
- Observe the difference between
<xsl:apply-templates select="//bk:sect1" />
in the first rule and
<xsl:apply-templates select="bk:sect2" />
in the 2nd one.
select="//bk:sect1" will get all <bk:sect1> nodes of the document, whereas
select="bk:sect2" will get only those <bk:sect2> nodes that are direct
children of the <bk:sect1> currently dealt with.
- The enigmatic
<xsl:text>&#9;</xsl:text>
means indentation !
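The difference between the two select expressions becomes visible when the XPath abbreviations are written out. In XPath 1.0, // is short for /descendant-or-self::node()/ and a bare name is short for a step on the child axis. The stylesheet below is a sketch limited to two levels, with the tab character (&#9;) assumed as the indentation; it should behave like the first two levels of makeTOC-2.xsl.

```xml
<?xml version='1.0' ?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0"
xmlns:bk="http://docbook.org/ns/docbook"
>
<xsl:output method="text"/>
<xsl:template match="/">
<!-- Unabbreviated form of select="//bk:sect1":
     start at the root and look at every level of the tree. -->
<xsl:apply-templates select="/descendant-or-self::node()/child::bk:sect1" />
</xsl:template>
<xsl:template match="bk:sect1">
<xsl:value-of select="bk:title/text()"
/><xsl:text>&#10;</xsl:text>
<!-- Unabbreviated form of select="bk:sect2":
     only the children of the current <bk:sect1>. -->
<xsl:apply-templates select="child::bk:sect2" />
</xsl:template>
<xsl:template match="bk:sect2">
<xsl:text>&#9;</xsl:text><xsl:value-of select="bk:title/text()"
/><xsl:text>&#10;</xsl:text>
</xsl:template>
</xsl:stylesheet>
```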
-
<xsl:output method="html" /> means generating HTML-4.
Here is a possible realization of our TOC as a web page : TOC.html
(take care to display the source code, and observe that the DocBook namespace
is propagated to the HTML file, though useless).
This is obtained by
$ xsltproc makeHTOC.xsl IntroDocBookX.xml > TOC.html
Stylesheet makeHTOC.xsl
<?xml version='1.0' ?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0"
                xmlns:bk="http://docbook.org/ns/docbook"
>
  <xsl:output method="html" indent="yes" />
  <xsl:template match="/">
    <html><head><title>Table of Contents</title></head>
      <body>
        <ul>
          <xsl:apply-templates select="//bk:sect1" />
        </ul>
      </body>
    </html>
  </xsl:template>
  <xsl:template match="bk:sect1">
    <li>
      <xsl:value-of select="bk:title/text()" />
      <ul>
        <xsl:apply-templates select="bk:sect2" />
      </ul>
    </li>
  </xsl:template>
  <xsl:template match="bk:sect2">
    <li>
      <xsl:value-of select="bk:title/text()" />
      <ul>
        <xsl:apply-templates select="bk:sect3" />
      </ul>
    </li>
  </xsl:template>
  <xsl:template match="bk:sect3">
    <li><xsl:value-of select="bk:title/text()" /></li>
  </xsl:template>
</xsl:stylesheet>
Notes :
- Observe how the calls to <xsl:apply-templates...> and to <xsl:value-of....>
are inserted into the HTML framework.
- Also note that those calls are exactly the same as in the previous stylesheet.
We are performing the same computation, but in a different context.
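The useless DocBook namespace declaration seen in the source of TOC.html comes from the stylesheet itself: literal result elements such as <html> carry along the namespaces in scope where they are written. XSLT 1.0 provides the exclude-result-prefixes attribute to suppress such declarations. The following is a minimal sketch, abridged to the first level and not one of the course files:

```xml
<?xml version='1.0' ?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0"
                xmlns:bk="http://docbook.org/ns/docbook"
                exclude-result-prefixes="bk"
>
  <!-- exclude-result-prefixes="bk" asks the processor not to copy
       the xmlns:bk declaration onto the literal result elements. -->
  <xsl:output method="html" indent="yes" />
  <xsl:template match="/">
    <html><head><title>Table of Contents</title></head>
      <body>
        <ul>
          <xsl:apply-templates select="//bk:sect1" />
        </ul>
      </body>
    </html>
  </xsl:template>
  <xsl:template match="bk:sect1">
    <li><xsl:value-of select="bk:title/text()" /></li>
  </xsl:template>
</xsl:stylesheet>
```

The bk prefix remains usable in select and match expressions; only its declaration in the output is dropped.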
-
Let us invent an XML format for our TOC, together with a namespace
"http://epita2015.fr/TOC".
See the file TOC.xml.
This is obtained by
$ xsltproc makeXTOC.xsl IntroDocBookX.xml > TOC.xml
Stylesheet makeXTOC.xsl
<?xml version='1.0' ?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0"
                xmlns:bk="http://docbook.org/ns/docbook"
                xmlns="http://epita2015.fr/TOC"
>
  <xsl:output method="xml" indent="yes" />
  <xsl:template match="/">
    <TOC>
      <xsl:apply-templates select="//bk:sect1" />
    </TOC>
  </xsl:template>
  <xsl:template match="bk:sect1">
    <section-1>
      <heading><xsl:value-of select="bk:title/text()" /></heading>
      <xsl:apply-templates select="bk:sect2" />
    </section-1>
  </xsl:template>
  <xsl:template match="bk:sect2">
    <section-2>
      <heading><xsl:value-of select="bk:title/text()" /></heading>
      <xsl:apply-templates select="bk:sect3" />
    </section-2>
  </xsl:template>
  <xsl:template match="bk:sect3">
    <section-3>
      <heading><xsl:value-of select="bk:title/text()" /></heading>
    </section-3>
  </xsl:template>
</xsl:stylesheet>
Notes :
- The same observation about the system of xsl calls applies as for HTML :
again, same computation in a different context.
- Note that the default namespace (here the TOC namespace) applies only to
the elements of the output document. The elements of the source document
must still be addressed through the bk: prefix.
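The remark on the default namespace can be made concrete. In XSLT 1.0, a default namespace declared on <xsl:stylesheet> applies to literal result elements such as <TOC>, but never to names inside select expressions or match patterns: there, an unprefixed name denotes an element in no namespace. The sketch below adds a deliberately wrong rule for illustration; the element name never-produced is hypothetical.

```xml
<?xml version='1.0' ?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0"
                xmlns:bk="http://docbook.org/ns/docbook"
                xmlns="http://epita2015.fr/TOC"
>
  <xsl:output method="xml" indent="yes" />
  <xsl:template match="/">
    <!-- <TOC> is a literal result element: it takes the default namespace. -->
    <TOC>
      <xsl:apply-templates select="//bk:sect1" />
    </TOC>
  </xsl:template>
  <!-- Wrong on purpose: "sect1" without prefix means a sect1 element in
       NO namespace, so this rule never fires on a DocBook 5 document. -->
  <xsl:template match="sect1">
    <never-produced/>
  </xsl:template>
  <!-- Correct: the prefix ties the name to the DocBook namespace. -->
  <xsl:template match="bk:sect1">
    <section-1>
      <heading><xsl:value-of select="bk:title/text()" /></heading>
    </section-1>
  </xsl:template>
</xsl:stylesheet>
```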
-
A PDF file will be produced by the fop script from Apache-FOP,
according to scheme #3 in fopUse.
DIRFOP=/Users/jfp/Documents/DirXML/XSLT/XSL-FO/fop-1.1/
$DIRFOP/fop -xml IntroDocBookX.xml -xsl makeFOTOC.xsl -pdf TOC.pdf
Stylesheet makeFOTOC.xsl
<?xml version='1.0' ?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"
                xmlns:bk="http://docbook.org/ns/docbook"
                xmlns:fo="http://www.w3.org/1999/XSL/Format"
>
  <xsl:output method="xml" indent="yes" />
  <xsl:variable name="indent" select="'1'" />
  <xsl:template match="/">
    <fo:root>
      <fo:layout-master-set>
        <fo:simple-page-master master-name="menu"
                               page-height="29.7cm"
                               page-width="21cm"
                               margin-top="1cm"
                               margin-bottom="1cm"
                               margin-left="1.5cm"
                               margin-right="0.5cm">
          <fo:region-body margin-top="1cm" margin-bottom="0cm" column-count="1"/>
          <fo:region-before extent="3cm"/>
          <fo:region-after extent="1.5cm"/>
        </fo:simple-page-master>
      </fo:layout-master-set>
      <fo:page-sequence master-reference="menu">
        <fo:flow flow-name="xsl-region-body">
          <fo:block text-align="left">
            <fo:block font-size="24pt" font-weight="bold"
                      space-before="12pt" space-after="12pt">
              <xsl:text>Table of Contents</xsl:text>
            </fo:block>
            <xsl:apply-templates select="//bk:sect1" />
          </fo:block>
        </fo:flow>
      </fo:page-sequence>
    </fo:root>
  </xsl:template>
  <xsl:template match="bk:sect1">
    <fo:block>
      <fo:block font-size="14pt" font-weight="bold" font-style="italic"
                space-before="24pt" space-after="6pt">
        <xsl:value-of select="bk:title/text()" />
      </fo:block>
      <fo:block text-indent="{concat($indent, 'cm')}">
        <xsl:apply-templates select="bk:sect2" />
      </fo:block>
    </fo:block>
  </xsl:template>
  <xsl:template match="bk:sect2">
    <fo:block>
      <fo:block font-size="14pt"
                space-before="12pt" space-after="6pt">
        <xsl:value-of select="bk:title/text()" />
      </fo:block>
      <fo:block text-indent="{concat(($indent+$indent), 'cm')}">
        <xsl:apply-templates select="bk:sect3" />
      </fo:block>
    </fo:block>
  </xsl:template>
  <xsl:template match="bk:sect3">
    <fo:block space-before="6pt" space-after="6pt">
      <xsl:value-of select="bk:title/text()" />
    </fo:block>
  </xsl:template>
</xsl:stylesheet>
Notes :
- Indentation is obtained by the text-indent attribute of <fo:block>,
which expects a numerical value with a length unit (e.g. text-indent="0.5in").
- On the notion of <xsl:variable>, see Variables.
Exercise : use a variable to facilitate changing the indentation in the text version (#1).
- On the bizarre syntax text-indent="{....}", see #3 in Generating XML.
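As a reminder of what the braces do: inside an attribute of a literal result element, {....} is an attribute value template, i.e. the XPath expression between the braces is evaluated and its string value becomes the attribute value. The sketch below writes the same attribute in both the short and the long form. It is a demonstration only: its output is a bare series of blocks, not a complete XSL-FO document that fop could render.

```xml
<?xml version='1.0' ?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"
                xmlns:bk="http://docbook.org/ns/docbook"
                xmlns:fo="http://www.w3.org/1999/XSL/Format"
>
  <xsl:output method="xml" indent="yes" />
  <xsl:variable name="indent" select="'1'" />
  <xsl:template match="/">
    <fo:block>
      <xsl:apply-templates select="//bk:sect1" />
    </fo:block>
  </xsl:template>
  <xsl:template match="bk:sect1">
    <!-- Short form: the braces are replaced by the value of
         concat($indent, 'cm'), giving text-indent="1cm". -->
    <fo:block text-indent="{concat($indent, 'cm')}">
      <xsl:value-of select="bk:title/text()" />
    </fo:block>
    <!-- Long form: the same attribute built with xsl:attribute,
         which must come before any child content of the element. -->
    <fo:block>
      <xsl:attribute name="text-indent">
        <xsl:value-of select="concat($indent, 'cm')" />
      </xsl:attribute>
      <xsl:value-of select="bk:title/text()" />
    </fo:block>
  </xsl:template>
</xsl:stylesheet>
```

In makeFOTOC.xsl, the expression $indent+$indent converts the string '1' to a number, so the second-level attribute evaluates to text-indent="2cm".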