Generating a Table of Contents from DocBook

with XSLT

Demo in class on April 1st, 2015.

  1. Printout
    1. First step : <sect1> only
    2. 2nd step : the whole 3 level TOC

  2. HTML

  3. XML

  4. XSL-FO



  1. Printout

    We start with exactly the same goal as the one we reached with XPath in XP-TOC :
    write a stylesheet MakeTOC.xsl such that

    $ xsltproc MakeTOC.xsl IntroDocBookX.xml
    1. Principle
        1.1. What is DocBook ?
            1.1.1. Standards
            1.1.2. References
        1.2. Main features
    2. Stylesheets
        2.1. Availability
        2.2. Usage through customization
        2.3. Use a CSS to easily customize HTML output
        2.4. The basic technique is overriding imported rules


    1. First step : <sect1> only

      Intended execution

      $ xsltproc makeTOC-1.xsl IntroDocBookX.xml
      Table of Contents
      1. Principle
      2. Stylesheets
      $



      Stylesheet makeTOC-1.xsl
      <?xml version='1.0' ?>
      <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"
        xmlns:bk="http://docbook.org/ns/docbook"
        >

      <xsl:output method="text"/>

      <xsl:template match="/">
      Table of Contents
      <xsl:apply-templates select="//bk:sect1" />
      </xsl:template>

      <xsl:template match="bk:sect1">
          <xsl:value-of select="bk:title/text()" /><xsl:text>&#10;</xsl:text>
      </xsl:template>

      </xsl:stylesheet>



      Explanatory notes (in addition to the main Introduction)

      • We want a printout, i.e. producing pure text, hence the choice of
        <xsl:output method="text"/>
        (as opposed to method="html" or method="xml" that we shall see later).

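      • Note the select attribute in <xsl:apply-templates select="...." /> :
        it restricts processing to the selected nodes. Without it, all the children of the current node
        would be processed, and the default rules would copy to the output the text of the many nodes
        of our document that we don't wish to take into account.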

      • The bizarre construct <xsl:text>&#10;</xsl:text> is indeed the best way to specify a line-feed !
        (Remember, '\n' = ASCII #10...)
        Whitespace-only text in a stylesheet is dropped unless it is wrapped in <xsl:text>.

      • Note that the actual execution gives a result that is slightly different from the expected one :

        $ xsltproc makeTOC-1.xsl IntroDocBookX.xml

        Table of Contents
        1. Principle
        2. Stylesheets
        $


        with an initial line-feed that was not asked for... Howzat?
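
      A possible answer, sketched as a hypothetical variant makeTOC-1b.xsl (the file name is ours, not part of the course material) :
      the text typed directly inside the first rule is one single text node, and it begins with the line break
      that follows <xsl:template match="/">. Since that node is not whitespace-only, it is copied as is.
      Wrapping the heading in <xsl:text> should remove the unwanted line-feed :

      ```xml
      <?xml version='1.0' ?>
      <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"
        xmlns:bk="http://docbook.org/ns/docbook"
        >

      <xsl:output method="text"/>

      <xsl:template match="/">
          <!-- xsl:text emits exactly its content ; the whitespace-only
               text nodes around it are stripped from the stylesheet -->
          <xsl:text>Table of Contents&#10;</xsl:text>
          <xsl:apply-templates select="//bk:sect1" />
      </xsl:template>

      <xsl:template match="bk:sect1">
          <xsl:value-of select="bk:title/text()" /><xsl:text>&#10;</xsl:text>
      </xsl:template>

      </xsl:stylesheet>
      ```

      It is run like the original : xsltproc makeTOC-1b.xsl IntroDocBookX.xml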

    2. 2nd step : the whole 3 level TOC

      Stylesheet makeTOC-2.xsl
      <?xml version='1.0' ?>
      <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"
        xmlns:bk="http://docbook.org/ns/docbook"
        >

      <xsl:output method="text"/>

      <xsl:template match="/">
      Table of Contents
      <xsl:apply-templates select="//bk:sect1" />
      </xsl:template>

      <xsl:template match="bk:sect1">
          <xsl:value-of select="bk:title/text()" /><xsl:text>&#10;</xsl:text>
          <xsl:apply-templates select="bk:sect2" />
      </xsl:template>

      <xsl:template match="bk:sect2">
          <xsl:text>   </xsl:text><xsl:value-of select="bk:title/text()" /><xsl:text>&#10;</xsl:text>
          <xsl:apply-templates select="bk:sect3" />
      </xsl:template>

      <xsl:template match="bk:sect3">
          <xsl:text>   </xsl:text><xsl:text>   </xsl:text><xsl:value-of select="bk:title/text()" /><xsl:text>&#10;</xsl:text>
      </xsl:template>

      </xsl:stylesheet>



      Explanatory notes :

      • Observe the difference between <xsl:apply-templates select="//bk:sect1" /> in the first rule
        and <xsl:apply-templates select="bk:sect2" /> in the 2nd one.
        • select="//bk:sect1" is an absolute path : it gets all <bk:sect1> nodes of the document, whereas
        • select="bk:sect2" is relative to the current node : it gets only those <bk:sect2> nodes that are direct children of the <bk:sect1> currently dealt with.

      • The enigmatic <xsl:text>   </xsl:text> means indentation : three spaces per TOC level !
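
      The stylesheet above prints the titles only. If the section numbers shown in the target printout
      are to be produced by the stylesheet itself, <xsl:number> can compute them. What follows is a minimal sketch,
      assuming that the <bk:sect1> elements are siblings (as in a DocBook article) ; the file name makeTOC-3.xsl
      and the choice of one shared rule for the three levels are ours, not the course's :

      ```xml
      <?xml version='1.0' ?>
      <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"
        xmlns:bk="http://docbook.org/ns/docbook"
        >

      <xsl:output method="text"/>

      <xsl:template match="/">
          <xsl:text>Table of Contents&#10;</xsl:text>
          <xsl:apply-templates select="//bk:sect1" />
      </xsl:template>

      <!-- One rule for the three levels -->
      <xsl:template match="bk:sect1 | bk:sect2 | bk:sect3">
          <!-- Indentation : four spaces per enclosing section -->
          <xsl:for-each select="ancestor::bk:sect1 | ancestor::bk:sect2">
              <xsl:text>    </xsl:text>
          </xsl:for-each>
          <!-- Hierarchical number : 1. then 1.1. then 1.1.1. -->
          <xsl:number level="multiple" count="bk:sect1 | bk:sect2 | bk:sect3" format="1.1. " />
          <xsl:value-of select="bk:title/text()" />
          <xsl:text>&#10;</xsl:text>
          <!-- Go down one level ; a sect3 has no sect2 or sect3 child, so the recursion stops there -->
          <xsl:apply-templates select="bk:sect2 | bk:sect3" />
      </xsl:template>

      </xsl:stylesheet>
      ```

      With level="multiple", each number is the position of the section among its siblings,
      prefixed by the numbers of its enclosing sections ; the format string supplies the dots and the trailing space.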

  2. HTML

    <xsl:output method="html"/> means generating HTML 4.
    Here is a possible realization of our TOC as a web page : TOC.html
    (be sure to display the source code :
    observe that the DocBook namespace declaration is propagated to the HTML file, where it is useless)

    This is obtained by
    $ xsltproc makeHTOC.xsl IntroDocBookX.xml > TOC.html

    Stylesheet makeHTOC.xsl

    <?xml version='1.0' ?>
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"
      xmlns:bk="http://docbook.org/ns/docbook"
      >

    <xsl:output method="html" indent="yes" />

    <xsl:template match="/">
    <html><head><title>Table of Contents</title></head>
    <body>
        <ul>
            <xsl:apply-templates select="//bk:sect1" />
        </ul>
    </body>
    </html>
    </xsl:template>

    <xsl:template match="bk:sect1">
        <li>
            <xsl:value-of select="bk:title/text()" />
            <ul>
                <xsl:apply-templates select="bk:sect2" />
            </ul>
        </li>
    </xsl:template>

    <xsl:template match="bk:sect2">
        <li>
            <xsl:value-of select="bk:title/text()" />
            <ul>
                <xsl:apply-templates select="bk:sect3" />
            </ul>
        </li>
    </xsl:template>

    <xsl:template match="bk:sect3">
        <li><xsl:value-of select="bk:title/text()" /></li>
    </xsl:template>

    </xsl:stylesheet>



    Notes :
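
    The xmlns:bk declaration ends up in TOC.html because literal result elements carry the namespace
    declarations of the stylesheet. XSLT 1.0 provides the exclude-result-prefixes attribute to suppress it.
    Here is a sketch of a variant, under the hypothetical name makeHTOC-2.xsl (not part of the course material).
    It also skips the nested <ul> when a section has no subsection, since HTML 4 does not allow an empty list ;
    this second change is optional and merges the sect1 and sect2 rules for brevity :

    ```xml
    <?xml version='1.0' ?>
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"
      xmlns:bk="http://docbook.org/ns/docbook"
      exclude-result-prefixes="bk"
      >

    <xsl:output method="html" indent="yes" />

    <xsl:template match="/">
    <html><head><title>Table of Contents</title></head>
    <body>
        <ul>
            <xsl:apply-templates select="//bk:sect1" />
        </ul>
    </body>
    </html>
    </xsl:template>

    <!-- sect1 and sect2 share one rule : title, then an optional nested list -->
    <xsl:template match="bk:sect1 | bk:sect2">
        <li>
            <xsl:value-of select="bk:title/text()" />
            <!-- Emit <ul> only if there is at least one subsection -->
            <xsl:if test="bk:sect2 | bk:sect3">
                <ul>
                    <xsl:apply-templates select="bk:sect2 | bk:sect3" />
                </ul>
            </xsl:if>
        </li>
    </xsl:template>

    <xsl:template match="bk:sect3">
        <li><xsl:value-of select="bk:title/text()" /></li>
    </xsl:template>

    </xsl:stylesheet>
    ```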


  3. XML

    Let us invent an XML format for our TOC, together with a namespace "http://epita2015.fr/TOC".
    See the file TOC.xml.

    This is obtained by
    $ xsltproc makeXTOC.xsl IntroDocBookX.xml > TOC.xml

    Stylesheet makeXTOC.xsl

    <?xml version='1.0' ?>
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"
      xmlns:bk="http://docbook.org/ns/docbook"
      xmlns="http://epita2015.fr/TOC"
      >

    <xsl:output method="xml" indent="yes" />

    <xsl:template match="/">
    <TOC>
            <xsl:apply-templates select="//bk:sect1" />
    </TOC>
    </xsl:template>

    <xsl:template match="bk:sect1">
        <section-1>
            <heading><xsl:value-of select="bk:title/text()" /></heading>
            <xsl:apply-templates select="bk:sect2" />
        </section-1>
    </xsl:template>

    <xsl:template match="bk:sect2">
        <section-2>
            <heading><xsl:value-of select="bk:title/text()" /></heading>
            <xsl:apply-templates select="bk:sect3" />
        </section-2>
    </xsl:template>

    <xsl:template match="bk:sect3">
        <section-3>
            <heading><xsl:value-of select="bk:title/text()" /></heading>
        </section-3>
    </xsl:template>

    </xsl:stylesheet>



    Notes :
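
    To make the invented format concrete, here is the shape that TOC.xml can be expected to have,
    assuming the source document carries exactly the titles of the printout shown in section 1.
    This is a sketch derived from the stylesheet, not a copy of the actual file : the XML declaration
    and the indentation depend on the processor. Note that the default namespace is declared once on <TOC>,
    and that the unused xmlns:bk declaration is carried along as well :

    ```xml
    <?xml version="1.0"?>
    <TOC xmlns="http://epita2015.fr/TOC" xmlns:bk="http://docbook.org/ns/docbook">
      <section-1>
        <heading>Principle</heading>
        <section-2>
          <heading>What is DocBook ?</heading>
          <section-3>
            <heading>Standards</heading>
          </section-3>
          <section-3>
            <heading>References</heading>
          </section-3>
        </section-2>
        <section-2>
          <heading>Main features</heading>
        </section-2>
      </section-1>
      <section-1>
        <heading>Stylesheets</heading>
        <section-2>
          <heading>Availability</heading>
        </section-2>
        <section-2>
          <heading>Usage through customization</heading>
        </section-2>
        <section-2>
          <heading>Use a CSS to easily customize HTML output</heading>
        </section-2>
        <section-2>
          <heading>The basic technique is overriding imported rules</heading>
        </section-2>
      </section-1>
    </TOC>
    ```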

  4. XSL-FO

    A PDF file will be produced by the fop script from Apache-FOP, according to scheme #3 in fopUse.

    DIRFOP=/Users/jfp/Documents/DirXML/XSLT/XSL-FO/fop-1.1/
    $DIRFOP/fop -xml IntroDocBookX.xml -xsl makeFOTOC.xsl -pdf TOC.pdf


    Stylesheet makeFOTOC.xsl

    <?xml version='1.0' ?>
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"
      xmlns:bk="http://docbook.org/ns/docbook"
      xmlns:fo="http://www.w3.org/1999/XSL/Format"
      >

    <xsl:output method="xml" indent="yes" />

    <xsl:variable name="indent" select="'1'" />

    <xsl:template match="/">
    <fo:root>
      <fo:layout-master-set>
         <fo:simple-page-master master-name="menu"
                      page-height="29.7cm"
                      page-width="21cm"
                      margin-top="1cm"
                      margin-bottom="1cm"
                      margin-left="1.5cm"
                      margin-right="0.5cm">
          <fo:region-body margin-top="1cm" margin-bottom="0cm" column-count="1"/>
          <fo:region-before extent="3cm"/>
          <fo:region-after extent="1.5cm"/>
        </fo:simple-page-master>
      </fo:layout-master-set>
     
      <fo:page-sequence master-reference="menu">
        <fo:flow flow-name="xsl-region-body">
       
           <fo:block text-align="left">
               <fo:block font-size="24pt" font-weight="bold"
                         space-before="12pt" space-after="12pt">
              <xsl:text>Table of Contents</xsl:text>
               </fo:block>
               <xsl:apply-templates select="//bk:sect1" />

           </fo:block>
       
        </fo:flow>
      </fo:page-sequence>
    </fo:root>

    </xsl:template>


    <xsl:template match="bk:sect1">
     <fo:block>
         <fo:block font-size="14pt" font-weight="bold"  font-style="italic"
            space-before="24pt" space-after="6pt">
            <xsl:value-of select="bk:title/text()" />
         </fo:block>
         <fo:block text-indent="{concat($indent, 'cm')}">
            <xsl:apply-templates select="bk:sect2" />
         </fo:block>
     </fo:block>
    </xsl:template>

    <xsl:template match="bk:sect2">
    <fo:block>
         <fo:block font-size="14pt"
            space-before="12pt" space-after="6pt">
            <xsl:value-of select="bk:title/text()" />
         </fo:block>
        
         <fo:block text-indent="{concat(($indent+$indent), 'cm')}">
            <xsl:apply-templates select="bk:sect3" />
        </fo:block>
    </fo:block>
    </xsl:template>

    <xsl:template match="bk:sect3">
        <fo:block space-before="6pt" space-after="6pt">
            <xsl:value-of select="bk:title/text()" />
         </fo:block>
    </xsl:template>

    </xsl:stylesheet>



    Notes :
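
    The braces in text-indent="{concat($indent, 'cm')}" form an attribute value template :
    the XPath expression between { and } is evaluated, and its result replaces it in the attribute.
    Since literal text may surround the braces, the unit can be written outside the expression.
    Here is a sketch of an equivalent <bk:sect2> rule, meant as a drop-in replacement inside makeFOTOC.xsl
    (it is a fragment : it relies on the $indent variable and on the namespace declarations of that stylesheet) :

    ```xml
    <xsl:template match="bk:sect2">
    <fo:block>
         <fo:block font-size="14pt"
            space-before="12pt" space-after="6pt">
            <xsl:value-of select="bk:title/text()" />
         </fo:block>

         <!-- {2 * $indent} is evaluated as a number ; "cm" is literal text -->
         <fo:block text-indent="{2 * $indent}cm">
            <xsl:apply-templates select="bk:sect3" />
         </fo:block>
    </fo:block>
    </xsl:template>
    ```

    The multiplication converts the string '1' held by $indent into a number, exactly as $indent+$indent does in the original rule.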