Checking DocBook documents for grammatical correctness.

Jean-François Perrot

  1. Purpose
  2. Download the reference grammar
  3. Checking with xmllint
  4. Checking with jing
    1. Download jing
    2. Write a script to operate jing comfortably
    3. Executing "sh docjing.sh myFile.xml" will yield
    4. A word of wisdom...


  1. Purpose

  2. Download the reference grammar

    http://docs.oasis-open.org/docbook/rng/5.0/docbook.rng
    (RELAX NG in XML syntax; the compact syntax is not directly usable here, since xmllint does not read it).

    Save it as docbook.rng and store it somewhere: /My/.../Path/to/docbook.rng
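
    The download can also be scripted. The following is a minimal sketch, assuming that curl is installed; the function name fetch_grammar and its directory argument are illustrative and belong neither to DocBook nor to any tool mentioned here.

```shell
# Fetch the DocBook 5.0 RELAX NG grammar (XML syntax) into a directory.
# fetch_grammar is a hypothetical helper name; the destination directory is up to you.
GRAMMAR_URL=http://docs.oasis-open.org/docbook/rng/5.0/docbook.rng

fetch_grammar() {
    dest_dir=${1:-.}                      # default: current directory
    mkdir -p "$dest_dir" || return 1
    # -f: fail on HTTP errors, -sS: quiet but still report errors, -L: follow redirects
    curl -fsSL -o "$dest_dir/docbook.rng" "$GRAMMAR_URL"
}
```

    Calling fetch_grammar with the directory of your choice should leave docbook.rng there.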

  3. Checking with xmllint

    xmllint --noout --relaxng /My/.../Path/to/docbook.rng myFile.xml
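    prints "myFile.xml validates" if the document is valid; otherwise it prints the validation errors followed by "myFile.xml fails to validate".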
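
    In a script, the exit status is more convenient to test than the printed text. The following is a minimal sketch, assuming that xmllint returns a non-zero status for a document that does not validate; check_with_xmllint is a hypothetical helper name, and DocGram keeps the placeholder path used above.

```shell
# Validate one file with xmllint and report the outcome through the exit status.
# check_with_xmllint is a hypothetical helper; adjust DocGram to your own path.
DocGram=/My/.../Path/to/docbook.rng

check_with_xmllint() {
    # --noout suppresses the serialized document; only diagnostics are printed.
    if xmllint --noout --relaxng "$DocGram" "$1"; then
        echo "OK: $1"
    else
        echo "INVALID: $1"
        return 1
    fi
}
```

    For example, "check_with_xmllint myFile.xml && echo ready" prints "ready" only for a valid file.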

  4. Checking with jing

    1. Download jing

      http://code.google.com/p/jing-trang/
      and store jing.jar somewhere: /The/.../Path/to/jing-20091111/bin/jing.jar

    2. Write a script to operate jing comfortably

      - suppose you call it docjing.sh

      #!/bin/sh
      # Check a DocBook file against the RELAX NG grammar.
      JingJar=/The/.../Path/to/jing-20091111/bin/jing.jar
      DocGram=/My/.../Path/to/docbook.rng
      # Quote the variables so that paths containing spaces still work.
      java -jar "$JingJar" "$DocGram" "$1"


    3. Executing "sh docjing.sh myFile.xml" will yield

      • nothing if myFile.xml is indeed valid

      • a rather verbose but usable error message otherwise:

        jfp$ sh docjing.sh HelloDoc.xml
        /Users/jfp/Sites/EPITA/International/Site2015b/Session5/DocBook/HelloDoc.xml:23:34: error: text not allowed here; expected element "address", "anchor", "annotation", "bibliolist", "blockquote", "bridgehead", "calloutlist", "caution", "classsynopsis", "cmdsynopsis", "constraintdef", "constructorsynopsis", "destructorsynopsis", "epigraph", "equation", "example", "fieldsynopsis", "figure", "formalpara", "funcsynopsis", "glosslist", "important", "indexterm", "info", "informalequation", "informalexample", "informalfigure", "informaltable", "itemizedlist", "literallayout", "mediaobject", "methodsynopsis", "msgset", "note", "orderedlist", "para", "procedure", "productionset", "programlisting", "programlistingco", "qandaset", "remark", "revhistory", "screen", "screenco", "screenshot", "segmentedlist", "sidebar", "simpara", "simplelist", "synopsis", "table", "task", "tip", "variablelist" or "warning"
        /Users/jfp/Sites/EPITA/International/Site2015b/Session5/DocBook/HelloDoc.xml:23:44: error: element "caption" incomplete; expected element "address", "anchor", "annotation", "bibliolist", "blockquote", "bridgehead", "calloutlist", "caution", "classsynopsis", "cmdsynopsis", "constraintdef", "constructorsynopsis", "destructorsynopsis", "epigraph", "equation", "example", "fieldsynopsis", "figure", "formalpara", "funcsynopsis", "glosslist", "important", "indexterm", "info", "informalequation", "informalexample", "informalfigure", "informaltable", "itemizedlist", "literallayout", "mediaobject", "methodsynopsis", "msgset", "note", "orderedlist", "para", "procedure", "productionset", "programlisting", "programlistingco", "qandaset", "remark", "revhistory", "screen", "screenco", "screenshot", "segmentedlist", "sidebar", "simpara", "simplelist", "synopsis", "table", "task", "tip", "variablelist" or "warning"
        jfp$



        Note that line 23 of our file reads "<caption>This is the OASIS Logo</caption>".
        The two diagnoses, "text not allowed here" and "element "caption" incomplete", say the same thing:
        <caption> must contain one of the listed elements rather than bare text.
        A wrapper is therefore needed (as is very often the case), for instance <para>.
        Indeed, "<caption><para>This is the OASIS Logo</para></caption>" proves to be valid.
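
        When only the position and the kind of error matter, the long list of expected elements can be cut off. The following is a minimal sketch, assuming a POSIX sed; short_errors is a hypothetical helper name (not a jing option), and the pattern relies only on the "; expected" separator visible in the messages above.

```shell
# Keep "file:line:column: error: <diagnosis>" and drop the "; expected ..." tail.
# short_errors is a hypothetical helper name, not part of jing.
short_errors() {
    sed 's/; expected .*$//'
}
```

        Used as "sh docjing.sh HelloDoc.xml 2>&1 | short_errors", it would reduce the two messages above to their positions followed by "error: text not allowed here" and "error: element "caption" incomplete".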

    4. A word of wisdom...

      Do check your files at regular intervals while writing.
      Do not wait until you get a mass of unreadable error messages!
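
      Regular checking is easier when all files are checked at once. The following is a minimal sketch that relies on the observation above that a valid file yields no output; check_all and the Validator variable are hypothetical names, and the default command assumes that docjing.sh sits in the current directory.

```shell
# Validate every file given as argument and list those that produce messages.
# check_all and Validator are hypothetical names; the default is the script from step 2.
Validator=${Validator:-"sh docjing.sh"}

check_all() {
    failed=0
    for f in "$@"; do
        # A valid file yields no output, so any output marks a problem.
        if [ -n "$($Validator "$f" 2>&1)" ]; then
            echo "needs attention: $f"
            failed=$((failed + 1))
        fi
    done
    echo "$failed file(s) with errors out of $#"
    [ "$failed" -eq 0 ]
}
```

      A call such as "check_all *.xml" then names the files to look at, and each of them can be examined individually with docjing.sh.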