Representing trees as text files

Jean-François Perrot

  1. Character Encoding
  2. Tree encoding
  3. Eclipse tools
  4. Visualizing the tree structure
  5. A word about DTDs (Document Type Definitions)

  1. Character Encoding

    Texts are made up of characters, not bytes!
    Usually the character encoding is not indicated in text files, but in XML the encoding is always given.
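
    A minimal sketch of the distinction, in Python (the choice of language and the sample character are assumptions of this illustration, not part of the course material). The same character occupies a different number of bytes depending on the encoding, and the encoding declared in an XML file tells the parser how to turn the bytes back into characters.

```python
import xml.etree.ElementTree as ET

# One character, several byte representations.
text = "é"                        # a single character (U+00E9)
utf8 = text.encode("utf-8")       # two bytes: b'\xc3\xa9'
latin1 = text.encode("latin-1")   # one byte:  b'\xe9'

# Decoding with the wrong encoding silently yields other characters.
misread = utf8.decode("latin-1")

# An XML document can declare its encoding in its first line,
# so the parser does not have to guess.
doc = '<?xml version="1.0" encoding="ISO-8859-1"?><name>é</name>'.encode("latin-1")
root = ET.fromstring(doc)

print(len(utf8), len(latin1), misread, root.text)
```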

  2. Tree encoding

    XML follows the tradition of markup languages [Wikipedia], as opposed to JSON and YAML.

    i.e. it uses a system of parentheses (tags) extended with attributes and textual content.
    See the example of the three levels for representing cars:
    1. Tags only
    2. Tags with attributes
    3. Tags with attributes and textual content
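
    The three levels can be sketched as follows. The miniature documents are hypothetical: the element and attribute names are borrowed from the Cars DTD of section 5, the values are invented, and the fragments are not meant to be valid against that DTD. The Python function `describe` is likewise only an illustration of what each level adds to the tree.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature documents, one per level (values are invented).
TAGS_ONLY = "<Car><Body><Hood/></Body></Car>"
WITH_ATTRIBUTES = '<Car make="Acme"><Body color="red"><Hood/></Body></Car>'
WITH_TEXT = '<Car make="Acme"><Body color="red"><Hood>steel</Hood></Body></Car>'

def describe(xml_text):
    """Return (tag, attributes, text) for every node, in document order."""
    root = ET.fromstring(xml_text)
    return [(node.tag, dict(node.attrib), node.text) for node in root.iter()]

for level in (TAGS_ONLY, WITH_ATTRIBUTES, WITH_TEXT):
    print(describe(level))
```

    In all three cases the tree has the same shape; only the information attached to the nodes grows.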

    Another example: different ways to represent a system of names and marks.

  3. Eclipse tools

    demo

  4. Visualizing the tree structure

    Nodes may be collapsed or expanded.
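
    What a tree viewer does can be imitated in a few lines. The sketch below is only an illustration: the function `outline`, its `collapsed` parameter and the sample document are assumptions, not a description of the Eclipse tools. A collapsed node is marked with `+` and its children are hidden.

```python
import xml.etree.ElementTree as ET

def outline(node, collapsed=(), depth=0):
    """Return the lines of an indented outline of the tree rooted at node.

    Tags listed in `collapsed` are marked '+' and their children are hidden,
    mimicking a collapsed node in a tree viewer.
    """
    hidden = node.tag in collapsed and len(node) > 0
    marker = "+" if hidden else "-"
    lines = ["  " * depth + marker + " " + node.tag]
    if not hidden:
        for child in node:
            lines.extend(outline(child, collapsed, depth + 1))
    return lines

root = ET.fromstring(
    "<Car><Body><Hood/></Body>"
    "<Engine><Cylinders/><Ignition/></Engine></Car>"
)
print("\n".join(outline(root)))                        # fully expanded
print("\n".join(outline(root, collapsed={"Engine"})))  # Engine collapsed
```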

  5. A word about DTDs (Document Type Definitions)

    1. A way of specifying tree structure

      • inherited from SGML, the common ancestor of all markup languages [Wikipedia].
      • superseded (see session #3)
      • but still widely used

      See [Wikipedia] for details.

      Note that a DTD is an integral part of the structure of the XML document,
      whereas XML schemas or Relax NG grammars are linked to the file by means of an ordinary attribute.
      This is due to inheritance from SGML, and adds quite substantially to the complexity of XML programming,
      as we shall see later (DOM).
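
      The difference shows up in the DOM tree. The sketch below uses Python's `xml.dom.minidom` as a stand-in for whatever DOM implementation is used later in the course. The file names `Car.dtd` and `Car.xsd` and the attribute values are hypothetical, and neither file needs to exist to run it. Only the XML Schema case is shown.

```python
from xml.dom import minidom

XSI = "http://www.w3.org/2001/XMLSchema-instance"

# Hypothetical documents: one refers to a DTD, the other to an XML schema.
WITH_DTD = '<!DOCTYPE Car SYSTEM "Car.dtd"><Car make="Acme" model="X"/>'
WITH_SCHEMA = (
    '<Car xmlns:xsi="' + XSI + '" '
    'xsi:noNamespaceSchemaLocation="Car.xsd" make="Acme" model="X"/>'
)

dtd_doc = minidom.parseString(WITH_DTD)
schema_doc = minidom.parseString(WITH_SCHEMA)

# The DTD reference is a node of its own, attached to the document ...
print(dtd_doc.doctype.name, dtd_doc.doctype.systemId)

# ... whereas the schema reference is one more attribute of the root element,
# and the document has no doctype node at all.
print(schema_doc.doctype)
print(schema_doc.documentElement.getAttributeNS(XSI, "noNamespaceSchemaLocation"))
```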

    2. Example: XML file, DTD file


      <?xml version="1.0" encoding="UTF-8"?>
      <!-- DTD for Cars -->
      <!ELEMENT Car (Body, Engine, Transmission)>
      <!ATTLIST Car make CDATA #REQUIRED>
      <!ATTLIST Car model CDATA #REQUIRED>

      <!ELEMENT Body (Hood)>
      <!ATTLIST Body color  CDATA #REQUIRED>
      <!ELEMENT Hood (#PCDATA)>

      <!ELEMENT Engine (Cylinders, Ignition)>
      <!ELEMENT Cylinders EMPTY>
      <!ELEMENT Ignition (#PCDATA)>

      <!ELEMENT Transmission (GearBox, FrontAxle, RearAxle)>
      <!ATTLIST Transmission type (automatic | manual) #REQUIRED>
      <!ATTLIST Transmission gear_nb (3 | 4 | 5) #REQUIRED>

      <!ELEMENT GearBox EMPTY>
      <!ELEMENT FrontAxle EMPTY>
      <!ELEMENT RearAxle EMPTY>


    3. Validation

      • by the W3C Validator http://validator.w3.org/
        (for a publicly accessible DTD file that is referenced from the XML file)

      • with Eclipse