








The purpose of DSSSL is to standardize the specification of transformations of SGML documents. It is composed of two parts: tree transformation specifications and style transformation specifications.
The figure shows the high-level structure of a DSSSL transformation.
Figure. DSSSL Transformation Overview
The tree transformation process takes one or more SGML documents as input and produces one or more SGML documents as output. The details of the tree transformation process are beyond the scope of this article.
The style transformation process takes one or more SGML documents as input and produces one or more formatted documents as output. Both the tree transformation process and the style transformation process can be used alone or with the other. This article will only describe the style transformation part of DSSSL.
The DSSSL expression language is the underlying language in which all other parts of DSSSL specifications are expressed. The DSSSL expression language was inspired by the Scheme programming language invented by Guy Lewis Steele, Jr. and Gerald Jay Sussman and originally described in 1975.
While the DSSSL expression language was inspired by Scheme, it is not fully syntactically compatible with Scheme. It uses only a side-effect-free subset of the full Scheme programming language, and it adds features that are not part of Scheme, such as keyword arguments, quantities, and several syntactic forms.
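As a rough illustration of those additions, the fragment below sketches a procedure that takes keyword arguments and computes with quantities. The names scaled-size, base, factor and heading-size are made up for this example and do not come from the DSSSL standard; only the #!key marker, the trailing-colon keywords and the pt unit suffix are features of the expression language.

```dsssl
;; Hypothetical helper: scaled-size, base and factor are illustrative names.
;; #!key introduces keyword arguments, each with a default value.
(define (scaled-size #!key (base 10pt) (factor 1.2))
  ;; base is a quantity (a length); multiplying it by a number gives a length.
  (* base factor))

;; Keyword arguments are passed by name, in any order.
(define heading-size
  (scaled-size factor: 1.5 base: 12pt))
```

Neither the #!key marker nor a length literal such as 12pt is part of standard Scheme, which is one reason a DSSSL specification cannot always be run unchanged in a Scheme interpreter.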
The current DSSSL expression language is not (as I shall expand on below) critical to the DSSSL language as a whole; several other languages would suit as well and may eventually end up replacing the current Scheme inspired one.
The term 'grove' is an acronym for Graph Representation Of property ValuEs. A grove is a representation of a parsed SGML document that contains all of the information required to process it. The DSSSL standard defines a grove as follows (p. 76):
"A grove is a set of nodes constructed according to a grove plan. Every node in the grove belongs to a named class in the grove plan. A node is a set of property assignments, each consisting of a property name and a property value."
Groves are constructed according to a grove plan, which is a subset of the complete set of properties associated with an SGML document. For example, the SGML property set defines a collection of node classes and property values that capture the essence of an SGML document without including irrelevant information, such as the exact whitespace characters present in start- or end-tags, in tokenized attribute values, or between elements in element content. The grove plan required for one type of application, say an SGML editor, might differ from the grove plan required to create formatted output from an SGML document.
Henry Thompson of the University of Edinburgh created a picture of a part of the grove produced by the following SGML document:
<!doctype simp [
<!element simp o o (bit*)>
<!element bit - - (#PCDATA)>
<!attlist bit name id #required>
]>
<bit name=one>1</bit>
<bit name=two>2</bit>
Figure. A picture of a small grove. Copyright © 1997 Henry Thompson. License to reproduce for any educational purpose is hereby granted, provided only that this copyright notice is reproduced in full as a part of any reproduction.
Groves are important in DSSSL because they are the logical starting point for both DSSSL tree and style transformations.
Standard Document Query Language (SDQL) is an API expressed in the DSSSL expression language for accessing parts of groves. It adds two new types to those provided by the basic expression language: node-list and named-node-list. A node-list contains zero or more nodes from a grove. A named-node-list is a node-list each of whose members may be accessed using a unique name. There is no separate node type; an individual node is represented as a node-list containing a single member.
SDQL defines a number of standard procedures that can be used to access nodes in a grove:
· (current-node) returns a singleton node-list containing the node that is currently being processed,
· (children nl) returns a node-list consisting of all of the children of the nodes in nl (e.g., for an element with element content all of the nodes for the elements that make up its content, for an element with #PCDATA content a node-list containing a node for each data character in its content, etc.),
· (data nl) returns a string that is formed by recursively concatenating all of the characters in the data content of each member of nl,
· (gi nl) returns the GI (generic identifier, i.e., the element type name) of the element represented by nl (or #f if nl does not represent an element),
· (child-number nl) returns the child number of the supplied node where child number is one plus the number of elements with the same GI as the node that are siblings of the node and precede it, etc.
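To show how these procedures fit together, here is a minimal sketch of an element construction rule (construction rules and flow objects are described later in this article) for the bit elements of the sample document above. It is only an illustration under that assumption, not a rule taken from any existing style sheet: it builds one string from the element's GI, its child number and its data content, and emits that string as a paragraph.

```dsssl
;; Sketch only: applies to the bit elements of the sample document above.
(element bit
  (make paragraph
    (literal
      (string-append
        (gi (current-node))                             ; element type name
        " "
        (number->string (child-number (current-node)))  ; position among bit siblings
        ": "
        (data (current-node))))))                       ; character data of the element
```

Each call receives the singleton node-list returned by (current-node), which is how an individual node is passed around when there is no separate node type.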
The style transformation part of DSSSL takes one or more SGML documents as input and produces a flow object tree as output. A list of flow object classes is provided in the DSSSL standard (p. 198).
Flow objects correspond to composition- or output-oriented objects, such as pages, paragraphs, tables, equations, characters, hyperlinks, etc. A DSSSL style specification says how objects in the source document should be transformed by the DSSSL style engine into flow objects.
DSSSL doesn't actually specify how these flow object classes should be rendered for any particular output format, such as RTF. In Jade the transformation of the flow objects to actual output languages is handled by something called a backend. Jade currently provides backends for RTF, TeX, and SGML (which relies on non-standard flow object classes).
Flow objects may have characteristics. For example, a paragraph has a font family name, font size, first line start indent, leading, etc. Each flow object class has a set of characteristics that can be specified for flow objects of its class.
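Characteristics are written as keyword arguments when a flow object is created. The sketch below assumes a source element named para, which is hypothetical, and uses arbitrary values; the characteristic names are those of the paragraph flow object class, with line-spacing: playing the role of leading.

```dsssl
;; Sketch only: para is a hypothetical element name and the values are arbitrary.
(element para
  (make paragraph
    font-family-name: "Times New Roman"  ; font family name
    font-size: 12pt                      ; font size
    first-line-start-indent: 0.25in      ; indent of the first line
    line-spacing: 14pt                   ; leading
    (process-children)))                 ; format the element's content inside the paragraph
```

Characteristics that are left out keep their inherited or default values, so a rule only needs to name the ones it wants to change.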
As we saw in the example at the beginning of this article, a DSSSL style specification may have construction rules that specify how to map items in the source document to flow objects in the output. There are several types of construction rules that apply to different types of nodes in the source document:
A root construction rule applies to the root of the document (one level above the document level element). A root construction rule can be used to produce a flow object that encompasses all of the document regardless of the element type of the document level element.
An element construction rule applies to a named element type. The rule can just name an element or it can specify the context of the element (e.g., (element (chapter title)) applies to title elements when their parent is chapter).
A query construction rule allows you to specify an SDQL expression as a condition. The rule applies to any node that satisfies the SDQL query. Note that Jade does not currently support this feature.
A default construction rule applies to any node in the grove that is not matched by a more specific rule, such as an element or query construction rule. The following default construction rule is automatically supplied if no actual default construction rule is specified:
(default (process-children))
This provides the expected behaviour when processing a document without requiring the user to specify construction rules for all items in the source document.
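The rule types described above can be sketched together in one small style specification. This is an illustrative fragment, not a complete style sheet: the chapter and title element names come from the example earlier in this section, the page dimensions and font sizes are arbitrary, and no query construction rule is shown because Jade does not support it.

```dsssl
;; Root construction rule: wraps the whole document in a page sequence,
;; whatever the document level element happens to be.
(root
  (make simple-page-sequence
    page-width: 8.5in
    page-height: 11in
    left-margin: 1in
    right-margin: 1in
    (process-children)))

;; Element construction rule: applies to every title element.
(element title
  (make paragraph
    font-size: 14pt
    (process-children)))

;; Element construction rule with context: applies only to a title
;; whose parent is a chapter, and takes precedence over the rule above.
(element (chapter title)
  (make paragraph
    font-size: 18pt
    font-weight: 'bold
    (process-children)))

;; Default construction rule: written out explicitly here, although the
;; same rule is supplied automatically when none is specified.
(default (process-children))
```

With these rules, a title inside a chapter is formatted by the contextual rule, any other title by the plain element rule, and every remaining node simply has its children processed.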
