1 SXML
SXML is a representation of XML elements using unique s-expressions. The following grammar describes the structure of SXML:
top               = (*TOP* maybe-annotations PI ... comment ... element)
element           = (name maybe-annot-attributes child ...)
annot-attributes  = (@ attribute ... maybe-annotations)
attribute         = (name maybe-value maybe-annotations)
child             = element
                  | character-data-string
                  | PI
                  | comment
                  | entity
PI                = (*PI* pi-target maybe-annotations processing-instruction-content-string)
comment           = (*COMMENT* comment-string)
entity            = (*ENTITY* public-id-string system-id-string)
name              = local-name
                  | exp-name
local-name        = symbol conforming to XML Namespace recommendation
exp-name          = symbol of the form namespace-id:local-name
namespace-id      = URI-symbol
                  | user-ns-shortcut-symbol
namespaces        = (*NAMESPACES* namespace-assoc ...)
namespace-assoc   = (namespace-id uri-string maybe-original-prefix)
annotations       = (@ maybe-namespaces annotation ...)
annotation        = not yet specified
Some tools, such as SXPath, use the following coarse approximation of SXML structure for simplicity:
node       = (name . node-list)
           | string
node-list  = (node ...)
name       = local-name
           | exp-name
           | @
           | *TOP*
           | *PI*
           | *COMMENT*
           | *ENTITY*
           | *NAMESPACES*
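The coarse grammar translates almost directly into a recursive predicate. The sketch below is a hand-written illustration in plain Racket. The name coarse-node? is hypothetical and not part of the sxml library, and the check does not verify that tag names are valid XML names.

```racket
#lang racket/base

;; coarse-node? : any/c -> boolean?
;; A node is either a string, or a pair whose car is a symbol (the name)
;; and whose cdr is a proper list of nodes.
(define (coarse-node? v)
  (cond
    [(string? v) #t]
    [(and (pair? v) (symbol? (car v)) (list? (cdr v)))
     (for/and ([child (in-list (cdr v))])
       (coarse-node? child))]
    [else #f]))

(coarse-node? '(abc "def" (ghi) "jkl"))                               ; => #t
(coarse-node? '(customer (@ (specialness "gazonga")) "Barry White"))  ; => #t
(coarse-node? '(abc 42))                                              ; => #f
```

Note that under this approximation the attribute list (@ ...) is just another node, which is what makes the coarse form convenient for generic tree walkers.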
In short, an XML element is represented as a list consisting of its tag name as a symbol followed by its child nodes. If the XML element has attributes, they come immediately after the tag symbol, in a list tagged by an @ symbol.
For example, the XML element
<abc>def<ghi />jkl</abc>
is represented by the SXML datum
'(abc "def" (ghi) "jkl")
and the XML element
<customer specialness="gazonga">Barry White</customer>
is represented by the SXML datum
'(customer (@ (specialness "gazonga")) "Barry White")
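Because an SXML element is an ordinary list, its parts can be picked out with plain list operations. The following is a minimal sketch, assuming only the layout described above (tag first, optional @ list second, children after). The helpers attrs-of and children-of are hypothetical names used for illustration; the library's own accessors are documented in the function reference.

```racket
#lang racket/base

;; has-attr-list? : list -> boolean?
;; True when the first item after the tag is a list tagged with @.
(define (has-attr-list? tail)
  (and (pair? tail)
       (pair? (car tail))
       (eq? (car (car tail)) '@)))

;; attrs-of : element -> list of (name value) attributes
(define (attrs-of elem)
  (define tail (cdr elem))
  (if (has-attr-list? tail)
      (cdr (car tail))
      '()))

;; children-of : element -> list of child nodes, attribute list skipped
(define (children-of elem)
  (define tail (cdr elem))
  (if (has-attr-list? tail)
      (cdr tail)
      tail))

(define customer
  '(customer (@ (specialness "gazonga")) "Barry White"))

(car customer)                          ; => 'customer
(attrs-of customer)                     ; => '((specialness "gazonga"))
(children-of customer)                  ; => '("Barry White")
(children-of '(abc "def" (ghi) "jkl"))  ; => '("def" (ghi) "jkl")
```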
NOTE! Some of the sxml libraries, particularly sxml:modify, depend on the fact that sxml elements in a legal document are all "unique"; as I understand it, the requirement is that no two subtrees of a given SXML document can be eq? to each other. This requirement is easy to violate when rewriting a tree, for instance in a pass that inserts '(delete-me) in multiple places.
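The sharing problem is easy to reproduce and to avoid in plain Racket. The sketch below is an illustration only: make-marker and copy-tree are hypothetical helpers, not library procedures, and copy-tree copies pairs only, so strings and symbols are still shared between the original and the copy.

```racket
#lang racket/base

;; One marker datum reused in two places: both children are the same object.
(define marker '(delete-me))
(define shared-tree (list 'doc marker marker))
(eq? (cadr shared-tree) (caddr shared-tree))    ; => #t

;; Allocating a fresh list at each insertion point avoids the sharing.
(define (make-marker) (list 'delete-me))
(define fresh-tree (list 'doc (make-marker) (make-marker)))
(eq? (cadr fresh-tree) (caddr fresh-tree))      ; => #f
(equal? (cadr fresh-tree) (caddr fresh-tree))   ; => #t

;; Rebuilding every pair also separates subtrees that were shared.
(define (copy-tree v)
  (if (pair? v)
      (cons (copy-tree (car v)) (copy-tree (cdr v)))
      v))

(define repaired (copy-tree shared-tree))
(eq? (cadr repaired) (caddr repaired))          ; => #f
(equal? repaired shared-tree)                   ; => #t
```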
That’s the easy part. Things get trickier when you start talking about documents and namespaces.
Refer to the original SXML specification for a more detailed explanation of the representation, including examples.
1.1 SXML Functions
procedure
(sxml:element? v) → boolean?
v : any/c
> (sxml:element? '(p "blah"))
#t
> (sxml:element? '(*COMMENT* "ignore me"))
#f
> (sxml:element? '(@ (href "link.html")))
#f
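The related ntype-names?? takes a list of tag names and returns a predicate that recognizes elements with one of those tags:
> ((ntype-names?? '(a p)) '(p "blah"))
'(p)
> ((ntype-names?? '(a p)) '(br))
#f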
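The '(p) result for a match, rather than #t, is the kind of value Racket's memq returns: the tail of the list starting at the matching element. A predicate maker with the same observable behaviour can be sketched as follows. The name tag-in?? is a hypothetical stand-in, and the use of memq is an assumption about how such a predicate could be written, not a copy of the library source.

```racket
#lang racket/base

;; tag-in?? : (listof symbol?) -> (any/c -> (or/c list? #f))
;; Returns a predicate that yields a true value (the matching tail of
;; tags) for a pair whose car is one of tags, and #f otherwise.
(define (tag-in?? tags)
  (lambda (v)
    (and (pair? v)
         (memq (car v) tags))))

((tag-in?? '(a p)) '(p "blah"))  ; => '(p)
((tag-in?? '(a p)) '(br))        ; => #f
((tag-in?? '(a p)) "blah")       ; => #f
```

Since any value other than #f counts as true in a conditional, the result can be used directly in if, cond, or filter.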
ntype?? takes a type criterion and returns a predicate that recognizes nodes of that type. The following symbols are interpreted as special criteria:
'@: an annot-attributes node
'*: any element (sxml:element?)
'*any*: any node
'*text*: any string
'*data*: anything except a pair (element)
'*COMMENT*: a comment node
'*PI*: a PI (processing instruction) node
'*ENTITY*: an entity node
Any other symbol is treated as an ordinary tag name, and the returned predicate recognizes elements with that tag.
> ((ntype?? '*) "blah")
#f
> ((ntype?? '*) '(p "blah"))
#t
> ((ntype?? '*text*) "blah")
#t
> ((ntype?? '*text*) '(p "blah"))
#f
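ntype-namespace-id?? takes a namespace id, or #f for no namespace, and returns a predicate that recognizes elements whose tag belongs to that namespace:
> ((ntype-namespace-id?? "atom") '(atom:id "blah"))
#t
> ((ntype-namespace-id?? "atom") '(atomic "section"))
#f
> ((ntype-namespace-id?? #f) '(atomic "section"))
#t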
procedure
(sxml:node? v) → boolean?
v : any/c
Note that the set of values accepted by sxml:node? is different from the non-terminal node.
> (sxml:node? '(a (@ (href "link.html")) "blah"))
#t
> (sxml:node? '(@ (href "link.html")))
#f
procedure
(sxml:attr-list node) → (listof attribute)
node : node
> (sxml:attr-list '(a (@ (href "link.html")) "blah"))
'((href "link.html"))
> (sxml:attr-list '(p "blah"))
'()
> (sxml:attr-list "blah")
'()
procedure
(sxml:attr-list-node elem)
→ (or/c #f (cons/c '@ (listof attribute)))
elem : sxml:element?
> (sxml:attr-list-node '(a (@ (href "link.html")) "blah"))
'(@ (href "link.html"))
> (sxml:attr-list-node '(p "blah"))
#f
procedure
(sxml:empty-element? elem) → boolean?
elem : sxml:element?
> (sxml:empty-element? '(br))
#t
> (sxml:empty-element? '(p "blah"))
#f
> (sxml:empty-element? '(link (@ (rel "self") (href "here.html"))))
#t
procedure
(sxml:element-name elem) → symbol?
elem : sxml:element?
procedure
(sxml:ncname qualified-name) → string?
qualified-name : symbol?
procedure
(sxml:name->ns-id qualified-name) → (or/c string? #f)
qualified-name : symbol?
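The two procedures above take a qualified name apart. Going by the grammar, an exp-name has the form namespace-id:local-name, and a namespace-id may be a URI that contains colons itself, so the split has to happen at the last colon. The sketch below illustrates that rule in plain Racket. The helper split-qname is hypothetical, and that the library procedures split in exactly this way is an assumption.

```racket
#lang racket/base

;; split-qname : symbol? -> (cons/c (or/c string? #f) string?)
;; Returns (namespace-id . local-name), with #f as the namespace id
;; when the name contains no colon.
(define (split-qname qname)
  (define s (symbol->string qname))
  ;; Index of the last colon, or #f if there is none.
  (define pos
    (for/last ([c (in-string s)]
               [i (in-naturals)]
               #:when (char=? c #\:))
      i))
  (if pos
      (cons (substring s 0 pos) (substring s (add1 pos)))
      (cons #f s)))

(split-qname 'atom:id)
;; => '("atom" . "id")
(split-qname 'p)
;; => '(#f . "p")
(split-qname (string->symbol "http://www.w3.org/1999/xhtml:body"))
;; => '("http://www.w3.org/1999/xhtml" . "body")
```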
procedure
(sxml:content node-or-nodeset) → (listof node)
node-or-nodeset : (or/c node nodeset?)
procedure
(sxml:text node-or-nodeset) → string?
node-or-nodeset : (or/c node nodeset?)
> (sxml:text '(p (em "red") " fish; " (em "blue") " fish"))
" fish;  fish"
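In the example, "red" and "blue" are missing from the result, which suggests that only the strings directly inside the element are collected and text nested in child elements is skipped. A minimal sketch of that behaviour for a single element, in plain Racket, follows. The name direct-text is a hypothetical stand-in, not the library procedure, and it does not handle nodesets.

```racket
#lang racket/base

;; direct-text : element -> string?
;; Concatenates the string children of elem, ignoring child elements.
(define (direct-text elem)
  (apply string-append
         (for/list ([child (in-list (cdr elem))]
                    #:when (string? child))
           child)))

(direct-text '(p (em "red") " fish; " (em "blue") " fish"))
;; => " fish;  fish"
(direct-text '(br))
;; => ""
```

The two adjacent spaces in the first result come from the trailing space of " fish; " and the leading space of " fish".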
procedure
(sxml:attr elem attr-name) → (or/c string? #f)
elem : sxml:element?
attr-name : symbol?
procedure
(sxml:change-content elem new-content) → sxml:element?
elem : sxml:element?
new-content : (listof child)
procedure
(sxml:change-attrlist elem new-attrlist) → sxml:element?
elem : sxml:element?
new-attrlist : (listof attribute)
procedure
(sxml:change-name elem tag) → sxml:element?
elem : sxml:element?
tag : symbol?
procedure
(sxml:set-attr elem attr) → sxml:element?
elem : sxml:element?
attr : (list/c symbol? any/c)
procedure
(sxml:add-attr elem attr) → (or/c sxml:element? #f)
elem : sxml:element?
attr : (list/c symbol? any/c)
procedure
(sxml:change-attr elem attr) → (or/c sxml:element? #f)
elem : sxml:element?
attr : (list/c symbol? any/c)
procedure
(sxml:squeeze elem) → sxml:element?
elem : sxml:element?
procedure
(sxml:clean elem) → sxml:element?
elem : sxml:element?