txexpr: Tagged X-expressions
(require txexpr)    package: txexpr
(require (submod txexpr safe))
A set of small but handy functions for improving the readability and reliability of programs that operate on tagged X-expressions (for short, txexprs).
1 Installation
raco pkg install txexpr
raco pkg update txexpr
2 Importing the module
The module can be invoked two ways: fast or safe.
Fast mode is the default, which you get by importing the module in the usual way: (require txexpr).
Safe mode enables the function contracts documented below. Use safe mode by importing the module as (require (submod txexpr safe)).
3 What’s a txexpr?
It’s an X-expression with the following grammar:
txexpr  = (list tag (list attr ...) element ...)
        | (cons tag (list element ...))
tag     = symbol?
attr    = (list key value)
key     = symbol?
value   = string?
element = xexpr?
A txexpr is a list with a symbol in the first position — the tag — followed by a series of elements, which are other X-expressions. Optionally, a txexpr can have a list of attributes in the second position.
> (txexpr? '(span "Brennan" "Dale")) #t
> (txexpr? '(span "Brennan" (em "Richard") "Dale")) #t
> (txexpr? '(span [[class "hidden"][id "names"]] "Brennan" "Dale")) #t
> (txexpr? '(span lt gt amp)) #t
> (txexpr? '("We really" "should have" "a tag")) #f
> (txexpr? '(span [[class not-quoted]] "Brennan")) #f
> (txexpr? '(span [class "hidden"] "Brennan" "Dale")) #t
The last one is a common mistake. Because the key–value pair is not enclosed in a list, it’s interpreted as a nested txexpr within the first txexpr, which you may not discover until you try to read its attributes:
> (get-attrs '(span [class "hidden"] "Brennan" "Dale")) '()
> (get-elements '(span [class "hidden"] "Brennan" "Dale")) '((class "hidden") "Brennan" "Dale")
There’s no way of eliminating this ambiguity, short of always requiring an attribute list (empty if necessary) in your txexpr. See also xexpr-drop-empty-attributes.
Tagged X-expressions are most commonly found in HTML & XML documents. Though the notation is different in Racket, the data structure is identical:
> (xexpr->string '(span [[id "names"]] "Brennan" (em "Richard") "Dale")) "<span id=\"names\">Brennan<em>Richard</em>Dale</span>"
> (string->xexpr "<span id=\"names\">Brennan<em>Richard</em>Dale</span>") '(span ((id "names")) "Brennan" (em () "Richard") "Dale")
After converting to and from HTML, we get back the original X-expression. Well, almost. The brackets turned into parentheses — no big deal, since they mean the same thing in Racket. Also, per its usual practice, string->xexpr added an empty attribute list after em. This is also benign.
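The round trip, and the effect of xexpr-drop-empty-attributes on it, can be sketched with the standard xml library alone. This is an illustrative sketch rather than part of this module; the name html is a placeholder introduced here.

```racket
#lang racket/base
(require xml)

;; Placeholder input for illustration.
(define html "<span id=\"names\">Brennan<em>Richard</em>Dale</span>")

;; Default behavior: every element gets an attribute list,
;; even when that list is empty (note the `()` after `em`).
(string->xexpr html)

;; With the parameter set to #t, empty attribute lists are dropped,
;; so `em` comes back without the `()`.
(parameterize ([xexpr-drop-empty-attributes #t])
  (string->xexpr html))
```

Either result is a valid txexpr, so the choice only matters if your code compares X-expressions with equal?.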
4 Why not just use match, quasiquote, and so on?
If you prefer those, please do. But I’ve found two benefits to using module functions:
Readability. In code that already has a lot of matching and quasiquoting going on, these functions make it easy to see where & how txexprs are being used.
Reliability. Because txexprs come in two close but not quite equal forms, careful coders will always have to take both cases into account.
The programming is trivial, but the annoyance is real.
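To make the annoyance concrete, here is a rough sketch of the by-hand approach using match. The name split-by-hand is hypothetical and not part of this module; it only illustrates the two-case bookkeeping that the module's accessors take care of.

```racket
#lang racket/base
(require racket/match)

;; Hypothetical helper: split an X-expression into tag, attributes,
;; and elements by hand, covering both forms of the grammar.
(define (split-by-hand x)
  (match x
    ;; Form 1: tag, then an attribute list, then elements.
    [(list (? symbol? tag)
           (list (list (? symbol? keys) (? string? vals)) ...)
           elements ...)
     (values tag (map list keys vals) elements)]
    ;; Form 2: tag followed directly by elements (no attribute list).
    [(list (? symbol? tag) elements ...)
     (values tag '() elements)]))

(split-by-hand '(span ((class "hidden")) "Brennan" "Dale"))
(split-by-hand '(span "Brennan" "Dale"))
```

Every function that touches a txexpr would need both clauses, in this order, which is the repetition the module is meant to remove.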
5 Interface
procedure
(txexpr? v) → boolean?
v : any/c
procedure
(txexpr-tag? v) → boolean?
v : any/c
procedure
(txexpr-attr? v) → boolean?
v : any/c
procedure
(txexpr-attr-key? v) → boolean?
v : any/c
procedure
(txexpr-attr-value? v) → boolean?
v : any/c
procedure
(txexpr-element? v) → boolean?
v : any/c
txexpr  = (list tag (list attr ...) element ...)
        | (cons tag (list element ...))
tag     = symbol?
attr    = (list key value)
key     = symbol?
value   = string?
element = xexpr?
procedure
(txexpr-tags? v) → boolean?
v : any/c
procedure
(txexpr-attrs? v) → boolean?
v : any/c
procedure
(txexpr-elements? v) → boolean?
v : any/c
procedure
(validate-txexpr possible-txexpr) → txexpr?
possible-txexpr : any/c
> (validate-txexpr 'root) validate-txexpr: 'root: not an X-expression
> (validate-txexpr '(root)) '(root)
> (validate-txexpr '(root ((id "top")(class 42)))) validate-txexpr-attrs: in '(root ((id "top") (class 42))),
'((id "top") (class 42)) is not a valid list of attributes
because '(class 42) is not in the form '(symbol "string")
> (validate-txexpr '(root ((id "top")(class "42")))) '(root ((id "top") (class "42")))
> (validate-txexpr '(root ((id "top")(class "42")) ("hi"))) validate-txexpr-element: in '(root ((id "top") (class "42"))
("hi")), '("hi") is not a valid element (must be txexpr,
string, symbol, XML char, or cdata)
> (validate-txexpr '(root ((id "top")(class "42")) "hi")) '(root ((id "top") (class "42")) "hi")
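Because validate-txexpr signals an error instead of returning #f, a caller that wants to continue past bad input has to catch the exception. A minimal sketch, assuming the error is an ordinary exn:fail (the messages above suggest so, but this is an assumption); valid-or-false is a hypothetical helper, not part of the module.

```racket
#lang racket/base
(require txexpr)

;; Hypothetical helper: return the txexpr when it validates,
;; otherwise report the reason on stderr and return #f.
;; Assumes validate-txexpr raises an exn:fail on invalid input.
(define (valid-or-false v)
  (with-handlers ([exn:fail? (λ (e)
                               (eprintf "rejected: ~a\n" (exn-message e))
                               #f)])
    (validate-txexpr v)))

(valid-or-false '(root ((id "top")) "hi"))       ; valid, returned unchanged
(valid-or-false '(root ((id "top") (class 42)))) ; invalid attribute value
```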
procedure
(->txexpr-attr-key v) → txexpr-attr-key?
v : can-be-txexpr-attr-key?
procedure
(->txexpr-attr-value v) → txexpr-attr-value?
v : can-be-txexpr-attr-value?
procedure
(txexpr->values tx) → txexpr-tag? txexpr-attrs? txexpr-elements?
tx : txexpr?
> (txexpr->values '(div))
'div
'()
'()
> (txexpr->values '(div "Hello" (p "World")))
'div
'()
'("Hello" (p "World"))
> (txexpr->values '(div [[id "top"]] "Hello" (p "World")))
'div
'((id "top"))
'("Hello" (p "World"))
procedure
(txexpr->list tx) → (list txexpr-tag? txexpr-attrs? txexpr-elements?)
tx : txexpr?
> (txexpr->list '(div)) '(div () ())
> (txexpr->list '(div "Hello" (p "World"))) '(div () ("Hello" (p "World")))
> (txexpr->list '(div [[id "top"]] "Hello" (p "World"))) '(div ((id "top")) ("Hello" (p "World")))
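Since txexpr->values returns three values, it pairs naturally with define-values. The sketch below takes an expression apart and rebuilds it under a different tag, using only functions documented in this section; the names tag, attrs, elements, and as-section are chosen here for illustration.

```racket
#lang racket/base
(require txexpr)

(define tx '(div [[id "top"]] "Hello" (p "World")))

;; Bind tag, attributes, and elements in one step.
(define-values (tag attrs elements) (txexpr->values tx))

;; Rebuild the expression with a new tag, keeping everything else.
(define as-section (txexpr 'section attrs elements))
as-section
```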
procedure
(xexpr->html x) → string?
x : xexpr?
> (define tx '(root (script "3 > 2") "Why is 3 > 2?"))
> (xexpr->string tx) "<root><script>3 &gt; 2</script>Why is 3 &gt; 2?</root>"
> (xexpr->html tx) "<root><script>3 > 2</script>Why is 3 &gt; 2?</root>"
> (map xexpr->html (list "string" 'entity 65)) '("string" "&entity;" "&#65;")
procedure
(get-tag tx) → txexpr-tag?
tx : txexpr?
procedure
(get-attrs tx) → (listof txexpr-attr?)
tx : txexpr?
procedure
(get-elements tx) → (listof txexpr-element?)
tx : txexpr?
> (get-tag '(div [[id "top"]] "Hello" (p "World"))) 'div
> (get-attrs '(div [[id "top"]] "Hello" (p "World"))) '((id "top"))
> (get-elements '(div [[id "top"]] "Hello" (p "World"))) '("Hello" (p "World"))
procedure
(txexpr tag [attrs elements]) → txexpr?
tag : txexpr-tag? attrs : txexpr-attrs? = empty elements : txexpr-elements? = empty
> (txexpr 'div) '(div)
> (txexpr 'div '() '("Hello" (p "World"))) '(div "Hello" (p "World"))
> (txexpr 'div '[[id "top"]]) '(div ((id "top")))
> (txexpr 'div '[[id "top"]] '("Hello" (p "World"))) '(div ((id "top")) "Hello" (p "World"))
> (define tx '(div [[id "top"]] "Hello" (p "World")))
> (txexpr (get-tag tx) (get-attrs tx) (get-elements tx)) '(div ((id "top")) "Hello" (p "World"))
procedure
(make-txexpr tag [attrs elements]) → txexpr?
tag : txexpr-tag? attrs : txexpr-attrs? = empty elements : txexpr-elements? = empty
procedure
(can-be-txexpr-attrs? v) → boolean?
v : any/c
procedure
(attrs->hash x ...) → hash-eq?
x : can-be-txexpr-attrs?
procedure
(hash->attrs h) → txexpr-attrs?
h : hash?
> (define tx '(div [[id "top"][class "red"]] "Hello" (p "World")))
> (attrs->hash (get-attrs tx)) '#hasheq((class . "red") (id . "top"))
> (hash->attrs '#hasheq((class . "red") (id . "top"))) '((class "red") (id "top"))
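Once attributes are in a hash, the usual hash operations apply. A small sketch, assuming nothing beyond the functions shown above and the standard hash-ref; the key 'style and the fallback "none" are made up for the example.

```racket
#lang racket/base
(require txexpr)

(define tx '(div [[id "top"][class "red"]] "Hello" (p "World")))
(define h (attrs->hash (get-attrs tx)))

;; Look up a key that exists.
(hash-ref h 'class)

;; Supply a fallback for a key that does not exist,
;; rather than raising an error.
(hash-ref h 'style "none")
```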
procedure
(attrs-have-key? attrs key) → boolean?
attrs : (or/c txexpr-attrs? txexpr?) key : can-be-txexpr-attr-key?
> (define tx '(div [[id "top"][class "red"]] "Hello" (p "World")))
> (attrs-have-key? tx 'id) #t
> (attrs-have-key? tx 'grackle) #f
procedure
(attrs-equal? attrs other-attrs) → boolean?
attrs : (or/c txexpr-attrs? txexpr?) other-attrs : (or/c txexpr-attrs? txexpr?)
> (define tx1 '(div [[id "top"][class "red"]] "Hello"))
> (define tx2 '(p [[class "red"][id "top"]] "Hello"))
> (define tx3 '(p [[id "bottom"][class "red"]] "Hello"))
> (attrs-equal? tx1 tx2) #t
> (attrs-equal? tx1 tx3) #f
procedure
(attr-ref tx key) → can-be-txexpr-attr-value?
tx : txexpr? key : can-be-txexpr-attr-key?
> (attr-ref tx 'class) "red"
> (attr-ref tx 'id) "top"
> (attr-ref tx 'nonexistent-key) attr-ref: no value found for key 'nonexistent-key
procedure
(attr-ref* tx key) → (listof can-be-txexpr-attr-value?)
tx : txexpr? key : can-be-txexpr-attr-key?
> (define tx '(div [[class "red"]] "Hello" (em ([class "blue"]) "world")))
> (attr-ref* tx 'class) '("red" "blue")
> (attr-ref* tx 'nonexistent-key) '()
procedure
(attr-set tx key value) → txexpr?
tx : txexpr? key : can-be-txexpr-attr-key? value : can-be-txexpr-attr-value?
> (define tx '(div [[class "red"][id "top"]] "Hello" (p "World")))
> (attr-set tx 'id "bottom") '(div ((class "red") (id "bottom")) "Hello" (p "World"))
> (attr-set tx 'class "blue") '(div ((class "blue") (id "top")) "Hello" (p "World"))
> (attr-set (attr-set tx 'id "bottom") 'class "blue") '(div ((class "blue") (id "bottom")) "Hello" (p "World"))
procedure
(attr-set* tx key value ... ...) → txexpr?
tx : txexpr? key : can-be-txexpr-attr-key? value : can-be-txexpr-attr-value?
> (define tx '(div "Hello"))
> (attr-set* tx 'id "bottom" 'class "blue") '(div ((class "blue") (id "bottom")) "Hello")
procedure
(attr-join tx key value) → txexpr?
tx : txexpr? key : can-be-txexpr-attr-key? value : can-be-txexpr-attr-value?
> (define tx '(div [[class "red"]] "Hello"))
> (attr-join tx 'class "small") '(div ((class "red small")) "Hello")
procedure
(merge-attrs attrs ...) → txexpr-attrs?
attrs : (listof can-be-txexpr-attrs?)
You can pass the attributes in multiple forms. See can-be-txexpr-attrs? for further details.
Attributes with the same name are merged, with the later value taking precedence (i.e., hash behavior).
Attributes are sorted in alphabetical order.
> (define tx '(div [[id "top"][class "red"]] "Hello" (p "World")))
> (define tx-attrs (get-attrs tx))
> tx-attrs '((id "top") (class "red"))
> (merge-attrs tx-attrs 'editable "true") '((class "red") (editable "true") (id "top"))
> (merge-attrs tx-attrs 'id "override-value") '((class "red") (id "override-value"))
> (define my-attr '(id "another-override"))
> (merge-attrs tx-attrs my-attr) '((class "red") (id "another-override"))
> (merge-attrs my-attr tx-attrs) '((class "red") (id "top"))
procedure
(remove-attrs tx) → txexpr?
tx : txexpr?
> (define tx '(div [[id "top"]] "Hello" (p [[id "lower"]] "World")))
> (remove-attrs tx) '(div "Hello" (p "World"))
procedure
(map-elements proc tx) → txexpr?
proc : procedure? tx : txexpr?
> (define tx '(div "Hello!" (p "Welcome to" (strong "Mars"))))
> (define upcaser (λ(x) (if (string? x) (string-upcase x) x)))
> (map upcaser tx) '(div "HELLO!" (p "Welcome to" (strong "Mars")))
> (map-elements upcaser tx) '(div "HELLO!" (p "WELCOME TO" (strong "MARS")))
In practice, most txexpr-elements are strings. But woe befalls those who pass string procedures to map-elements, because a txexpr-element can be any kind of xexpr?, and an xexpr? is not necessarily a string.
> (define tx '(p "Welcome to" (strong "Mars" amp "Sons")))
> (map-elements string-upcase tx) string-upcase: contract violation
expected: string?
given: 'amp
> (define upcaser (λ(x) (if (string? x) (string-upcase x) x)))
> (map-elements upcaser tx) '(p "WELCOME TO" (strong "MARS" amp "SONS"))
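If you write this kind of guard often, it can be factored out. The sketch below uses a hypothetical wrapper, only-strings, which is not part of this module; it turns any string procedure into one that is safe to hand to map-elements.

```racket
#lang racket/base
(require txexpr)

;; Hypothetical helper: wrap a string procedure so that non-string
;; elements (symbols, numbers, nested txexprs) pass through untouched.
(define (only-strings proc)
  (λ (x) (if (string? x) (proc x) x)))

(define tx '(p "Welcome to" (strong "Mars" amp "Sons")))

;; Same result as the hand-written upcaser above.
(map-elements (only-strings string-upcase) tx)
```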
procedure
(map-elements/exclude proc tx exclude-test) → txexpr?
proc : procedure? tx : txexpr? exclude-test : (txexpr? . -> . boolean?)
> (define tx '(div "Hello!" (p "Welcome to" (strong "Mars"))))
> (define upcaser (λ(x) (if (string? x) (string-upcase x) x)))
> (map-elements upcaser tx) '(div "HELLO!" (p "WELCOME TO" (strong "MARS")))
> (map-elements/exclude upcaser tx (λ(x) (equal? (get-tag x) 'strong))) '(div "HELLO!" (p "WELCOME TO" (strong "Mars")))
Be careful with the wider consequences of exclusion tests. When exclude-test is true, the txexpr is excluded, but so is everything underneath that txexpr. In other words, there is no way to re-include (un-exclude?) elements nested under an excluded element.
> (define tx '(div "Hello!" (p "Welcome to" (strong "Mars"))))
> (define upcaser (λ(x) (if (string? x) (string-upcase x) x)))
> (map-elements upcaser tx) '(div "HELLO!" (p "WELCOME TO" (strong "MARS")))
> (map-elements/exclude upcaser tx (λ(x) (equal? (get-tag x) 'p))) '(div "HELLO!" (p "Welcome to" (strong "Mars")))
> (map-elements/exclude upcaser tx (λ(x) (equal? (get-tag x) 'div))) '(div "Hello!" (p "Welcome to" (strong "Mars")))
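A typical use of exclusion is protecting verbatim content. This sketch excludes a set of tags rather than a single one; the tag list and the name verbatim? are illustrative choices, not something the module prescribes. The test is coerced to a boolean so that it also satisfies the contract in safe mode.

```racket
#lang racket/base
(require txexpr)

(define tx '(div "Hello!" (code "keep-case") (p "Welcome to" (strong "Mars"))))
(define upcaser (λ (x) (if (string? x) (string-upcase x) x)))

;; Hypothetical exclusion test: true for any txexpr whose tag is in
;; the list. `memq` returns a list on success, so coerce to #t / #f.
(define (verbatim? x)
  (and (memq (get-tag x) '(code pre)) #t))

;; Everything is upcased except the contents of `code`.
(map-elements/exclude upcaser tx verbatim?)
```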
procedure
(splitf-txexpr tx pred [replace-proc]) → txexpr? (listof txexpr-element?)
tx : txexpr? pred : procedure? replace-proc : procedure? = (λ(x) null)
> (define tx '(div "Wonderful day" (meta "weather" "good") "for a walk"))
> (define is-meta? (λ(x) (and (txexpr? x) (equal? 'meta (get-tag x)))))
> (splitf-txexpr tx is-meta?)
'(div "Wonderful day" "for a walk")
'((meta "weather" "good"))
Ordinarily, the result of the split operation is to remove the elements that match pred. But you can change this behavior with the optional replace-proc argument.
> (define tx '(div "Wonderful day" (meta "weather" "good") "for a walk"))
> (define is-meta? (λ(x) (and (txexpr? x) (equal? 'meta (get-tag x)))))
> (define replace-meta (λ(x) '(em "meta was here")))
> (splitf-txexpr tx is-meta? replace-meta)
'(div "Wonderful day" (em "meta was here") "for a walk")
'((meta "weather" "good"))
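Because splitf-txexpr returns two values, define-values is a convenient way to capture both. The sketch below separates the meta elements from the body and collects them into a hash; it assumes each meta holds exactly a key string followed by a value string, as in the examples above. The names body, metas, and meta-table are introduced here.

```racket
#lang racket/base
(require txexpr)

(define tx '(div "Wonderful day" (meta "weather" "good") "for a walk" (meta "dog" "Roxy")))
(define (is-meta? x) (and (txexpr? x) (eq? 'meta (get-tag x))))

;; First value: the txexpr with matches removed.
;; Second value: the list of removed elements.
(define-values (body metas) (splitf-txexpr tx is-meta?))

;; Assumption: every meta has the shape (meta key-string value-string).
(define meta-table
  (for/hash ([m (in-list metas)])
    (define els (get-elements m))
    (values (car els) (cadr els))))

body
(hash-ref meta-table "dog")
```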
procedure
(findf*-txexpr tx pred) → (or/c #f (listof txexpr-element?))
tx : txexpr? pred : procedure?
procedure
(findf-txexpr tx pred) → (or/c #f txexpr-element?)
tx : txexpr? pred : procedure?
> (define tx '(div "Wonderful day" (meta "weather" "good") "for a walk" (meta "dog" "Roxy")))
> (define is-meta? (λ(x) (and (txexpr? x) (eq? 'meta (get-tag x)))))
> (findf*-txexpr tx is-meta?) '((meta "weather" "good") (meta "dog" "Roxy"))
> (findf-txexpr tx is-meta?) '(meta "weather" "good")
> (define is-zimzam? (λ(x) (and (txexpr? x) (eq? 'zimzam (get-tag x)))))
> (findf*-txexpr tx is-zimzam?) #f
> (findf-txexpr tx is-zimzam?) #f
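Note that findf*-txexpr reports "no matches" as #f rather than as an empty list, so passing its result straight to list functions can fail. One way to handle this, sketched with two hypothetical helpers (has-tag? and find-all, neither part of the module):

```racket
#lang racket/base
(require txexpr)

(define tx '(div "Wonderful day" (meta "weather" "good") "for a walk" (meta "dog" "Roxy")))

;; Hypothetical helper: build a predicate that matches txexprs by tag.
(define (has-tag? tag)
  (λ (x) (and (txexpr? x) (eq? tag (get-tag x)))))

;; Hypothetical wrapper: normalize the #f "no match" result to '()
;; so callers can always treat the result as a list.
(define (find-all tx pred)
  (or (findf*-txexpr tx pred) '()))

(length (find-all tx (has-tag? 'meta)))
(length (find-all tx (has-tag? 'zimzam)))
```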
procedure
(check-txexprs-equal? tx1 tx2) → void?
tx1 : txexpr? tx2 : txexpr?
> (define tx1 '(div ((attr-a "foo")(attr-z "bar"))))
> (define tx2 '(div ((attr-z "bar")(attr-a "foo"))))
> (parameterize ([current-check-handler (λ _ (display "not "))])
    (display "txexprs are ")
    (check-txexprs-equal? tx1 tx2)
    (displayln "equal"))
txexprs are equal
If ordering of attributes is relevant to your test, then just use check-equal? as usual.
> (define tx1 '(div ((attr-a "foo")(attr-z "bar"))))
> (define tx2 '(div ((attr-z "bar")(attr-a "foo"))))
> (parameterize ([current-check-handler (λ _ (display "not "))])
    (display "txexprs are ")
    (check-equal? tx1 tx2)
    (displayln "equal"))
txexprs are not equal
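In an ordinary test file the check handler does not need to be overridden. A minimal sketch of how the two checks might sit side by side in a rackunit test case; the test-case label is arbitrary.

```racket
#lang racket/base
(require rackunit txexpr)

(define tx1 '(div ((attr-a "foo") (attr-z "bar"))))
(define tx2 '(div ((attr-z "bar") (attr-a "foo"))))

(test-case "attribute order is ignored by check-txexprs-equal?"
  ;; Passes: same attributes, different order.
  (check-txexprs-equal? tx1 tx2)
  ;; Also passes: as plain lists, the two values are not equal?.
  (check-not-equal? tx1 tx2))
```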
6 License & source code
This module is licensed under the LGPL.
Source repository at http://github.com/mbutterick/txexpr. Suggestions & corrections welcome.