Automata: Compiling State Machines
Jay McCarthy <jay@racket-lang.org>
(require automata) | package: automata-doc |
This package provides macros and functions for writing state machines over racket/match patterns (as opposed to concrete characters.)
1 Machines
(require automata/machine) | package: automata-lib |
Each of the subsequent macros compile to instances of the machines provided by this module. This is a documented feature of the modules, so these functions should be used to, for example, determine if the machine is currently accepting.
procedure
(machine-accepting? x) → boolean?
x : any/c
procedure
(machine-accepts? m i) → boolean?
m : machine? i : (listof any/c)
value
value
value
procedure
(machine-complement m) → machine?
m : machine?
procedure
(machine-star m) → machine?
m : machine?
procedure
(machine-union m0 m1) → machine?
m0 : machine? m1 : machine?
procedure
(machine-intersect m0 m1) → machine?
m0 : machine? m1 : machine?
procedure
(machine-seq m0 m1) → machine?
m0 : machine? m1 : machine?
procedure
(machine-seq* m0 make-m1) → machine?
m0 : machine? make-m1 : (-> machine?)
2 Deterministic Finite Automata
(require automata/dfa) | package: automata-lib |
This module provides a macro for deterministic finite automata.
syntax
(dfa start (end ...) [state ([evt next-state] ...)] ...)
start : identifier?
end : identifier?
state : identifier?
next-state : identifier?
(define M (dfa s1 (s1) [s1 ([0 s2] [(? even?) s1])] [s2 ([0 s1] [(? even?) s2])]))
> (machine-accepts? M (list 2 0 4 0 2)) #t
> (machine-accepts? M (list 0 4 0 2 0)) #f
> (machine-accepts? M (list 2 0 2 2 0 8)) #t
> (machine-accepts? M (list 0 2 0 0 10 0)) #t
> (machine-accepts? M (list)) #t
> (machine-accepts? M (list 4 0)) #f
3 Non-Deterministic Finite Automata
(require automata/nfa) | package: automata-lib |
This module provides a macro for non-deterministic finite automata.
syntax
(nfa (start:id ...) (end:id ...) [state:id ([evt:expr (next-state:id ...)] ...)] ...)
start : identifier?
end : identifier?
state : identifier?
next-state : identifier?
These machines are efficiently compiled to use the smallest possible bit-string as a set representation and unsafe numeric operations where appropriate for inspection and adjusting the sets.
(define M (nfa (s1 s3) (s1 s3) [s1 ([0 (s2)] [1 (s1)])] [s2 ([0 (s1)] [1 (s2)])] [s3 ([0 (s3)] [1 (s4)])] [s4 ([0 (s4)] [1 (s3)])]))
> (machine-accepts? M (list 1 0 1 0 1)) #t
> (machine-accepts? M (list 0 1 0 1 0)) #t
> (machine-accepts? M (list 1 0 1 1 0 1)) #t
> (machine-accepts? M (list 0 1 0 0 1 0)) #t
> (machine-accepts? M (list)) #t
> (machine-accepts? M (list 1 0)) #f
4 Non-Deterministic Finite Automata (with epsilon transitions)
(require automata/nfa-ep) | package: automata-lib |
This module provides a macro for non-deterministic finite automata with epsilon transitions.
syntax
syntax
(nfa/ep (start:id ...) (end:id ...) [state:id ([epsilon (epsilon-state:id ...)] ... [evt:expr (next-state:id ...)] ...)] ...)
start : identifier?
end : identifier?
state : identifier?
epsilon-state : identifier?
next-state : identifier?
(define M (nfa/ep (s0) (s1 s3) [s0 ([epsilon (s1)] [epsilon (s3)])] [s1 ([0 (s2)] [1 (s1)])] [s2 ([0 (s1)] [1 (s2)])] [s3 ([0 (s3)] [1 (s4)])] [s4 ([0 (s4)] [1 (s3)])]))
> (machine-accepts? M (list 1 0 1 0 1)) #t
> (machine-accepts? M (list 0 1 0 1 0)) #t
> (machine-accepts? M (list 1 0 1 1 0 1)) #t
> (machine-accepts? M (list 0 1 0 0 1 0)) #t
> (machine-accepts? M (list)) #t
> (machine-accepts? M (list 1 0)) #f
5 Regular Expressions
(require automata/re) | package: automata-lib |
This module provides a macro for regular expression compilation.
syntax
(re re-pat)
re-pat = (rec id re-pat) | ,expr | (complement re-pat) | (seq re-pat ...) | (union re-pat ...) | (star re-pat) | epsilon | nullset | re-transformer | (re-transformer . datum) | (dseq pat re-pat) | pat
The interpretation of the pattern language is mostly intuitive. The pattern language may be extended with define-re-transformer. dseq allows bindings of the match pattern to be used in the rest of the regular expression. (Thus, they are not really regular expressions.) unquote escapes to Racket to evaluate an expression that evaluates to a regular expression (this happens once, at compile time.) rec binds a Racket identifier to a delayed version of the inner expression; even if the expression is initially accepting, this delayed version is never accepting.
The compiler will use an NFA, provided complement and dseq are not used. Otherwise, many NFAs connected with the machine simulation functions from automata/machine are used.
syntax
(define-re-transformer id expr)
5.1 Extensions
(require automata/re-ext) | package: automata-lib |
This module provides a few transformers that extend the syntax of regular expression patterns.
syntax
(opt re-pat)
syntax
(plus re-pat)
syntax
(rep re-pat num)
syntax
(difference re-pat_0 re-pat_1)
syntax
(intersection re-pat_0 re-pat_1)
syntax
(seq/close re-pat ...)
5.2 Examples
> (define-syntax-rule (test-re R (succ ...) (fail ...)) (let ([r (re R)]) (printf "Success: ~v => ~v\n" succ (machine-accepts? r succ)) ... (printf "Failure: ~v => ~v\n" fail (machine-accepts? r fail)) ...))
> (test-re epsilon [(list)] [(list 0)])
Success: '() => #t
Failure: '(0) => #f
> (test-re nullset [] [(list) (list 1)])
Failure: '() => #f
Failure: '(1) => #f
> (test-re "A" [(list "A")] [(list) (list "B")])
Success: '("A") => #t
Failure: '() => #f
Failure: '("B") => #f
> (test-re (complement "A") [(list) (list "B") (list "A" "A")] [(list "A")])
Success: '() => #t
Success: '("B") => #t
Success: '("A" "A") => #t
Failure: '("A") => #f
> (test-re (union 0 1) [(list 1) (list 0)] [(list) (list 0 1) (list 0 1 1)])
Success: '(1) => #t
Success: '(0) => #t
Failure: '() => #f
Failure: '(0 1) => #f
Failure: '(0 1 1) => #f
> (test-re (seq 0 1) [(list 0 1)] [(list) (list 0) (list 0 1 1)])
Success: '(0 1) => #t
Failure: '() => #f
Failure: '(0) => #f
Failure: '(0 1 1) => #f
> (test-re (star 0) [(list) (list 0) (list 0 0)] [(list 1)])
Success: '() => #t
Success: '(0) => #t
Success: '(0 0) => #t
Failure: '(1) => #f
> (test-re (opt "A") [(list) (list "A")] [(list "B")])
Success: '() => #t
Success: '("A") => #t
Failure: '("B") => #f
> (define-re-transformer my-opt (syntax-rules () [(_ pat) (union epsilon pat)]))
> (test-re (my-opt "A") [(list) (list "A")] [(list "B")])
Success: '() => #t
Success: '("A") => #t
Failure: '("B") => #f
> (test-re (plus "A") [(list "A") (list "A" "A")] [(list)])
Success: '("A") => #t
Success: '("A" "A") => #t
Failure: '() => #f
> (test-re (rep "A" 3) [(list "A" "A" "A")] [(list) (list "A") (list "A" "A")])
Success: '("A" "A" "A") => #t
Failure: '() => #f
Failure: '("A") => #f
Failure: '("A" "A") => #f
> (test-re (difference (? even?) 2) [(list 4) (list 6)] [(list 3) (list 2)])
Success: '(4) => #t
Success: '(6) => #t
Failure: '(3) => #f
Failure: '(2) => #f
> (test-re (intersection (? even?) 2) [(list 2)] [(list 1) (list 4)])
Success: '(2) => #t
Failure: '(1) => #f
Failure: '(4) => #f
> (test-re (complement (seq "A" (opt "B"))) [(list "A" "B" "C")] [(list "A") (list "A" "B")])
Success: '("A" "B" "C") => #t
Failure: '("A") => #f
Failure: '("A" "B") => #f
> (test-re (seq epsilon 1) [(list 1)] [(list 0) (list)])
Success: '(1) => #t
Failure: '(0) => #f
Failure: '() => #f
> (test-re (seq 1 epsilon) [(list 1)] [(list 0) (list)])
Success: '(1) => #t
Failure: '(0) => #f
Failure: '() => #f
> (test-re (seq epsilon (union (seq (star 1) (star (seq 0 (star 1) 0 (star 1)))) (seq (star 0) (star (seq 1 (star 0) 1 (star 0))))) epsilon) [(list 1 0 1 0 1) (list 0 1 0 1 0) (list 1 0 1 1 0 1) (list 0 1 0 0 1 0) (list)] [(list 1 0)])
Success: '(1 0 1 0 1) => #t
Success: '(0 1 0 1 0) => #t
Success: '(1 0 1 1 0 1) => #t
Success: '(0 1 0 0 1 0) => #t
Success: '() => #t
Failure: '(1 0) => #f
> (test-re (star (complement 1)) [(list 0 2 3 4) (list) (list 2) (list 234 5 9 1 9 0) (list 1 0) (list 0 1)] [(list 1)])
Success: '(0 2 3 4) => #t
Success: '() => #t
Success: '(2) => #t
Success: '(234 5 9 1 9 0) => #t
Success: '(1 0) => #t
Success: '(0 1) => #t
Failure: '(1) => #f
> (test-re (dseq x (? (curry equal? x))) [(list 0 0) (list 1 1)] [(list) (list 1) (list 1 0)])
Success: '(0 0) => #t
Success: '(1 1) => #t
Failure: '() => #f
Failure: '(1) => #f
Failure: '(1 0) => #f