Threading Macros
(require threading) | package: threading
The threading module provides a set of macros that help flatten nested function calls. They allow value transformations to be expressed as “pipelines” of values, similar to Unix pipes. These are called threading macros, and there are a number of different variations for different purposes, but the most basic version is ~>.
1 Guide
1.1 Introduction
Threading macros are used to take a value and “thread” it through a series of transformations to produce a new value. In their simplest forms, they are just convenient syntax for function composition:
> (~> 1 add1 sqrt)
1.4142135623730951
The above example is equivalent to the following:
> (sqrt (add1 1))
1.4142135623730951
While function composition evaluates right to left, ~> threads the value from left to right.
This on its own is not terribly useful, but the benefit becomes more obvious when confronted with a complicated nested expression:
> (- (bytes-ref (string->bytes/utf-8 (symbol->string 'abc)) 1) 2)
96
The above expression is hard to read: the nesting pushes some of the arguments far to the right of the initial value, so it is difficult to see which function receives which argument. Using ~>, this can be converted into an orderly pipeline:
> (~> 'abc
      symbol->string
      string->bytes/utf-8
      (bytes-ref 1)
      (- 2))
96
Note how the data flows from top to bottom in an orderly manner. Also note how some of the clauses provided to ~> are contained within parentheses, while others are not. When no extra arguments are provided to a function, the parentheses may be elided.
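As a further illustration of the same pattern, here is a small sketch that tidies up a string. It assumes the threading package is installed, and the names greeting and greeting-explicit are made up for this example.

```racket
#lang racket
(require threading) ; assumes the threading package is installed

;; The one-argument steps are written as bare identifiers; the last step
;; takes an extra argument, so it is written in parentheses.
(define greeting
  (~> "  Hello  "
      string-trim
      string-downcase
      (string-append "!")))

;; The same pipeline with every clause parenthesized.
(define greeting-explicit
  (~> "  Hello  "
      (string-trim)
      (string-downcase)
      (string-append "!")))
```

Both definitions should describe the same computation, (string-append (string-downcase (string-trim "  Hello  ")) "!"); the bare form simply saves a pair of parentheses per step.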
1.2 How ~> works
To understand better what is happening when ~> is used, remember that it is just a macro, and it expands precisely to the original unthreaded expression. Each step of the pipeline is successively nested, placing the previous expression as the first argument provided to the enclosing function. This is more easily demonstrated with an example. To start, consider the “normalized” version of the expression, where all clauses are wrapped in parentheses:
(~> 'abc (symbol->string) (string->bytes/utf-8) (bytes-ref 1) (- 2))
To begin, the 'abc value is threaded into the first clause in the first argument position:
(~> (symbol->string 'abc) (string->bytes/utf-8) (bytes-ref 1) (- 2))
This process continues, threading the new value into the next function in the pipeline:
(~> (string->bytes/utf-8 (symbol->string 'abc)) (bytes-ref 1) (- 2))
The next step is slightly more complicated, but not by much. The next clause already has an argument, but the expansion process is the same: the "current" value is just inserted before the provided argument:
(~> (bytes-ref (string->bytes/utf-8 (symbol->string 'abc)) 1) (- 2))
Finally, the whole expression is inserted into the last clause:
(~> (- (bytes-ref (string->bytes/utf-8 (symbol->string 'abc)) 1) 2))
Now the expansion is effectively complete—(~> x) just expands to x directly with no further transformations.
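The expansion steps above can be imitated with a few lines of syntax-rules. This is only a sketch for illustration, not the library's actual implementation: the name my~> is hypothetical, and the macro handles neither the _ hole marker nor quoted clauses.

```racket
#lang racket

;; Hypothetical, simplified stand-in for ~> (illustration only).
(define-syntax my~>
  (syntax-rules ()
    ;; No clauses left: the result is the threaded expression itself.
    [(_ x) x]
    ;; Parenthesized clause: insert x as the first argument, then
    ;; continue with the remaining clauses.
    [(_ x (f arg ...) clause ...) (my~> (f x arg ...) clause ...)]
    ;; Bare identifier: treat f as if it had been written (f).
    [(_ x f clause ...) (my~> (f x) clause ...)]))

;; Expands step by step into
;; (- (bytes-ref (string->bytes/utf-8 (symbol->string 'abc)) 1) 2)
(define result
  (my~> 'abc symbol->string string->bytes/utf-8 (bytes-ref 1) (- 2)))
```

Each rule corresponds to one kind of step in the walkthrough: the second and third rules perform a single nesting step, and the first rule is the base case where (my~> x) becomes x.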
1.3 Changing the threading position
The above example worked because each expression needed to be provided as the first argument for each clause. This, of course, is not always the case. For example, the opposite order is frequently desired when operating on lists:
> (foldl + 0 (map sqr '(1 2 3)))
14
In this example, using ~> would not work because the list needs to be provided as the final argument to map and foldl. Instead of using ~>, this can be achieved using its counterpart, ~>>:
> (~>> '(1 2 3) (map sqr) (foldl + 0))
14
The ~>> form works exactly like ~>, but expressions are threaded into the final position instead of the first.
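The difference between the two positions is easiest to see with a function whose argument order matters. The following is a minimal sketch, assuming the threading package is installed; the two names are chosen for this example only.

```racket
#lang racket
(require threading) ; assumes the threading package is installed

;; ~> inserts 10 as the first argument: (- 10 3)
(define first-position (~> 10 (- 3)))

;; ~>> inserts 10 as the last argument: (- 3 10)
(define last-position (~>> 10 (- 3)))
```

Here first-position is 7 and last-position is -7, since subtraction is not commutative.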
Of course, there are times when the threading position may be inconsistent, or perhaps it needs to be in the middle of a function call instead of at the beginning or the end. In this case, the threading position can be explicitly specified by marking the "hole" with the _ identifier:
> (~> '(1 2 3) (map add1 _) (apply + _) (- 1))
8
Using this "hole marker" works with both ~> and ~>>.
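For instance, a pipeline that mostly threads into the last position can use the hole marker for the one step that needs the value elsewhere. This sketch assumes the threading package is installed; sum-of-first-two is a name invented for the example.

```racket
#lang racket
(require threading) ; assumes the threading package is installed

;; map and apply take the list as their last argument, so ~>> suits them.
;; take expects the list first, so the hole marker overrides the default
;; position for that one clause.
(define sum-of-first-two
  (~>> '(1 2 3)
       (map add1)   ; (map add1 '(1 2 3))
       (take _ 2)   ; (take '(2 3 4) 2)
       (apply +)))  ; (apply + '(2 3))
```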
1.4 Threading operations that can fail
It is a Racket convention that, when an operation fails, it returns #f. Not all functions use this convention—sometimes it is more apt to throw an exception—but when failure is a valid state, #f is a convenient placeholder since it is the only falsy value in Racket.
This is a useful convention, and it works fairly well, but it can sometimes get in the way of threading. For example, consider the following threading expression:
(~>> lst (findf even?) (* 2))
It finds the first even number in a list, then multiplies it by two. However, if the list contains no even number, findf returns #f, and the attempt to multiply #f by two fails:
> (~>> '(1 3 5) (findf even?) (* 2))
*: contract violation
  expected: number?
  given: #f
  argument position: 2nd
  other arguments...:
   2
This is a problem because it means the value needs to be checked in between the call to findf and the multiplication. If the result of findf is #f, the whole expression should, ideally, be #f. Otherwise, the result should be the number multiplied by two.
Two alternative threading macros, and~> and and~>>, are just like their ordinary counterparts, but they will short-circuit to #f if any intermediate values are #f, just like and. By using and~>>, the above expression will work correctly:
> (and~>> '(1 3 5) (findf even?) (* 2))
#f
> (and~>> '(1 4 5) (findf even?) (* 2))
8
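To make the short-circuiting concrete, the sketch below places a hand-written check next to the and~>> version. It assumes the threading package is installed, and both function names are hypothetical.

```racket
#lang racket
(require threading) ; assumes the threading package is installed

;; Hand-written version: name the intermediate result and test it
;; before multiplying.
(define (double-first-even/manual lst)
  (let ([n (findf even? lst)])
    (and n (* 2 n))))

;; The same logic as a short-circuiting pipeline.
(define (double-first-even lst)
  (and~>> lst (findf even?) (* 2)))
```

The two functions should agree on every list of integers; the threaded version avoids naming the intermediate value and scales better as more steps are added.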
1.5 Creating functions that thread
The ordinary threading operations accept a value and immediately thread it through the provided expressions, but sometimes it’s helpful to simply produce a function that represents a threading pipeline, instead. For example, one might create a function to convert a symbol to bytes:
(lambda (x) (~> x symbol->string string->bytes/utf-8))
This use-case is common enough that there is a shorthand syntax, lambda~>.
> (define symbol->bytes (lambda~> symbol->string string->bytes/utf-8))
> (symbol->bytes 'abc)
#"abc"
Arguments can be provided to lambda~>, just like ~>:
> ((lambda~> (+ 3) (* 2)) 5)
16
In addition to lambda~>, there is lambda~>>, as well as lambda-and~> and its counterpart. All of these forms also have shorthand aliases using λ, such as λ~>.
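A brief sketch of two of these variants follows. It assumes the threading package is installed and that the lambda forms accept the _ hole marker in the same way as ~>; the table ages and the function names are invented for the example.

```racket
#lang racket
(require threading) ; assumes the threading package is installed

;; A hypothetical lookup table for the example.
(define ages (hash 'alice 30 'bob 41))

;; hash-ref is given #f as its failure result, so for a missing key the
;; and variant stops before add1 is applied.
(define next-age
  (lambda-and~> (hash-ref ages _ #f) add1))

;; λ~> is the same form as lambda~>, spelled with λ.
(define symbol->bytes
  (λ~> symbol->string string->bytes/utf-8))
```

With these definitions, (next-age 'alice) should produce 31, while (next-age 'carol) should produce #f rather than raising an error from add1.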
Finally, each threading lambda form has an additional counterpart, named with a trailing *, that accepts any number of arguments and threads them through the pipeline as a single list, instead of taking just one argument:
> ((lambda~>>* (map add1) (foldl * 1)) 1 2 3)
24
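One way to read the starred forms is as a rest-argument lambda wrapped around the corresponding threading macro. The sketch below assumes the threading package is installed; sum-of-squares and sum-of-squares/manual are hypothetical names.

```racket
#lang racket
(require threading) ; assumes the threading package is installed

;; The arguments 1 2 3 arrive as the list '(1 2 3), which is then
;; threaded into the last position of each clause.
(define sum-of-squares
  (lambda~>>* (map sqr) (apply +)))

;; Roughly what the form above stands for, written out by hand.
(define sum-of-squares/manual
  (lambda args (~>> args (map sqr) (apply +))))
```

Both functions can be called as (sum-of-squares 1 2 3), which gives 14, matching the foldl example from earlier in the guide.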
2 Reference
syntax
(~> expr clause ...)
clause = bare-id
       | (fn-expr arg-expr ...)
       | (fn-expr pre-expr ... hole-marker post-expr ...)

hole-marker = _
As a special case, clauses of the form 'datum are treated as if they were ('datum) so the threaded value is inserted outside the quote form. This isn’t useful (or harmful) in the Racket language, but it may be useful in other languages with a modified #%app.
Once the initial transformation has been completed, the expr is threaded through the clauses by nesting it within each clause, replacing the hole marker.
> (~> '(1 2 3) (map add1 _) second (* 2))
6
> (~> "foo" string->bytes/utf-8 bytes->list (map (curry * 2) _) list->bytes)
#"\314\336\336"
syntax
(~>> expr clause ...)
clause = bare-id
       | (fn-expr arg-expr ...)
       | (fn-expr pre-expr ... hole-marker post-expr ...)

hole-marker = _
> (~>> '(1 2 3) (map add1) second (* 2))
6
> (~>> "foo" string->bytes/utf-8 bytes->list (map (curry * 2)) list->bytes)
#"\314\336\336"
syntax
(and~> expr clause ...)
syntax
(and~>> expr clause ...)
syntax
(lambda~>>* clause ...)
syntax
(λ~>>* clause ...)
syntax
(lambda-and~> clause ...)
syntax
(λ-and~> clause ...)
syntax
(lambda-and~>> clause ...)
syntax
(λ-and~>> clause ...)
syntax
(lambda-and~>* clause ...)
syntax
(λ-and~>* clause ...)
syntax
(lambda-and~>>* clause ...)
syntax
(λ-and~>>* clause ...)