RSound: A Sound Engine for Racket
If it doesn’t work on your machine, please try running (diagnose-sound-playing), and tell me about it!
A note about volume: be careful not to damage your hearing, please. To take a simple example, the sine-wave function generates a sine wave with amplitude 1.0. That translates into the loudest possible sine wave that can be represented. So please set your volume low, and be careful with the headphones. Maybe there should be a parameter that controls the clipping volume. Hmm.
1 Sound Control
These procedures start and stop playing sounds.
2 Stream-based Playing
RSound provides a "pstream" abstraction which falls conceptually in between play and signal-play. In particular, a pstream encapsulates an ongoing signal, with primitives available to queue sounds for playback, to check the signal’s "current time" (in frames), and to queue a callback to occur at a particular time.
This mechanism has two advantages over play. First, it allows you to queue sounds for a particular frame, avoiding hiccups in playback. Second, it uses only a single portaudio stream, rather than the multiple portaudio streams that would result from multiple calls to play.
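Queuing for a particular frame mostly comes down to converting musical time into frame numbers. Here is a minimal sketch of that arithmetic; beat->frame is a hypothetical helper, not part of RSound, and ps in the closing comment is assumed to be a pstream you have already made.

```racket
#lang racket/base

;; Hypothetical helper (not part of RSound): convert a beat index at a
;; given tempo into an absolute frame number, counted from start-frame.
(define (beat->frame start-frame beat bpm frame-rate)
  (+ start-frame
     (inexact->exact (round (* beat (/ 60 bpm) frame-rate)))))

;; Frame numbers for the first four beats at 120 BPM, 44100 frames per second.
(define beat-frames
  (for/list ([beat (in-range 4)])
    (beat->frame 0 beat 120 44100)))

;; With RSound loaded, each frame could then be handed to pstream-queue:
;; (for ([f (in-list beat-frames)]) (pstream-queue ps ding f))
```

In practice you would probably pass (pstream-current-frame ps) plus a small margin as the start frame, so that the first sound is not scheduled in the past.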
procedure
(make-pstream [#:buffer-time buffer-time]) → pstream?
buffer-time : (or/c number? #f) = #f
procedure
(pstream-queue pstream rsound frames) → pstream?
pstream : pstream? rsound : rsound? frames : natural?
procedure
(pstream-current-frame pstream) → natural?
pstream : pstream?
procedure
(pstream-play pstream rsound) → pstream?
pstream : pstream? rsound : rsound?
procedure
(pstream-queue-callback pstream callback frames) → pstream?
pstream : pstream? callback : procedure? frames : natural?
It’s perhaps worth noting that the callbacks are triggered by semaphore posts, to avoid the possibility of a callback stalling playback. This can mean that the callback is delayed by a few milliseconds.
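To illustrate why a semaphore post keeps playback safe, here is a toy model in plain Racket. It is not RSound's code; the names trigger, result, and callback-thread are made up for the sketch. The time-critical side only posts, and a separate thread does the actual work.

```racket
#lang racket/base

;; A toy model of the pattern, not RSound's code: the "audio" side only
;; posts to a semaphore; a separate thread waits on it and runs the callback.
(define trigger (make-semaphore 0))
(define result (box #f))

(define callback-thread
  (thread
   (lambda ()
     (semaphore-wait trigger)            ; block until the post arrives
     (set-box! result 'callback-ran))))  ; the "callback" body

;; The time-critical side: posting returns immediately, so a slow
;; callback cannot stall it.
(semaphore-post trigger)

;; Wait for the callback thread to finish before inspecting the result.
(thread-wait callback-thread)
```

The price of this decoupling is the scheduling delay between the post and the moment the waiting thread actually runs.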
procedure
(pstream-set-volume! pstream volume) → pstream?
pstream : pstream? volume : real?
3 Recording
RSound now includes basic support for recording sounds.
procedure
(record-sound frames) → rsound?
frames : nat?
4 File I/O
These procedures read and write rsounds from/to disk.
The RSound library reads and writes WAV files only; this means fewer FFI dependencies (the reading & writing is done in Racket), and works on all platforms.
procedure
path : path-string?
It currently has lots of restrictions (it insists on 16-bit PCM encoding, for instance), but deals with a number of common bizarre conventions that certain WAV files have (PAD chunks, extra blank bytes at the end of the fmt chunk, etc.), and tries to fail relatively gracefully on files it can’t handle.
Reading in a large sound can result in a very large value (~10 Megabytes per minute); for larger sounds, consider reading in only a part of the file, using rs-read/clip.
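The 10-megabyte figure can be checked with a little arithmetic, assuming 16-bit stereo samples at 44100 frames per second. The helper name sound-bytes is made up for this sketch.

```racket
#lang racket/base

;; Size of uncompressed 16-bit stereo audio, in bytes.
(define BYTES-PER-SAMPLE 2)   ; 16-bit PCM
(define CHANNELS 2)           ; stereo: left and right

;; Hypothetical helper: bytes needed to hold the given number of frames.
(define (sound-bytes frames)
  (* frames CHANNELS BYTES-PER-SAMPLE))

;; One minute at 44100 frames per second: 10584000 bytes,
;; a little over 10 megabytes.
(define one-minute-bytes (sound-bytes (* 60 44100)))
```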
procedure
(rs-read/clip path start finish) → rsound?
path : path-string? start : nonnegative-integer? finish : nonnegative-integer?
It currently has lots of restrictions (it insists on 16-bit PCM encoding, for instance), but deals with a number of common bizarre conventions that certain WAV files have (PAD chunks, extra blank bytes at the end of the fmt chunk, etc.), and tries to fail relatively gracefully on files it can’t handle.
procedure
(rs-read-frames path) → nonnegative-integer?
path : path-string?
The file must be encoded as a WAV file readable with rs-read.
procedure
(rs-read-sample-rate path) → positive-number?
path : path-string?
The file must be encoded as a WAV file readable with rs-read.
procedure
rsound : rsound? path : path-string?
5 Rsound Manipulation
These procedures allow the creation, analysis, and manipulation of rsounds.
struct
(struct rsound (data start end frame-rate) #:extra-constructor-name make-rsound)
data : s16vector? start : nonnegative-number? end : nonnegative-number? frame-rate : nonnegative-number?
value
FRAME-RATE : nonnegative-integer?
Note for people not using the beginning student language: this constant is provided because the default-sample-rate parameter isn’t usable in beginning student language.
parameter
(default-sample-rate) → positive-real?
(default-sample-rate frame-rate) → void? frame-rate : positive-real?
= 44100
Note that the terms sample rate and frame rate are used interchangeably. The term "frame rate" is arguably more correct, because one second of stereo sound at a frame rate of 44100 actually has 88200 samples: 44100 for the left channel and 44100 for the right.
This procedure is necessary because s16vectors don’t natively support equal?.
procedure
(rs-ith/left rsound frame) → nonnegative-integer?
rsound : rsound? frame : nonnegative-integer?
procedure
(rs-ith/right rsound frame) → nonnegative-integer?
rsound : rsound? frame : nonnegative-integer?
procedure
rsound : rsound? start : nonnegative-integer? finish : nonnegative-integer?
procedure
(rs-append* rsounds) → rsound?
rsounds : (listof rsound?)
procedure
(rs-overlay rsound-1 rsound-2) → rsound?
rsound-1 : rsound? rsound-2 : rsound?
procedure
(rs-overlay* rsounds) → rsound?
rsounds : (listof rsound?)
So, suppose we have two rsounds: one called ’a’, of length 20000, and one called ’b’, of length 10000. Evaluating
(assemble (list (list a 5000) (list b 0) (list b 11000)))
... would produce a sound of 25000 frames (’a’ starts at frame 5000 and runs for 20000 frames), where each instance of ’b’ overlaps with the central instance of ’a’.
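For intuition, here is a toy model of offset-based mixing on plain vectors. It is not RSound's implementation, and assemble-vectors is a made-up name. It assumes that the result is exactly long enough to hold every placed sound and that overlapping samples are summed.

```racket
#lang racket/base

;; Toy model of offset-based mixing on plain vectors (not RSound's code).
;; Each entry is (list samples offset); overlapping samples are summed.
(define (assemble-vectors entries)
  ;; The result must reach the end of the latest-ending entry.
  (define total-length
    (apply max 0
           (for/list ([e (in-list entries)])
             (+ (cadr e) (vector-length (car e))))))
  (define out (make-vector total-length 0))
  (for ([e (in-list entries)])
    (define samples (car e))
    (define offset (cadr e))
    (for ([s (in-vector samples)]
          [i (in-naturals offset)])
      (vector-set! out i (+ (vector-ref out i) s))))
  out)

;; A four-sample sound at offset 1, overlapped by two copies of a
;; two-sample sound at offsets 0 and 3.
(define mixed
  (assemble-vectors (list (list (vector 1 1 1 1) 1)
                          (list (vector 10 10) 0)
                          (list (vector 10 10) 3))))
```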
procedure
length : frames? mapping-fun : procedure? rsound : rsound?
Samples are chosen using rounding; there is no interpolation done.
procedure
(resample/interp factor sound) → rsound
factor : positive-real? sound : rsound?
My tests of 2014-09-22 suggest that interpolating takes about twice as long. In command-line racket, this amounts to a jump from 1.7% CPU usage to 3.0% CPU usage.
procedure
(resample-to-rate frame-rate sound) → rsound
frame-rate : frame-rate? sound : rsound?
Put differently, the sounds that result from (resample/interp 2.0 ding) and (resample-to-rate 22050 ding) should contain exactly the same set of samples, but the first will have a frame rate of 44100, and the second a frame rate of 22050.
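The difference is easier to see on a tiny example. The sketch below is a toy nearest-sample resampler on a plain vector, not RSound's code; resample-nearest and duration are made-up names, and it assumes that a factor of 2 means "keep every second sample".

```racket
#lang racket/base

;; Toy nearest-sample resampler on a plain vector (not RSound's code).
;; A factor of 2 keeps every second sample, halving the frame count.
(define (resample-nearest factor samples)
  (define n (vector-length samples))
  (define new-n (inexact->exact (floor (/ n factor))))
  (for/vector #:length new-n ([i (in-range new-n)])
    (define src (inexact->exact (round (* i factor))))
    (vector-ref samples (min src (sub1 n)))))

(define original (vector 0 1 2 3 4 5 6 7))
(define halved (resample-nearest 2.0 original))

;; Duration in seconds = frames / frame rate.
(define (duration frames frame-rate) (/ frames frame-rate))

;; The same four samples, interpreted at two different frame rates:
(define at-44100 (duration (vector-length halved) 44100)) ; half the original duration
(define at-22050 (duration (vector-length halved) 22050)) ; the original duration
```

So the samples are the same in both cases; only the frame rate attached to them decides whether the result plays faster or at the original speed.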
procedure
(build-sound frames generator) → rsound?
frames : frames? generator : procedure?
More specifically, the samples in the sound are generated by calling the procedure with each frame number in the range [0 .. frames-1]. The procedure must return real numbers between -1 and 1. The left and right channels will be identical.
Here’s an example that generates a simple sine-wave (you could also use make-tone for this).
(define VOLUME 0.1)
(define FREQUENCY 430)
(define (sine-tone f)
  (* VOLUME (sin (* 2 pi FREQUENCY (/ f FRAME-RATE)))))
(build-sound (* 2 FRAME-RATE) sine-tone)
procedure
(vec->rsound s16vec frame-rate) → rsound?
s16vec : s16vector? frame-rate : frame-rate?
6 Signals and Networks
For signal processing, RSound adopts a dataflow-like paradigm, where elements may be joined together to form a directed acyclic graph, which is itself an element that can be joined together, and so forth. So, for instance, you might have a sine wave generator connected to the amplitude input of another sine wave generator, with the result passed through a distortion filter. Each node accepts a stream of inputs, and produces a stream of outputs. I will use the term node to refer to both the primitive elements and the compound elements.
The most basic form of node is simply a procedure. It takes inputs, and produces outputs. In addition, the network form provides support for nodes that are stateful and require initialization.
A node that requires no inputs is called a signal.
Signals can be played directly, with signal-play. They may also be converted to rsounds, using signal->rsound or signals->rsound.
A node that takes one input is called a filter.
syntax
(network (in ...) network-clause ...)
in = identifier
network-clause = [node-label = expression]
               | [node-label <= network expression ...]
               | [(node-label ...) = expression]
               | [(node-label ...) <= network expression ...]
node-label = identifier
There are two kinds of clause. A clause that uses = simply gives the name to the result of evaluating the right-hand-side expression. A clause that uses <= evaluates the input expressions, and uses them as inputs to the given network.
The special (prev node-label init-val) form may be used to refer to the previous value of the corresponding node. It’s fine to have “forward” references to clauses that haven’t been evaluated yet. The init-val value is used as the previous value the first time the network is used.
The final clause’s node is used as the output of the network.
The network form is useful because it manages the initialization of stateful networks, and allows reference to previous outputs.
Here’s a trivial signal:
(lambda () 3)
Here’s the same signal, written using network:
(network () [out = 3])
This is the signal that always produces 3.
Here’s another one, that counts upward:
(define counter/sig (network () [counter = (+ 1 (prev counter 0))]))
The prev form is special, and is used to refer to the prior value of the signal component.
Note that since we’re adding one immediately, this counter starts at 1.
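If the prev form seems mysterious, it may help to see the same counter written as a plain closure. This only mimics the behavior; it is not how network is implemented, and make-counter is a made-up name.

```racket
#lang racket/base

;; A plain-closure model of [counter = (+ 1 (prev counter 0))].
;; This mimics the behavior only; it is not how network is implemented.
(define (make-counter)
  (define prev-value 0)            ; plays the role of init-val
  (lambda ()
    (define next (+ 1 prev-value))
    (set! prev-value next)         ; becomes "prev" on the next call
    next))

(define counter (make-counter))

;; The first call already adds one to the initial 0, so counting starts at 1.
(define first-three (list (counter) (counter) (counter)))
```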
Here’s another example, that adds together two sine waves, at 34 Hz and 46 Hz, assuming a sample rate of 44.1 kHz:
(define sum-of-sines
  (network ()
    [a <= sine-wave 34]
    [b <= sine-wave 46]
    [out = (+ a b)]))
In order to use a signal with signal-play, it should produce a real number in the range -1.0 to 1.0.
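This matters for sums: two sine waves of amplitude 1.0 can add up to nearly 2. The sketch below checks this for the 34 Hz and 46 Hz pair by computing the samples directly in plain Racket (no RSound needed); sum-at, peak, and scaled-at are names made up for the example.

```racket
#lang racket

;; Sample value of the 34 Hz + 46 Hz sum at frame f, computed directly,
;; assuming 44100 frames per second.
(define (sum-at f)
  (+ (sin (* 2 pi 34 (/ f 44100)))
     (sin (* 2 pi 46 (/ f 44100)))))

;; Largest absolute value over one second of frames.
(define peak
  (for/fold ([m 0.0]) ([f (in-range 44100)])
    (max m (abs (sum-at f)))))

;; Halving the sum keeps every sample within -1.0 to 1.0.
(define (scaled-at f) (* 0.5 (sum-at f)))
```

In a network, the corresponding change would be an output clause along the lines of [out = (* 0.5 (+ a b))].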
Here’s an example that uses one sine-wave (often called an "LFO") to control the pitch of another one:
(define vibrato-tone
  (network ()
    [lfo <= sine-wave 2]
    [sin <= sine-wave (+ 400 (* 50 lfo))]
    [out = (* 0.1 sin)]))
(signal-play vibrato-tone)
(sleep 5)
(stop)
There are many built-in signals. Note that these are documented as though they were procedures, but they’re not; they can be used in a procedure-like way in network clauses. Otherwise, they will behave as opaque values; you can pass them to various signal functions, etc.
Also note that all of these assume a fixed sample rate of 44.1 kHz.
syntax
(prev node-label init-val)
node-label = identifier init-val = expression
signal
(sine-wave frequency) → real?
frequency : nonnegative-number?
signal
(sawtooth-wave frequency) → real?
frequency : nonnegative-number?
signal
(square-wave frequency) → real?
frequency : nonnegative-number?
Also note that since this is a simple 1/-1 square wave, it’s got horrible aliasing all over the spectrum.
signal
(pulse-wave duty-cycle frequency) → real?
duty-cycle : real? frequency : nonnegative-number?
signal
(dc-signal amplitude) → real?
amplitude : real?
The following are functions that return signals.
procedure
(simple-ctr init skip) → signal?
init : real? skip : real?
procedure
(loop-ctr/variable len) → signal?
len : real?
In order to listen to them, you can transform them into rsounds, or play them directly:
procedure
(signal->rsound frames signal) → rsound?
frames : nonnegative-integer? signal : signal?
Here’s an example of using it:
(define sig1
  (network ()
    [a <= sine-wave 560]
    [out = (* 0.1 a)]))
(define r (signal->rsound 44100 sig1))
(play r)
procedure
(signals->rsound frames left-sig right-sig) → rsound?
frames : nonnegative-integer? left-sig : signal? right-sig : signal?
procedure
(signal-play signal) → void?
signal : signal?
There are several functions that produce signals.
procedure
(indexed-signal time->amplitude) → signal?
time->amplitude : procedure?
There are also a number of functions that combine existing signals, called "signal combinators":
We can turn an rsound back into a signal, using rsound->signal/left or rsound->signal/right:
procedure
(rsound->signal/left rsound) → signal?
rsound : rsound?
procedure
(rsound->signal/right rsound) → signal?
rsound : rsound?
procedure
(thresh/signal threshold signal) → signal?
threshold : real-number? signal : signal?
procedure
(clip&volume volume signal) → signal?
volume : real-number? signal : signal?
Where should these go?
procedure
(thresh threshold input) → real-number?
threshold : real-number? input : real-number?
Finally, here’s a predicate. This could be a full-on contract, but I’m afraid of the overhead.
procedure
(signal? maybe-signal) → boolean?
maybe-signal : any/c
procedure
(filter? maybe-filter) → boolean?
maybe-filter : any/c
6.1 Signal/Blocks
The signal/block interface can speed up sound generation, by allowing a signal to generate a block of samples at once. This is particularly valuable when it is possible for signals to use c-level primitives to copy blocks of samples.
UNFINISHED:
procedure
(signal/block-play signal/block sample-rate #:buffer-time buffer-time) → any
signal/block : signal/block/unsafe? sample-rate : positive-integer? buffer-time : (or/c nonnegative-number #f)
7 Visualizing Rsounds
(require rsound/draw) | package: rsound
procedure
(rs-draw rsound #:title title [#:width width #:height height]) → void?
rsound : rsound? title : string? width : nonnegative-integer? = 800 height : nonnegative-integer? = 200
procedure
(rsound-fft-draw rsound #:zoom-freq zoom-freq #:title title [#:width width #:height height]) → void?
rsound : rsound? zoom-freq : nonnegative-real? title : string? width : nonnegative-integer? = 800 height : nonnegative-integer? = 200
procedure
(rsound/left-1-fft-draw rsound #:title title #:width width #:height height) → void?
rsound : rsound? title : string? width : 800 height : 200
procedure
(vector-pair-draw/magnitude left right #:title title [#:width width #:height height]) → void?
left : (fcarrayof complex?) right : (vectorof complex?) title : string? width : nonnegative-integer? = 800 height : nonnegative-integer? = 200
procedure
(vector-draw/real/imag vec #:title title [#:width width #:height height]) → void?
vec : (fcarrayof complex?) title : string? width : nonnegative-integer? = 800 height : nonnegative-integer? = 200
8 RSound Utilities
procedure
(make-harm3tone frequency volume? frames frame-rate) → rsound?
frequency : nonnegative-number? volume? : nonnegative-number? frames : nonnegative-integer? frame-rate : nonnegative-number?
procedure
pitch : nonnegative-number? volume : nonnegative-number? duration : nonnegative-exact-integer?
procedure
(rs-fft/left rsound) → (fcarrayof complex?)
rsound : rsound?
Changed in version 20151120.0 of package rsound: Was named rsound-fft/left.
procedure
(rs-fft/right rsound) → (fcarrayof complex?)
rsound : rsound?
Changed in version 20151120.0 of package rsound: Was named rsound-fft/right.
procedure
(midi-note-num->pitch note-num) → number?
note-num : nonnegative-integer?
procedure
(pitch->midi-note-num pitch) → nonnegative-real?
pitch : nonnegative-real?
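For reference, the usual equal-temperament mapping between MIDI note numbers and frequencies is sketched below. That RSound follows exactly this convention (A4 = note 69 = 440 Hz) is an assumption on my part; note->hz and hz->note are made-up names, chosen so as not to clash with the library's own.

```racket
#lang racket/base

;; Standard equal-temperament mapping (an assumption about RSound's
;; behavior): MIDI note 69 is A4 at 440 Hz, and each semitone multiplies
;; the frequency by 2^(1/12).
(define (note->hz note-num)
  (* 440 (expt 2 (/ (- note-num 69) 12))))

;; The inverse: 12 semitones per doubling of frequency. The result is
;; generally not an integer, since most pitches fall between notes.
(define (hz->note pitch)
  (+ 69 (* 12 (/ (log (/ pitch 440)) (log 2)))))
```

For example, (note->hz 60) gives roughly 261.63 Hz, which is middle C under this convention.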
9 Piano Tones
(require rsound/piano-tones) | package: rsound
10 Envelopes
(require rsound/envelope) | package: rsound
procedure
(sine-window len fade-in) → rsound?
len : frames? fade-in : frames?
procedure
(hann-window len) → rsound?
len : frames?
11 Frequency Response
(require rsound/frequency-response) | package: rsound
procedure
(response-plot poly dbrel min-freq max-freq) → void?
poly : procedure? dbrel : real? min-freq : real? max-freq : real?
procedure
(poles&zeros->fun poles zeros) → procedure?
poles : (listof real?) zeros : (listof real?)
(response-plot (poles&zeros->fun '(0.5 0.5+0.5i 0.5-0.5i) '(0+1i 0-1i)) 40 0 22050)
12 Filtering
RSound provides a dynamic low-pass filter, among other things.
procedure
(fir-filter delay-lines) → network?
delay-lines : (listof (list/c nonnegative-exact-integer? real-number?))
So, for instance,
(fir-filter (list (list 13 0.4) (list 4 0.1)))
...would produce a filter that added the current frame to 4/10 of the input frame 13 frames ago and 1/10 of the input frame 4 frames ago.
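Here is a toy model of that behavior on a plain vector of samples. It is not RSound's code, and fir-apply is a made-up name; it treats input before frame 0 as silence. Feeding it a single impulse makes the delay lines visible.

```racket
#lang racket/base

;; Toy FIR on a vector of input samples (not RSound's code).
;; taps is a list of (list delay gain);
;; out[n] = in[n] + sum of gain * in[n - delay].
(define (fir-apply taps input)
  (define n-frames (vector-length input))
  (for/vector #:length n-frames ([n (in-range n-frames)])
    (for/fold ([acc (vector-ref input n)]) ([tap (in-list taps)])
      (define d (car tap))
      (define gain (cadr tap))
      (if (>= n d)
          (+ acc (* gain (vector-ref input (- n d))))
          acc))))   ; input before frame 0 counts as silence

;; Impulse response for the taps from the text: 1 at frame 0,
;; 0.1 at frame 4, 0.4 at frame 13, and 0 elsewhere.
(define impulse (make-vector 16 0))
(vector-set! impulse 0 1)
(define response (fir-apply '((13 0.4) (4 0.1)) impulse))
```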
procedure
(iir-filter delay-lines) → network?
delay-lines : (listof (list/c nonnegative-exact-integer? real-number?))
So, for instance,
(iir-filter (list (list 13 0.4) (list 4 0.1)))
...would produce a filter that added the current frame to 4/10 of the output frame 13 frames ago and 1/10 of the output frame 4 frames ago.
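The feedback version can be modeled the same way. Again, this is a toy on plain vectors rather than RSound's code, and iir-apply is a made-up name. The only change from a feed-forward filter is that the delayed values are read from the output.

```racket
#lang racket/base

;; Toy feedback filter (not RSound's code):
;; out[n] = in[n] + sum of gain * out[n - delay].
(define (iir-apply taps input)
  (define n-frames (vector-length input))
  (define out (make-vector n-frames 0))
  (for ([n (in-range n-frames)])
    (vector-set!
     out n
     (for/fold ([acc (vector-ref input n)]) ([tap (in-list taps)])
       (define d (car tap))
       (define gain (cadr tap))
       (if (>= n d)
           (+ acc (* gain (vector-ref out (- n d))))   ; reads earlier output
           acc))))
  out)

;; With one tap (delay 2, gain 0.5), an impulse keeps echoing,
;; halving each time: 1, 0, 0.5, 0, 0.25, 0, 0.125.
(define impulse (vector 1 0 0 0 0 0 0))
(define response (iir-apply '((2 0.5)) impulse))
```

Because the echoes never stop, gains close to 1 make such a filter ring for a long time, which is what the comb-filter example relies on.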
Here’s an example of code that uses a simple comb filter to extract a 3-second buzzing sound at 300 Hz from noise:
(define comb-level 0.99)
(play
 (signal->rsound
  (* 44100 3)
  (network ()
    [r = (random)]     ; a random number from 0 to 1
    [r2 = (* r 0.1)]   ; scaled to make it less noisy
    ; apply the comb filter:
    [o2 <= (iir-filter (list (list 147 comb-level))) r2]
    ; compensate for the filter's gain:
    [out = (* (- 1 comb-level) o2)])))
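Two numbers in that example deserve a quick check: why a delay of 147 frames gives 300 Hz, and why multiplying by (- 1 comb-level) compensates for the gain. The sketch below works both out, assuming 44100 frames per second; feedback-gain and partial-gain are names made up for the illustration.

```racket
#lang racket/base

;; Resonant frequency of a comb filter with a 147-frame delay,
;; at 44100 frames per second.
(define comb-frequency (/ 44100 147))   ; exactly 300

;; At resonance each echo adds in phase, so the steady-state gain for
;; feedback g is 1 + g + g^2 + ... = 1/(1 - g).
(define (feedback-gain g) (/ 1 (- 1 g)))

;; Partial sums of the geometric series approach that limit.
(define (partial-gain g terms)
  (for/sum ([k (in-range terms)]) (expt g k)))
```

With a feedback of 0.99 the gain at resonance is about 100, so scaling by 0.01 brings the resonant component back to roughly its input level.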
signal
(lpf/dynamic control input) → signal?
control : number? input : number?
(signal->rsound
 88200
 (network ()
   [f <= (simple-ctr 0 1)]
   [sawtooth = (/ (modulo f 220) 220)]
   [control = (+ 0.5 (* 0.2 (sin (* f 7.123792865282977e-05))))]
   [out <= lpf/dynamic control sawtooth]))
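The constants in that example are easier to read once decoded. As far as I can tell, the long decimal is 2π · 0.5 / 44100, i.e. a 0.5 Hz sine when f counts frames at 44100 per second, and the modulo of 220 frames gives a sawtooth of about 200 Hz. The names below are made up for the check.

```racket
#lang racket

;; The control constant matches 2*pi*0.5/44100: a 0.5 Hz sine,
;; assuming 44100 frames per second.
(define control-step (/ (* 2 pi 0.5) 44100))

;; The sawtooth repeats every 220 frames, i.e. about 200.45 Hz.
(define sawtooth-frequency (/ 44100.0 220))

;; The control value therefore sweeps slowly within 0.5 +/- 0.2.
(define (control-at f) (+ 0.5 (* 0.2 (sin (* f control-step)))))
```

Over the 88200 frames of the example, the control completes exactly one slow sweep up and down.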
13 Single-cycle sounds
(require rsound/single-cycle) | package: rsound
procedure
(synth-note family spec midi-note-number duration) → rsound
family : string? spec : number-or-path? midi-note-number : natural? duration : natural?
(synth-note "vgame" 49 60 22010)
procedure
(synth-note/raw family spec midi-note-number duration) → rsound
family : string? spec : number-or-path? midi-note-number : natural? duration : natural?
procedure
(synth-waveform family spec) → rsound
family : string? spec : number-or-path?
14 Helper Functions
15 Configuration
procedure
procedure
(all-host-apis) → (listof symbol?)
procedure
(set-host-api! api) → void?
api : (or/c false? string?)
procedure
procedure
(set-output-device! index) → void?
index : (or/c false? natural?)
16 Fsounds
(require rsound/fsound) | package: rsound
procedure
rs : rsound?
procedure
(fsound->rsound fs) → rsound?
fs : fsound?
procedure
fs : (vectorof real?) sample-rate : exact-positive-integer?
17 Sample Code
An example of a signal that plays two lines, each with randomly changing square-wave tones. This one runs in the Intermediate student language:
(require rsound)

; scrobble: number number number -> signal
; return a signal that generates square-wave tones, changing
; at the given interval into a new randomly-chosen frequency
; between lo-f and hi-f
(define (scrobble change-interval lo-f hi-f)
  (local
    [(define freq-range (floor (- hi-f lo-f)))
     (define (maybe-change f l)
       (cond [(= l 0) (+ lo-f (random freq-range))]
             [else f]))]
    (network ()
      [looper <= (loop-ctr change-interval 1)]
      [freq = (maybe-change (prev freq 400) looper)]
      [a <= square-wave freq])))

(define my-signal
  (network ()
    [a <= (scrobble 4000 200 600)]
    [b <= (scrobble 40000 100 200)]
    [lpf-wave <= sine-wave 0.1]
    [c <= lpf/dynamic (max 0.01 (abs (* 0.5 lpf-wave))) (+ a b)]
    [out = (* c 0.1)]))

; write 20 seconds to a file, if uncommented:
; (rs-write (signal->rsound (* 20 44100) my-signal) "/tmp/foo.wav")

; play the signal
(signal-play my-signal)
An example of a signal that plays from one of the single-cycle vgame tones:
#lang racket
(require rsound)

(define waveform (synth-waveform "vgame" 4))

; wrap i around when it goes off the end:
(define (maybe-wrap i)
  (cond [(< i 44100) i]
        [else (- i 44100)]))

; a signal that plays from a waveform:
(define loop-sig
  (network (pitch)
    [i = (maybe-wrap (+ (prev i 0) (round pitch)))]
    [out = (rs-ith/left waveform i)]))

(signal-play
 (network ()
   [alternator <= square-wave 2]
   [s <= loop-sig (+ (* 200 (inexact->exact alternator)) 400)]
   [out = (* s 0.1)]))
18 Reporting Bugs
For Heaven’s sake, report lots of bugs!