mboxrd-read

 (require mboxrd-read) package: mboxrd-read
This package parses mboxrd files, also known as "normal UNIX mbox files", into lazy lists of messages.
Oh. Um, actually, it now parses mboxcl2 files, as well. Scope creep, I know.

procedure

(mboxrd-parse path) → (stream/c (list/c bytes? bytes?))

  path : path?
Given a path to an mbox file, returns a stream of the messages in the file. Each message is represented as a list containing a byte string for the header and a byte string for the body. Appending the two yields the original message, except that every \n in the original is replaced by \r\n to match the RFC 2822 format.
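As a rough usage sketch (not taken from the package itself), the stream can be walked with the usual racket/stream tools. The helper name first-message-sizes is hypothetical; the string-to-path conversion is there because the contract above asks for path? rather than path-string?.

```racket
#lang racket/base
(require racket/stream
         mboxrd-read)

;; Hypothetical helper: report (header-size . body-size) for the
;; first n messages of an mboxrd file.
(define (first-message-sizes path n)
  ;; mboxrd-parse is documented as taking a path?, so convert strings.
  (define p (if (string? path) (string->path path) path))
  (for/list ([_ (in-range n)]
             [msg (in-stream (mboxrd-parse p))])
    (define header (car msg))   ; header byte string
    (define body (cadr msg))    ; body byte string
    (cons (bytes-length header) (bytes-length body))))
```

Because the result is a stream, messages past the ones the loop asks for should not need to be held in memory at once.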

procedure

(mboxrd-parse/port port) → (stream/c (list/c bytes? bytes?))

  port : input-port?
Given an input port, returns a lazy list of the messages in the port.

NB: this procedure assumes that it’s the only one reading the port. Bad stuff will happen if it’s not; it doesn’t leave the "From " of the next message on the port.

EFFECT: reads from the port, and closes it when peek-char returns #<eof>
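One way to respect the "only reader" rule is to open the port, hand it straight to the parser, and never touch it again. The sketch below assumes exactly that; count-messages is a hypothetical name, and walking the whole stream is what lets the parser reach eof and close the port, per the EFFECT note.

```racket
#lang racket/base
(require racket/stream
         mboxrd-read)

;; Hypothetical helper: count the messages in an mbox file by
;; handing a freshly opened port to the parser.
(define (count-messages path)
  (define in (open-input-file path))
  ;; From here on the parser owns `in`; nothing else reads from it.
  (define messages (mboxrd-parse/port in))
  ;; stream-length forces the entire stream, so the parser hits eof.
  (stream-length messages))
```

If you stop partway through the stream, the port presumably stays open, since eof is never seen.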

1 mboxcl2

Well, it turns out that Dovecot actually uses mboxcl2. Ah well. In fact, mboxcl2 looks like a bit of a win: because it uses Content-Length to locate the next header, it should be possible to parse faster by setting the file position rather than scanning those hideously long base64 body strings for the next line starting with "From ". The downside is that since the body strings aren’t read eagerly, closing the file port is a separate operation that you’re responsible for.
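To make the Content-Length idea concrete, here is a small self-contained sketch of the technique. It is not this package's implementation; the helper names and the sample data are made up, and it assumes LF line endings, a Content-Length header on every message, and one blank line after each body.

```racket
#lang racket/base

;; Read header lines up to (and consuming) the blank line.
(define (read-header-lines in)
  (let loop ([acc '()])
    (define line (read-bytes-line in 'linefeed))
    (if (or (eof-object? line) (bytes=? line #""))
        (reverse acc)
        (loop (cons line acc)))))

;; Find the Content-Length value among the header lines, or #f.
(define (content-length lines)
  (for/or ([l (in-list lines)])
    (define m (regexp-match #rx#"^(?i:content-length): *([0-9]+)" l))
    (and m (string->number (bytes->string/latin-1 (cadr m))))))

;; Build an index of (header-lines body-start body-length) entries,
;; jumping over each body instead of scanning it.
(define (index-messages in)
  (let loop ([acc '()])
    (define lines (read-header-lines in))
    (cond
      [(null? lines) (reverse acc)]
      [else
       (define len (or (content-length lines)
                       (error 'index-messages "no Content-Length header")))
       (define start (file-position in))
       (file-position in (+ start len))  ; skip the body in one step
       (read-bytes-line in 'linefeed)    ; consume the blank separator
       (loop (cons (list lines start len) acc))])))

;; Made-up sample: the second body contains a line starting with "From ".
(define sample
  (bytes-append #"From a@example.com Mon Jan  1 00:00:00 2024\n"
                #"Subject: one\n"
                #"Content-Length: 6\n"
                #"\n"
                #"hello\n"
                #"\n"
                #"From b@example.com Tue Jan  2 00:00:00 2024\n"
                #"Subject: two\n"
                #"Content-Length: 11\n"
                #"\n"
                #"From me\nhi\n"
                #"\n"))

(define in (open-input-bytes sample))
(define index (index-messages in))

;; Fetch a body on demand by seeking back to its recorded position.
(define (body-ref entry)
  (file-position in (cadr entry))
  (read-bytes (caddr entry) in))
```

Note that body-ref only works while the port is still open, which is the same trade-off described above.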

procedure

(mboxcl2-parse path) → (-> void?)
                       (stream/c (list/c bytes? (-> bytes?)))

  path : path-string?
Given a path to an mbox file, returns two values: a closer function that closes the input port associated with the file, and a lazy list of lists, each containing a header byte string and a thunk that returns the body bytes.

Please note that the header gets rfc822-style newlines, but the body does not.

Note that after the closer function is called, it’s not possible to extend the lazy list *or* to extract bodies.
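A sketch of one way to keep those rules straight: do all the work that needs bodies first, then call the closer. This assumes mboxcl2-parse is exported by the same mboxrd-read module and returns the closer as its first value, as the signature above suggests; total-body-bytes is a hypothetical name.

```racket
#lang racket/base
(require racket/stream
         mboxrd-read)

;; Hypothetical helper: sum the body sizes of every message, then
;; close the underlying port.
(define (total-body-bytes path)
  (define-values (close-port messages) (mboxcl2-parse path))
  (dynamic-wind
   void
   (lambda ()
     (for/sum ([msg (in-stream messages)])
       (define body-thunk (cadr msg))
       ;; Call the thunk while the port is still open.
       (bytes-length (body-thunk))))
   ;; The closer runs even if something above raises an exception.
   close-port))
```

Anything you want to keep from a body has to be extracted inside the middle thunk; once close-port has run, the remaining thunks and the rest of the stream are no longer usable.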

Additionally, you can use the utilities (e.g. extract-field) in net/head to process the header.
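For instance, extract-field accepts byte strings as long as the field name and the header are both bytes. The header below is made-up sample data with CRLF line endings; whether a parsed header also includes the leading "From " line is not specified here, so treat this as a sketch.

```racket
#lang racket/base
(require net/head)

;; Made-up header in the CRLF style described above.
(define sample-header
  (bytes-append #"Subject: lunch on Friday?\r\n"
                #"From: Alice <alice@example.com>\r\n"
                #"\r\n"))

;; Field names are matched case-insensitively and given without the colon.
(define subject (extract-field #"Subject" sample-header))

;; A field that is absent yields #f rather than an error.
(define missing (extract-field #"X-Nope" sample-header))
```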

Let me know of any bugs.

John Clements