NAME
*MIME::Mini* - Minimal code to parse/create mbox files and mail messages
SYNOPSIS
use MIME::Mini ':all';
or:
use MIME::Mini qw(
formail mail2str mail2multipart mail2singlepart mail2mbox
insert_header append_header replace_header delete_header
insert_part append_part replace_part delete_part
header headers header_names
param mimetype encoding filename
body message parts
newparam newmail
);
# Parse mbox file, doing something with each mail message
formail(sub { <> }, sub { my $mail = shift; ...; })
# Create an email with text/plain, image/png, and message/rfc822 attachments
my $mail = newmail
(
To => 'you@there.com',
From => 'me@here.com',
Subject => 'test',
parts => [
newmail(body => "hi\n"),
newmail(body => $png, type => 'image/png', filename => 'hi.png'),
newmail(message => newmail(qw(To to@you From from@me body hi")))
]
);
print mail2str($mail);
DESCRIPTION
*MIME::Mini* is a collection of functions that parse and produce mailbox
files and individual mail messages. It started out as *minimail*, a
non-module cut-and-paste version, intended to be compact enough to cut
and paste directly into perl scripts that don't want to require
non-standard perl modules. *MIME::Mini* is for people that prefer a CPAN
module.
It is intended to be yet another alternative to *MIME-tools*.
*MIME-tools* does things that this code doesn't (such as uuencode and
binhex decoding). And *MIME::Mini* does things that *MIME-tools* doesn't
such as reading and writing mailbox files correctly (repairing
incorrectly formatted ones along the way), and transparently unravelling
"winmail.dat" attachments (aka *MS-TNEF*). *MIME::Mini* is much smaller
(about 3% of the size of *MIME-tools* and the other modules it requires,
and about 20% of the size of *MIME-Lite* (which doesn't parse)), and so
takes much less time during program start up.
FUNCTIONS
formail(sub { <> }, sub { $mail = shift })
Parses a mailbox or a mail message. Calls the first function
argument to retrieve input lines and calls the second function
argument with every mail message found. Terminates when the first
argument returns undef or when the second function returns false.
Quoted "From_" lines are unquoted.
mail2str($mail)
Returns a string version of a mail message. If the mail message
includes a mailbox header, lines in the body starting with "From_"
are quoted and the string result will definitely be terminated with
a blank line. This means that mailbox files with blank lines missing
between mail messages and with unquoted "From_" lines will be
automatically repaired with the code below (Incidentally, malformed
nested multipart body parts are also repaired).
formail(sub { <> }, sub { print mail2str(shift) });
mail2multipart($mail)
Converts a singlepart mail message into a multipart mail message
with a single body part (i.e. the body of the original mail
message). Returns the mail message. Does nothing to mail messages
that are already multipart mail messages.
mail2singlepart($mail)
Converts a multipart mail message with a single body part into a
singlepart mail message whose body is the original body part.
Returns the mail message. Does nothing to mail messages that are
already singlepart mail messages or multipart mail messages with
multiple parts. Acts recursively.
mail2mbox($mail)
Converts a mail message into an mailbox item. Does nothing to mail
messages that are already mailbox items. This affects the result of
*mail2str()*.
insert_header($mail, $header[, $language[, $charset]])
Inserts a new mail header before any existing mail headers. If the
header contains non-ascii characters, it will be encoded in
accordance with RFC2047. If the *$language* and *$charset*
parameters are not supplied, they default to "en" and "iso-8859-1"
(if possible, "utf-8" otherwise), respectively.
append_header($mail, $header[, $language[, $charset]])
Appends a new mail header after any existing mail headers.
replace_header($mail, $header[, $language[, $charset]])
Replaces all instances of a mail header with a new mail header.
delete_header($mail, $header, $recurse)
Deletes all headers that match the *$header* pattern. If the
*$recurse* parameter is provided and non-zero, matching headers in
internal body parts will also be deleted.
insert_part($mail, $part, $index)
Inserts the given body part at the given index. The *$part*
parameter must have been produced by *formail()* or *newmail()*. The
*$mail* parameter must already be a multipart mail message.
append_part($mail, $part)
Appends the given body part.
replace_part($mail, $part, $index)
Replaces the body part at the given index with the given body part.
delete_part($mail, $index)
Deletes the body part at the given index.
header($mail, $header)
Returns a list of values of headers with the given name. RFC2822
comments are removed. If any of the values contain RFC2047 encoded
words (i.e. "=?charset?[qb]?...?="), they are decoded, and the bytes
in the given charset (e.g., "us-ascii", "iso-8859-*", "utf-8") are
then decoded into "characters" (i.e., unicode codepoints). They are
also unfolded. If this is not what you want, use $mail->{header} or
$mail->{headers} directly.
headers($mail)
Returns a list of all complete headers with decoding and unfolding
performed as with *header()*.
header_names($mail)
Returns a list of the names of headers present in the given mail
message.
param($mail, $header, $param)
Returns the value of the given parameter of the given MIME header of
the given mail message. *header()* is used for RFC2047 decoding. If
the parameter has been split or encoded in accordance with RFC2231
(i.e. "param1*0="a" param1*1="b" param2*="charset'lang'%63""), it is
decoded (if "us-ascii" or "iso-8859-*" or "utf-8") and reassembled.
mimetype($mail, $parent)
Returns the declared or default mimetype of the given mail message
or body part. Returns "octet/application" when the encoding is
invalid.
encoding($mail)
Returns the declared or implied encoding of the given mail message
or body part.
filename($part)
Returns the RFC2183 filename of the given body part. Uses *param()*
to perform any decoding that might be necessary. Also removes any
directory component of the filename and replaces any unfriendly
characters with dash characters.
body($mail)
Returns the decoded body of the given mail message or body part.
Must not be called on a multipart mail message or a mail message
whose mimetype is "message/rfc822".
message($mail)
Returns the message inside the given mail message whose mimetype is
"message/rfc822". Must not be called on a multipart message or a
mail message whose mimetype is not "message/rfc822".
parts($mail[, $parts])
When no *$parts* parameter is given, returns a reference to an array
of body parts in the given multipart message. When the *$parts*
parameter is given, it is a reference to an array of body parts, and
it will replace the existing body parts. Must not be called on a
singlepart mail message.
newparam($name, $value[, $language[, $charset]]])
Creates a MIME header parameter, possibly split and encoded in
accordance with RFC2231. Returns a string that looks like ";
name=value" which can be used as part of the *$header* argument in
functions like *append_header()* and as part of any header value in
the function *newmail()*. If the value contains non-ascii
characters, and the *$language* and *$charset* parameters are not
supplied, they default to "en" and "utf-8" or "iso-8859-1",
respectively.
newmail(...)
Creates a new mail message based on the given arguments (which take
the form of a hash). It is not necessary to supply all information.
Anything that needs to be added will be added automatically. The
important parameters are:
[A-Z]* - Arbitrary mail headers: e.g. From To Subject
type - Content-Type: e.g. image/png
charset - Content-Type's charset parameter: e.g. iso-8859-1
encoding - Content-Transfer-Encoding: e.g. base64
filename - Content-Disposition's filename parameter
body - body of the message (don't use with parts or message)
parts - array-ref of parts (don't use with body or message)
message - body of message/rfc822 message (don't use with body or parts)
mbox - Mbox From_ header
Supplying *body* implies "text/plain". Supplying *parts* implies
"multipart/mixed". Supplying *message* implies "message/rfc822".
Default *disposition* is "inline" for "text/*" and "message/rfc822",
or "attachment" for all other types. The default *charset* is
"us-ascii" when *body* contains only ASCII bytes. Otherwise, it is
"utf-8" when *body* is a valid UTF-8 byte sequence. Otherwise, it is
your local (non-utf8) charset, or "iso-8859-1". Default *encoding*
is determined from the type and nature of the mail message and its
data. You shouldn't have to supply *encoding* unless you want to
create messages with "8bit" encoding. If the mail message really is
a mail message, and not just a body part, "Date", "MIME-Version" and
"Message-ID" headers are automatically included if they have not
been supplied by the caller.
Less important parameters are:
disposition - Content-Disposition: i.e. inline or attachment
created - Content-Disposition's creation-date parameter
modified - Content-Disposition's modification-date parameter
read - Content-Disposition's read-date parameter
size - Content-Disposition's size parameter
description - Content-Description
language - Content-Language
duration - Content-Duration
location - Content-Location
base - Content-Base
features - Content-Features
alternative - Content-Alternative
id - Content-ID
md5 - Content-MD5
Note: If you supply "filename" but not "body" (or "message" or
"parts"), and the filename refers to a readable file, then the
following parameters will be determined automatically: "body",
"modified", "read", "size".
The rest of the less important parameters are just shortcuts for
standard MIME headers. There is no support beyond that for any of
them.
STRUCTURE
A mail message (or body part) is a hash containing some of the following
entries:
mbox - mailbox From_ header
warn - parser errors in the form: X-Warning: ...
headers - arrayref of mail headers in order of appearance
header - hashref by name of arrayrefs of mail headers
body - text of singlepart mail message
mime_type - mimetype of the mail message or body part
mime_parts - arrayref of mail messages (body parts)
mime_message - message of a message/rfc822 mail message
mime_boundary - boundary for a multipart mail message
mime_preamble - any text before the first multipart boundary
mime_epilogue - any text after the last multipart boundary
mime_prev_boundary - saved boundary of message after mail2singlepart
mime_prev_preamble - saved preamble of message after mail2singlepart
mime_prev_epilogue - saved epilogue of message after mail2singlepart
Note that *body*, *mime_parts* and *mime_message* are mutually exclusive
and that *mime_type* only exists when *mime_parts* or *mime_message*
exist.
EXAMPLES
Parsing example: Repair mailbox files
formail(sub { <> }, sub { print mail2str(shift) });
Building example: A mail message with attachments
print mail2str(newmail(
To => 'you@there.com', From => 'me@here.com', Subject => 'test',
parts => [
newmail(body => "hi\n"),
newmail(body => $png, type => 'image/png', filename => 'hi.png'),
newmail(message => newmail(qw(To to@you From from@me body hi")))
]));
CAVEAT
The *header()* and *headers()* functions automatically decode RFC2047
encoded headers. This is an attempt to satisfy the following requirement
in RFC2047:
The program must be able to display the unencoded text if the
character set is "US-ASCII". For the ISO-8859-* character sets,
the mail reading program must at least be able to display the
characters which are also in the ASCII set.
Rather than discarding "iso-8859-*" characters that are not also
"us-ascii", *header()* and *headers()* decode them to "characters"
(unicode codeponts) in perl's internal string format. This is arguably
more useful, but knowledge of the original character set is lost.
Hopefully, that isn't important. But actually "displaying" these
characters will require the client application to encode the headers
appropriately for the local system.
The original, encoded headers can be accessed directly via
"$mail->{headers}" which is a reference to an array of raw encoded
headers.
SEE ALSO
RFC2822, RFC2045, RFC2046, RFC2047, RFC2231, RFC2183 (also RFC3282,
RFC3066, RFC2424, RFC2557, RFC2110, RFC3297, RFC2912, RFC2533, RFC1864,
RFC2387, RFC2912, RFC2533, RFC2387, RFC2076).
The mailbox format used is the mboxrd format described in
"http://www.qmail.org/man/man5/mbox.html".
AUTHOR
20240424 raf <raf@raf.org>
COPYRIGHT AND LICENSE
Copyright (C) 2005-2007, 2023-2024 raf <raf@raf.org>
This is free software; you can redistribute it and/or modify it under
the same terms as the Perl 5 programming language system itself.