NAME
MIME-tools - modules for parsing (and creating!) MIME entities
SYNOPSIS
Here's some pretty basic code for parsing a MIME message, and outputting its decoded components to a given directory:
use MIME::Parser;
# Create parser, and set the output directory:
my $parser = new MIME::Parser;
$parser->output_dir("$ENV{HOME}/mimemail");
# Parse input:
$entity = $parser->read(\*STDIN) or die "couldn't parse MIME stream";
# Take a look at the top-level entity (and any parts it has):
$entity->dump_skeleton;
Here's some code which composes and sends a MIME message containing three parts: a text file, an attached GIF, and some more text:
use MIME::Entity;
# Create the top-level, and set up the mail headers:
$top = build MIME::Entity Type     => "multipart/mixed",
                          -From    => "me\@myhost.com",
                          -To      => "you\@yourhost.com",
                          -Subject => "Hello, nurse!";
# Attachment #1: a simple text document:
attach $top Path=>"./testin/short.txt";
# Attachment #2: a GIF file:
attach $top Path     => "./docs/mime-sm.gif",
            Type     => "image/gif",
            Encoding => "base64";
# Attachment #3: some literal text:
attach $top Data=>$message;
# Send it:
open MAIL, "| /usr/lib/sendmail -t -i" or die "open: $!";
$top->print(\*MAIL);
close MAIL;
DESCRIPTION
MIME-tools is a collection of Perl5 MIME:: modules for parsing, decoding, and generating single- or multipart (even nested multipart) MIME messages. (Yes, kids, that means you can send messages with attached GIF files).
A QUICK TOUR
Parsing, in a nutshell
You usually start by creating an instance of MIME::Parser (a subclass of the abstract MIME::ParserBase), and setting up certain parsing parameters: what directory to save extracted files to, how to name the files, etc.
You then give that instance a readable filehandle on which waits a MIME message. If all goes well, you will get back a MIME::Entity object (a subclass of Mail::Internet), which consists of...
A MIME::Head (a subclass of Mail::Header) which holds the MIME header data.
A MIME::Body, which is an object that knows where the body data is. You ask this object to "open" itself for reading, and it will hand you back an "I/O handle" for reading the data: this is a FileHandle-like object, and could be of any class, so long as it conforms to a subset of the IO::Handle interface.
If the original message was a multipart document, the MIME::Entity object will have a non-empty list of "parts", each of which is in turn a MIME::Entity (which might also be a multipart entity, etc, etc...).
Internally, the parser (in MIME::ParserBase) asks for instances of MIME::Decoder whenever it needs to decode an encoded file. MIME::Decoder has a mapping from supported encodings (e.g., 'base64') to classes whose instances can decode them. You can add to this mapping to try out new/experimental encodings. You can also use MIME::Decoder by itself.
Composing, in a nutshell
On a small scale, the MIME::Decoder can be used to encode as well. All the standard encodings are supported:
7bit               Use this for plain ASCII documents (and multiparts)
8bit
binary
quoted-printable   Use this for text files with 8-bit characters
base64             Use this for binary files
When encoding a text document as a 7bit mail message, the software will not puke on 8-bit characters... instead, the 8-bit characters are escaped for you into reasonable ASCII sequences, by the MIME::Latin1 module. This feature is for folks who really hate sending out a document as quoted-printable just because it happens to have a couple of French or German names.
I've considered making it so that the content-type and encoding can be automatically inferred from the file's path, but that seems to be asking for trouble... or at least, for Mail::Cap...
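To give a feel for what the quoted-printable and base64 encodings listed above do to your data, here is a minimal sketch that calls the low-level MIME::QuotedPrint and MIME::Base64 modules directly. It does not go through MIME::Decoder, it assumes a reasonably modern Perl (for "use warnings"), and the sample strings are made up for illustration:

```perl
use strict;
use warnings;
use MIME::Base64      qw(encode_base64 decode_base64);
use MIME::QuotedPrint qw(encode_qp decode_qp);

# A Latin-1 text line containing one 8-bit character (0xE7, c-cedilla).
my $text = "Fran\xE7ois\n";

# quoted-printable escapes only the 8-bit byte; the rest stays readable.
my $qp = encode_qp($text);                 # "Fran=E7ois\n"

# base64 maps every three bytes to four ASCII characters, and
# encode_base64() appends a newline to its output by default.
my $b64 = encode_base64("\x00\x01\xFF");   # "AAH/\n"

print $qp, $b64;
```

This is why the list recommends quoted-printable for mostly-ASCII text (the encoded form is still legible) and base64 for binary files (every byte is treated alike).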
Other stuff
If you want to tweak the way this toolkit works (for example, to turn on debugging), use the routines in the MIME::ToolUtils module.
CONTENTS
Modules in this toolkit
Module          DSLI  Description                                   Info
----------      ----  ------------------------------------------    ----
MIME::
::Body          adpO  Abstract message holder (file, scalar, etc.)  ERYQ
::Decoder       bdpO  OO interface for decoding MIME messages       ERYQ
::Entity        bdpO  An extracted and decoded MIME entity          ERYQ
::Field::*      bdpO  Mail::Field subclasses for parsing fields     ERYQ
::Head          bdpO  A parsed MIME header (Mail::Header subclass)  ERYQ
::IO            adpO  Simple I/O handles for filehandles/scalars    ERYQ
::Latin1        adpO  Encoding 8-bit Latin-1 as 7-bit ASCII         ERYQ
::Parser        bdpO  Parses streams to create MIME entities        ERYQ
::ParserBase    bdpO  For building your own MIME parser             ERYQ
::ToolUtils     adpO  For tweaking the MIME-tools library           ERYQ
Programs in this toolkit
mimedump - dump out a summary of the contents of a MIME message
mimeexplode - parse/decode a MIME message into its component files
mimesend - send a message with attachments from the command line
MANIFEST
./MIME/*.pm     the MIME-tools classes
./Makefile.PL   the input to MakeMaker
./COPYING       terms and conditions for copying/using the software
./README        this file
./docs/         HTMLized documentation
./etc/          convenient copies of other modules you may need
./examples      sample executables
./t/*.t         the "make test" scripts
./testin/       files you can use for testing (as in "make test")
./testout/      the output of "make test"
REQUIREMENTS
You'll need Perl5.002 or better.
You'll need to obtain and install the following kits from the CPAN:
- MIME::(QuotedPrint, Base64)
-
These perform the low-level MIME decoding. Get these from Gisle Aas' author directory. They are also reported to be in the LWP distribution.
- MailTools (1.06 or higher)
-
This is Graham Barr's revamped set of Mail:: modules. Many of them are now superclasses of the MIME:: modules, and perform the core functionality of manipulating headers and fields.
For your convenience, possibly-old copies of the MIME:: modules are provided in the ./etc directory of the distribution, but they are NOT installed for you during the installation procedure.
INSTALLATION
Pretty simple:
1. Gunzip and de-tar the distribution, and cd to the top level.
2. Type: perl Makefile.PL
3. Type: make # this step is optional
4. Type: make test # this step is optional
5. Type: make install
Other interesting targets in the Makefile are:
make config # to check if the Makefile is up-to-date
make clean # delete local temp files (Makefile gets renamed)
make realclean # delete derived files (including ./blib)
If you're installing this as a replacement for MIME-parser 1.x or earlier, please read the "Compatibility" notes.
NOTES
Compatibility
If you're installing this as a replacement for the MIME-parser 1.x release, and you really don't want to break existing code, you should do this at any point before the parsing code is invoked:
use MIME::ToolUtils;
config MIME::ToolUtils EMULATE_VERSION => 1.0;
Try not to get too attached to this, though. Instead, plan on upgrading your code ASAP to the 2.0 style.
Design issues
- Why assume that MIME objects are email objects?
-
I quote from Achim Bohnet, who gave feedback on v.1.9 (I think he's using the word header where I would use field; e.g., to refer to "Subject:", "Content-type:", etc.):
There is also IMHO no requirement [for] MIME::Heads to look like [email] headers; so to speak, the MIME::Head [simply stores] the attributes of a complex object, e.g.: new MIME::Head type => "text/plain", charset => ..., disposition => ..., ... ;
I agree in principle, but (alas and dammit) RFC-1521 says otherwise. RFC-1521 [MIME] headers are a syntactic subset of RFC-822 [email] headers. Perhaps a better name for these modules would be RFC1521:: instead of MIME::, but we're a little beyond that stage now. (Note: RFC-1521 has recently been obsoleted by RFCs 2045-2049, so it's just as well we didn't go that route...)
However, in my mind's eye, I see a mythical abstract class which does what Achim suggests... so you could say:
my $attrs = new MIME::Attrs type => "text/plain", charset => ..., disposition => ..., ... ;
We could even make it a superclass or companion class of MIME::Head, such that MIME::Head would allow itself to be initialized from a MIME::Attrs object.
In the meanwhile, look at the build() and attach() methods of MIME::Entity: they follow the spirit of this mythical class.
- To subclass or not to subclass?
-
When I originally wrote these modules for the CPAN, I agonized for a long time about whether or not they really should subclass from Mail::Internet (then at version 1.17). There were plusses:
Software reuse.
Inheritance of the mail-sending utilities.
And, unfortunately, minuses:
The Mail::Internet 1.17 model of messages as being short enough to fit into in-core arrays is excellent for most email applications; however, it seemed ill-suited for generic MIME applications, where MIME streams could be megabytes long.
The implementation of Mail::Internet 1.17 was excellent for certain kinds of header manipulation, but the implementation of get() was less-efficient than I would have liked for MIME applications.
In my heart of hearts, I honestly felt that the head should be encapsulated as a first-class object, and in Mail::Internet 1.17 it was not.
So I chose to make MIME::Head and MIME::Entity their own standalone modules.
Since that time, I worked with Graham Barr (author of most of the MailTools package, and a darn nice guy to "work" with over email), and he has graciously evolved the MailTools modules into a direction that addressed a lot of these issues.
When MailTools hit its 1.06 release, it was finally time to finish what I had started, and release MIME-tools 2.0. We now are almost at the stage of a fully-integrated Mail/MIME environment.
Questionable practices
- Fuzzing of CRLF and newline on input
-
RFC-1521 dictates that MIME streams have lines terminated by CRLF ("\r\n"). However, it is extremely likely that folks will want to parse MIME streams where each line ends in the local newline character "\n" instead.
An attempt has been made to allow the parser to handle both CRLF and newline-terminated input.
See MIME::ParserBase for details.
- Fuzzing of CRLF and newline when decoding
-
The "7bit" and "8bit" decoders will decode both a "\n" and a "\r\n" end-of-line sequence into a "\n".
The "binary" decoder (default if no encoding specified) still outputs stuff verbatim... so a MIME message with CRLFs and no explicit encoding will be output as a text file that, on many systems, will have an annoying ^M at the end of each line... but this is as it should be.
See MIME::ParserBase for details.
- Fuzzing of CRLF and newline when encoding/composing
-
All encoders currently output the end-of-line sequence as a "\n", with the assumption that the local mail agent will perform the conversion from newline to CRLF when sending the mail.
However, there probably should be an option to output CRLF as per RFC-1521. I'm currently working on a good mechanism for this.
See MIME::ParserBase for details.
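The CRLF notes above boil down to two line-ending conversions. The following is only a rough sketch in plain Perl, not the toolkit's own code: the helper names normalize_eol() and to_crlf() are hypothetical, and it assumes a platform where "\n" is a bare LF:

```perl
use strict;
use warnings;

# Input side: treat a trailing CRLF or a bare newline alike, the way
# the "7bit" and "8bit" decoders are described as doing.
sub normalize_eol {
    my ($line) = @_;
    $line =~ s/\r?\n\z/\n/;
    return $line;
}

# Output side: convert every line ending to CRLF, for callers who
# cannot rely on their mail agent to do it. Lines that already end
# in CRLF are left with a single CRLF, not doubled.
sub to_crlf {
    my ($text) = @_;
    $text =~ s/\r?\n/\r\n/g;
    return $text;
}

print normalize_eol("Subject: hi\r\n");
```

Note that neither helper should be applied to data in the "binary" encoding, which by definition is passed through verbatim.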
- Inability to handle multipart boundaries with embedded newlines
-
First, let's get something straight: this is an evil, EVIL practice. If your mailer creates multipart boundary strings that contain newlines, give it two weeks' notice and find another one. If your mail robot receives MIME mail like this, regard it as syntactically incorrect, which it is.
See MIME::ParserBase for details.
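If you want to check a boundary string yourself before blaming the mailer, the boundary grammar in the MIME RFCs allows 1 to 70 characters drawn from letters, digits, the punctuation '()+_,-./:=? and space, with the restriction that the last character is not a space. A minimal sketch of that check follows; the function name is_legal_boundary() is hypothetical and is not part of this toolkit:

```perl
use strict;
use warnings;

# Return 1 if $boundary fits the MIME boundary grammar, else 0.
# Up to 69 "bchars" (which may include space) followed by one final
# character that is not a space; \z rejects a trailing newline.
sub is_legal_boundary {
    my ($boundary) = @_;
    return $boundary =~ m{\A[0-9A-Za-z'()+_,\-./:=? ]{0,69}[0-9A-Za-z'()+_,\-./:=?]\z}
        ? 1 : 0;
}

print is_legal_boundary("----=_NextPart_001"), "\n";
print is_legal_boundary("evil\nboundary"), "\n";
```

A boundary containing a newline fails this test simply because newline is not among the permitted characters.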
CHANGE LOG
Current events
- Version 3.203
-
No, there haven't been any major changes between 2.x and 3.x. The major-version increase was from a few more tweaks to get $VERSION to be calculated better and more efficiently (I had been using RCS version numbers in a way which created problems for users of CPAN::). After a couple of false starts, all modules have been upgraded to RCS 3.201 or higher.
You can now parse a MIME message from a scalar, an array-of-scalars, or any MIME::IO-compliant object (including IO:: objects). Take a look at parse_data() in MIME::ParserBase. The parser code has been modified to support the MIME::IO interface. Thanks to fellow Chicagoan Tim Pierce (and countless others) for asking.
More sensible toolkit configuration. A new config() method in MIME::ToolUtils makes a lot of toolkit-wide configuration cleaner. Your old calls will still work, but with deprecation warnings.
You can now sign messages just like in Mail::Internet. See MIME::Entity for the interface.
You can now remove signatures from messages just like in Mail::Internet. See MIME::Entity for the interface.
You can now compute/strip content lengths and other non-standard MIME fields. See sync_headers() in MIME::Entity. Thanks to Tim Pierce for bringing the basic problem to my attention.
Many warnings are now silent unless $^W is true. That means unless you run your Perl with -w, you won't see deprecation warnings, non-fatal-error messages, etc. But of course you run with -w, so this doesn't affect you. :-)
Completed the 7-bit encodings in MIME::Latin1. We hadn't had complete coverage in the conversion from 8- to 7-bit; now we do. Thanks to Rolf Nelson for bringing this to my attention.
Fixed broken parse_two() in MIME::ParserBase. BTW, if your code worked with the "broken" code, it should still work. Thanks again to Tim Pierce for bringing this to my attention.
- Version 2.14
-
Just a few bug fixes to improve compatibility with MailTools 1.08, and with the upcoming Perl 5.004 release. Thanks to Jason L. Tibbitts III for reporting the problems so quickly.
- Version 2.13
-
- New features
-
Added RFC-1522-style decoding of encoded header fields. Header decoding can now be done automatically during parsing via the new decode() method in MIME::Head... just tell your parser object that you want to decode_headers(). Thanks to Kent Boortz for providing the idea, and the baseline RFC-1522-decoding code!
Building MIME messages is even easier. Now, when you use MIME::Entity's build() or attach(), you can also supply individual mail headers to set (e.g., -Subject, -From, -To).
Added Disposition to MIME::Entity's build() method. Thanks to Kurt Freytag for suggesting this feature.
An X-Mailer header is now output by default in all MIME-Entity-prepared messages, so any bad MIME we generate can be traced back to this toolkit.
Added purge() method to MIME::Entity for deleting leftover files. Thanks to Jason L. Tibbitts III for suggesting this feature.
Added seek() and tell() methods to built-in MIME::IO classes. Only guaranteed to work when reading! Thanks to Jason L. Tibbitts III for suggesting this feature.
When parsing a multipart message with apparently no boundaries, the error message you get has been improved. Thanks to Andreas Koenig for suggesting this.
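For readers who have not met RFC-1522 "encoded words" (the =?charset?encoding?text?= tokens that show up in Subject: and From: fields), here is a rough standalone sketch of what decoding them involves. It is not the code used by MIME::Head: the helper names are hypothetical, the result is left as raw bytes in the original charset, and the RFC's rule about dropping whitespace between adjacent encoded words is not handled:

```perl
use strict;
use warnings;
use MIME::Base64 qw(decode_base64);

# Decode the text of one encoded word. "B" is base64; "Q" is a
# quoted-printable variant where "_" stands for a space.
sub decode_word {
    my ($encoding, $text) = @_;
    return decode_base64($text) if uc($encoding) eq 'B';
    $text =~ tr/_/ /;
    $text =~ s/=([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $text;
}

# Replace every =?charset?B-or-Q?text?= token in a header field value.
sub decode_encoded_words {
    my ($field) = @_;
    $field =~ s{=\?[^?\s]+\?([BbQq])\?([^?\s]*)\?=}{decode_word($1, $2)}ge;
    return $field;
}

print decode_encoded_words('=?US-ASCII?Q?Keith_Moore?= <moore@cs.utk.edu>'), "\n";
```

With the toolkit itself you would not write any of this; per the note above, you ask the parser to decode_headers() instead.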
- Bug fixes
-
Patched over a Perl 5.002 (and maybe earlier and later) bug involving FileHandle::new_tmpfile. It seems that the underlying filehandles were not being closed when the FileHandle objects went out of scope! There is now an internal routine that creates true FileHandle objects for anonymous temp files. Thanks to Dragomir R. Radev and Zyx for reporting the weird behavior that led to the discovery of this bug.
MIME::Entity's build() method now warns you if you give it an illegal boundary string, and substitutes one of its own.
MIME::Entity's build() method now generates safer, fully-RFC-1521-compliant boundary strings.
Bug in MIME::Decoder's install() method was fixed. Thanks to Rolf Nelson and Nickolay Saukh for finding this.
Changed FileHandle::new_tmpfile to FileHandle->new_tmpfile, so some Perl installations will be happier. Thanks to Larry W. Virden for finding this bug.
Gave =over an arg of 4 in all PODs. Thanks to Larry W. Virden for pointing out the problems of bare =over's.
- Version 2.04
-
A bug in MIME::Entity's output method was corrected. MIME::Entity::print now outputs everything to the desired filehandle explicitly. Thanks to Jake Morrison for pointing out the incompatibility with Mail::Header.
- Version 2.03
-
Fixed bug in autogenerated filenames resulting from transposed "if" statement in MIME::Parser, removing spurious printing of header as well. (Annoyingly, this bug is invisible if debugging is turned on!) Thanks to Andreas Koenig for bringing this to my attention.
Fixed bug in MIME::Entity::body() where it was using the bodyhandle completely incorrectly. Thanks to Joel Noble for bringing this to my attention.
Fixed MIME::Head::VERSION so CPAN:: is happier. Thanks to Larry Virden for bringing this to my attention.
Fixed undefined-variable warnings when dumping skeleton (happened when there was no Subject: line). Thanks to Joel Noble for bringing this to my attention.
- Version 2.02
-
Stupid, stupid bugs in both BASE64 encoding and decoding were fixed. Thanks to Phil Abercrombie for locating them.
- Version 2.01
-
Modules now inherit from the new Mail:: modules! This means big changes in behavior.
MIME::Parser can now store message data in-core. There were a lot of requests for this feature.
MIME::Entity can now compose messages. There were a lot of requests for this feature.
Added option to parse "message/rfc822" as a pseudo-multipart document. Thanks to Andreas Koenig for suggesting this.
Ancient history
- Version 1.13
-
MIME::Head now no longer requires space after ":", although either a space or a tab after the ":" will be swallowed if there. Thanks to Igor Starovoitov for pointing out this shortcoming.
- Version 1.12
-
Fixed bugs in parser where CRLF-terminated lines were blowing out the handling of preambles/epilogues. Thanks to Russell Sutherland for reporting this bug.
Fixed idiotic is_multipart() bug. Thanks to Andreas Koenig for noticing it.
Added untested binmode() calls to parser for DOS, etc. systems. No idea if this will work...
Reorganized the output_path() methods to allow easy use of inheritance, as per Achim Bohnet's suggestion.
Changed MIME::Head to report mime_type more accurately.
POSIX module no longer loaded by Parser if perl >= 5.002. Hey, 5.001'ers: let me know if this breaks stuff, okay?
Added unsupported ./examples directory.
- Version 1.11
-
Converted over to using Makefile.PL. Thanks to Andreas Koenig for the much-needed kick in the pants...
Added t/*.t files for testing. Eeeeeeeeeeeh...it's a start.
Fixed bug in default parsing routine for generating output paths; it was warning about evil filenames if there simply were no recommended filenames. D'oh!
Fixed redefined parts() method in Entity.
Fixed bugs in Head where field name wasn't being case folded.
- Version 1.10
-
A typo was causing the epilogue of an inner multipart message to be swallowed to the end of the OUTER multipart message; this has now been fixed. Thanks to Igor Starovoitov for reporting this bug.
A bad regexp for parameter names was causing some parameters to be parsed incorrectly; this has also been fixed. Thanks again to Igor Starovoitov for reporting this bug.
It is now possible to get full control of the filenaming algorithm before output files are generated, and the default algorithm is safer. Thanks to Laurent Amon for pointing out the problems, and suggesting some solutions.
Fixed illegal "simple" multipart test file. D'OH!
- Version 1.9
-
No changes: 1.8 failed CPAN registration
- Version 1.8.
-
Fixed incompatibility with 5.001 and FileHandle::new_tmpfile.
Added COPYING file, and improved README.
Future plans
Dress up mimedump and mimeexplode utilities to take cmd line options for directory, environment vars (MIMEDUMP_OUTPUT, etc.).
Make it even easier to compose and send MIME messages.
Make VERSIONs a bit more sensible (e.g., 2.8, 2.9, 2.10 effectively goes backwards...).
TERMS AND CONDITIONS
Copyright (c) 1996 by Eryq. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
See the COPYING file in the distribution for details.
SUPPORT
Please email me directly with questions/problems (see AUTHOR below).
AUTHOR
MIME-tools was created by:
___ _ _ _ _ ___ _
/ _ \| '_| | | |/ _ ' / Eryq
| __/| | | |_| | |_| | http://www.mcs.net/~eryq
\___||_| \__, |\__, |__ eryq@enteract.com
|___/ |___/ eryq@rhine.gsfc.nasa.gov
Initial release (1.0): 28 April 1996. Re-release (2.0): Halloween 1996.
ACKNOWLEDGMENTS
This kit would not have been possible but for the direct contributions of the following:
Gisle Aas              The MIME encoding/decoding modules
Laurent Amon           Bug reports and suggestions
Graham Barr            The new MailTools
Achim Bohnet           Numerous good suggestions, including the I/O model
Kent Boortz            Initial code for RFC-1522-decoding of MIME headers
Andreas Koenig         Numerous good ideas, tons of beta testing,
                       and help with CPAN-friendly packaging
Igor Starovoitov       Bug reports and suggestions
Jason L Tibbitts III   Bug reports and suggestions
Not to mention the Accidental Beta Test Team, whose bug reports (and comments) have been invaluable in improving the whole:
Phil Abercrombie
Kurt Freytag
Jake Morrison
Rolf Nelson
Joel Noble
Tim Pierce
Andrew Pimlott
Dragomir R. Radev
Nickolay Saukh
Russell Sutherland
Larry Virden
Zyx
Please forgive me if I've accidentally left you out. Better yet, email me, and I'll put you in.
SEE ALSO
Users of this toolkit may wish to read the documentation of Mail::Header and Mail::Internet.
The MIME format is documented in RFCs 1521-1522, and more recently in RFCs 2045-2049.
The MIME header format is an outgrowth of the mail header format documented in RFC 822.