NAME

MIME::Entity - class for parsed-and-decoded MIME message

SYNOPSIS

Create a MIME entity from an array, and output it as a MIME stream to STDOUT:

    $ent = new MIME::Entity [
                             "Subject: Greetings\n",
                             "Content-type: text/plain\n",
                             "Content-transfer-encoding: 7bit\n",
                             "\n",
                             "Hi there!\n",
                             "Bye there!\n"
                            ];
    $ent->print(\*STDOUT);

Create a document for an ordinary 7-bit ASCII text file (lots of stuff is defaulted for us):

$ent = build MIME::Entity Path=>"english-msg.txt";

Create a document for a text file with 8-bit (Latin-1) characters:

$ent = build MIME::Entity Path     =>"french-msg.txt",
                          Encoding =>"quoted-printable",
                          -From    =>'jean.luc@inria.fr',
                          -Subject =>"C'est bon!";

Create a document for a GIF file (the description is completely optional, and note that we have to specify content-type and encoding since they're not the default values):

$ent = build MIME::Entity Description => "A pretty picture",
                          Path        => "./docs/mime-sm.gif",
                          Type        => "image/gif",
                          Encoding    => "base64";

Create a document that you already have the text for:

$ent = build MIME::Entity  Type        => "text/plain",
                           Encoding    => "quoted-printable",
                           Data        => [
                                 "First line.\n",
                                 "Second line.\n",
                                 "Last line.\n",
                           ];

Create a multipart message (could it be much easier?):

# Create the top-level, and set up the mail headers:
$top = build MIME::Entity Type     => "multipart/mixed",
                          -From    => 'me@myhost.com',
                          -To      => 'you@yourhost.com',
                          -Subject => "Hello, nurse!";

# Attachment #1: a simple text document: 
attach $top  Path=>"./testin/short.txt";

# Attachment #2: a GIF file:
attach $top  Path        => "./docs/mime-sm.gif",
             Type        => "image/gif",
             Encoding    => "base64";
 
# Attachment #3: text we already have on hand:
attach $top Data=>$contents;

# Output!
$top->print(\*STDOUT);

Muck about with the signature:

# Sign it (automatically removes any existing signature):
$top->sign(File=>"$ENV{HOME}/.signature");
    
# Remove any signature within 15 lines of the end:
$top->remove_sig(15);

Extract information from MIME entities:

# Get the head, a MIME::Head:
$head = $ent->head;

# Get the body, as a MIME::Body:
$bodyh = $ent->bodyhandle;

If you want a Content-length: header to be output, and output correctly for the current body part(s), here's how to do it:

# Compute content-lengths for singleparts based on bodies:
$entity->sync_headers(Length=>'COMPUTE');

# Output!
$entity->print(\*STDOUT);

See MIME::Parser for additional examples of usage.

DESCRIPTION

A subclass of Mail::Internet.

This package provides a class for representing MIME message entities, as specified in RFC 1521, Multipurpose Internet Mail Extensions.

Here are some excerpts from RFC-1521 explaining the terminology; each is accompanied by the equivalent in MIME:: terms:

Message

From RFC-1521:

The term "message", when not further qualified, means either the
(complete or "top-level") message being transferred on a network, or
a message encapsulated in a body of type "message".

There currently is no explicit package for messages; under MIME::, messages may be read in from readable files or filehandles. A future extension will allow them to be read from any object reference that responds to a special "next line" method.

Body part

From RFC-1521:

The term "body part", in this document, means one of the parts of the
body of a multipart entity. A body part has a header and a body, so
it makes sense to speak about the body of a body part.

Since a body part is just a kind of entity (see below), a body part is represented by an instance of MIME::Entity.

Entity

From RFC-1521:

The term "entity", in this document, means either a message or a body
part.  All kinds of entities share the property that they have a
header and a body.

An entity is represented by an instance of MIME::Entity. There are instance methods for recovering the header (a MIME::Head) and the body (see below).

Body

From RFC-1521:

The term "body", when not further qualified, means the body of an
entity, that is the body of either a message or of a body part.

Well, this is a toughie. Both Mail::Internet (1.17) and Mail::MIME (1.03) represent message bodies in-core; unfortunately, this is not always the best way to handle things, especially for MIME streams that contain multi-megabyte tar files.

PUBLIC INTERFACE

Constructors and converters

new [SOURCE]

Class method. Create a new, empty MIME entity. Basically, this uses the Mail::Internet constructor...

If SOURCE is an ARRAYREF, it is assumed to be an array of lines that will be used to create both the header and an in-core body.

Else, if SOURCE is defined, it is assumed to be a filehandle from which the header and in-core body are to be read.

Note: in either case, the body will not be parsed: merely read!

build PARAMHASH

Class/instance method. A quick-and-easy catch-all way to create an entity. Use it like this to build a "normal" single-part entity:

    $ent = build MIME::Entity Type        => "image/gif",
                              Encoding    => "base64",
                              Path        => "/path/to/xyz12345.gif",
                              Filename    => "saveme.gif",
                              Disposition => "attachment";

And like this to build a "multipart" entity:

$ent = build MIME::Entity Type     => "multipart/mixed",
                          Boundary => "---1234567";

A minimal MIME header will be created. If you want to add or modify any header fields afterwards, you can of course do so via the underlying head object... but hey, there's now a prettier syntax!

$ent = build MIME::Entity Type           => "multipart/mixed",
                          -From          => $myaddr,
                          -Subject       => "Hi!",
                          '-X-Certified' => ['SINED','SEELED','DELIVERED'];

Normally, an X-Mailer header field is output which contains this toolkit's name and version (plus this module's RCS version). This will allow any bad MIME we generate to be traced back to us. You can of course overwrite that header with your own:

$ent = build MIME::Entity  Type       => "multipart/mixed",
                          '-X-Mailer' => "myprog 1.1";

Or remove it entirely:

$ent = build MIME::Entity  Type       => "multipart/mixed",
                          '-X-Mailer' => undef;

OK, enough hype. The parameters are:

-FIELDNAME

Any parameter with a leading '-' is taken to be a mail header field, whose value is to replace the corresponding header field after we go through all the other params and construct the basic MIME header. Use with care: you don't want to trash those nice MIME fields! Syntactic sugar, totally optional. TMTOWTDI.

Boundary

Multipart entities only. Optional. The boundary string. As per RFC-1521, it must consist only of the characters [0-9a-zA-Z'()+_,-./:=?] and space (you'll be warned, and your boundary will be ignored, if this is not the case). If you omit this, a random string will be chosen... which is probably safer.

Data

Single-part entities only. Optional. An alternative to Path (q.v.): the actual data, either as a scalar or an array reference (whose elements are joined together to make the actual scalar). The body is opened on the data using MIME::Body::Scalar.

Description

Optional. The text of the content-description. If you don't specify it, the field is not put in the header.

Disposition

Optional. The basic content-disposition ("attachment" or "inline"). If you don't specify it, it defaults to "inline" for backwards compatibility. Thanks to Kurt Freytag for suggesting this feature.

Encoding

Optional. The content-transfer-encoding. If you don't specify it, the field is not put in the header... which means that the encoding implicitly defaults to "7bit" as per RFC-1521. Do yourself a favor: put it in.

Filename

Single-part entities only. Optional. The recommended filename. Overrides any name extracted from Path. The information is stored in both the deprecated (content-type) and preferred (content-disposition) locations.

Path

Single-part entities only. Optional. The path to the file to attach. The body is opened on that file using MIME::Body::File.

Top

Optional. Is this a top-level entity? If so, it must sport a MIME-Version. The default is true. (NB: look at how attach() uses it.)

Type

Optional. The basic content-type ("text/plain", etc.). If you don't specify it, it defaults to "text/plain" as per RFC-1521. Do yourself a favor: put it in.
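
The character restriction on Boundary described above can be checked before calling build. What follows is only a sketch: boundary_chars_ok is a hypothetical helper, not part of MIME::Entity, and it tests the listed character set alone; it does not reproduce any other check the toolkit may perform.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of MIME::Entity): returns 1 if $boundary
# is non-empty and uses only the characters listed for the Boundary
# parameter, plus space; returns 0 otherwise.
sub boundary_chars_ok {
    my ($boundary) = @_;
    return $boundary =~ m{\A[0-9a-zA-Z'()+_,\-./:=? ]+\z} ? 1 : 0;
}

print boundary_chars_ok("---1234567")   ? "ok\n" : "rejected\n";   # ok
print boundary_chars_ok("bad*boundary") ? "ok\n" : "rejected\n";   # rejected
```

Omitting Boundary and letting a random string be chosen avoids the question entirely.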

Instance methods

add_part ENTITY

Instance method. Assuming we are a multipart message, add a body part (a MIME::Entity) to the array of body parts.

Warning: in the future, it may be a fatal error to attempt to attach a part to anything but a multipart entity (one with a content-type of multipart/*).

Returns the part that was just added.

attach [PART|PARAMHASH]

Instance method. The real quick-and-easy way to create multipart messages. Basically equivalent to:

$entity->add_part(ref($entity)->build(PARAMHASH, Top=>0));

Except that it's a lot nicer to look at.

It is a fatal error to attempt to attach a part to anything but a multipart entity (one with a content-type of multipart/*).

body [VALUE]

Instance method.

If emulating version 1.x:

Get or set the path to the file containing the body.

If VALUE is not given, the current body file is returned. If VALUE is given, the body file is set to the new value, and the previous value is returned.

Otherwise:

Get or set the body, as an array of lines. This should be regarded as a read-only data structure: changing its contents will have unpredictable results (you can, of course, make your own copy, and work with that).

Provided for compatibility with Mail::Internet, and it might not be as efficient as you'd like. Also, it's somewhat silly/wrongheaded for binary bodies, like GIFs and tar files.

Both forms are deprecated for MIME entities: instead, use the bodyhandle() method to get and use a MIME::Body. The content-type of the entity will tell you whether that body is best read as text (via getline()) or raw data (via read()).

bodyhandle [VALUE]

Instance method. Get or set an abstract object representing the body.

If VALUE is not given, the current bodyhandle is returned. If VALUE is given, the bodyhandle is set to the new value, and the previous value is returned.

dump_skeleton [FILEHANDLE]

Instance method. Dump the skeleton of the entity to the given FILEHANDLE, or to the currently-selected one if none given. This is really just useful for debugging purposes.

head [VALUE]

Instance method. Get/set the head.

If there is no VALUE given, returns the current head. If none exists, an empty instance of MIME::Head is created, set, and returned.

Note: This is a patch over a bug in Mail::Internet, which doesn't provide a method for setting the head to some given object.

is_multipart

Instance method. Does this entity's MIME type indicate that it's a multipart entity? Returns undef (false) if the answer couldn't be determined, 0 (false) if it was determined to be false, and true otherwise.

Note that this says nothing about whether or not parts were extracted.

mime_type

Instance method. A purely-for-convenience method. This simply relays the request to the associated MIME::Head object. The following are identical:

$x = $entity->mime_type;

$x = $entity->head->mime_type;

If there is no head, returns undef in a scalar context and the empty array in a list context.

Note that, while parsed entities still have MIME types, they do not have MIME encodings, or MIME versions, or fields, etc., etc... for those attributes, you still have to go to the head explicitly.

parts

Instance method. Return an array of all sub parts (each of which is a MIME::Entity), or the empty array if there are none.

For single-part messages, the empty array will be returned. For multipart messages, the preamble and epilogue parts are not in the list!

Note that in a scalar context, this returns you the number of parts.

print [FILEHANDLE]

Instance method, override. Print the entity to the given FILEHANDLE, or to the currently-selected one if none given.

If a single-part entity, the header and the body are both output, with the body being output according to the encoding specified by the header.

If a multipart entity, this is invoked recursively on all its parts, with appropriate boundaries and a preamble generated for you.

See print_body() for an important note on how the body is output.

print_body [FILEHANDLE]

Instance method, override. Print the body of the entity to the given FILEHANDLE, or to the currently-selected one if none given.

Important note: the body is output according to the encoding specified by the header ('binary' if no encoding given). This means that the following code:

    $ent = new MIME::Entity ["Subject: Greetings\n",
                             "Content-transfer-encoding: base64\n",
                             "\n",
                             "Hi there!\n",
                             "Bye there!\n"
                             ];
    $ent->print;   # uses print_body() internally

Prints this:

Subject: Greetings
Content-transfer-encoding: base64

SGkgdGhlcmUhCkJ5ZSB0aGVyZSEK

The body is stored in an un-encoded form; however, the idea is that the transfer encoding is used to determine how it should be output. This means that the print() method is always guaranteed to get you a sendmail-ready stream whose body is consistent with its head.
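
The encoded line in the output above can be reproduced with the core MIME::Base64 module. This is only an illustration of the transfer encoding applied to the un-encoded body; it is not a claim about how print_body() is implemented internally.

```perl
use strict;
use warnings;
use MIME::Base64 qw(encode_base64);

# The un-encoded body, as it would be held in core.
my $raw = join '', "Hi there!\n", "Bye there!\n";

# encode_base64() returns the encoded text followed by a newline.
my $encoded = encode_base64($raw);
print $encoded;    # SGkgdGhlcmUhCkJ5ZSB0aGVyZSEK
```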

If you want the raw body data to be output, you can either read it from the bodyhandle yourself, or use:

$ent->bodyhandle->print;

which uses read() calls to extract the information, and thus will work with both text and binary bodies.

Warning: Please supply a filehandle. This override prints to the currently-selected filehandle when none is given, whereas Mail::Internet's method prints to STDOUT; the difference may lead to confusion.

purge

Instance method. Recursively purge all on-disk body parts in this message. This assumes that the path() method returns something reasonable for the "bodyhandle" object... MIME::Body::File and MIME::Body::Scalar do, at least.

I wouldn't attempt to read those body files after you do this, for obvious reasons. I probably should nuke the bodyhandle's path afterwards, but currently I don't. Don't gamble on this for the future, though.

Thanks to Jason L. Tibbitts III for suggesting this method.

remove_sig [NLINES]

Instance method, override. Attempts to remove a user's signature from the body of a message.

It does this by looking for a line matching /^-- $/ within the last NLINES of the message. If found then that line and all lines after it will be removed. If NLINES is not given, a default value of 10 will be used. This would be of most use in auto-reply scripts.
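
The search described above can be sketched on a plain array of lines. strip_sig is a hypothetical stand-alone function written for illustration; it is not the method's actual code, and it assumes that the earliest delimiter found inside the NLINES window is the one to cut at.

```perl
use strict;
use warnings;

# Hypothetical sketch: remove a signature from an array of lines.
# Looks for a "-- " delimiter line within the last $nlines lines and,
# if one is found, drops that line and everything after it.
sub strip_sig {
    my ($lines, $nlines) = @_;
    $nlines = 10 unless defined $nlines;      # default window of 10 lines

    my $first = @$lines - $nlines;            # index where the window starts
    $first = 0 if $first < 0;

    for my $i ($first .. $#$lines) {
        if ($lines->[$i] =~ /^-- $/) {
            splice(@$lines, $i);              # cut delimiter and the rest
            return 1;
        }
    }
    return 0;                                 # no delimiter in the window
}

my @body = ("Hello,\n", "See you soon.\n", "-- \n", "Jean-Luc\n");
my $removed = strip_sig(\@body);
print "removed: $removed, lines left: ", scalar(@body), "\n";
# removed: 1, lines left: 2
```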

For MIME messages, this method is reasonably cautious: it will only attempt to un-sign a message with a content-type of text/*.

If you send this message to a multipart entity, it will relay it to the first part (the others usually being the "attachments").

Warning: currently slurps the whole message-part into core as an array of lines, so you probably don't want to use this on extremely long messages.

Returns true on success, false on error.

sign PARAMHASH

Instance method, override. Append a signature to the message. The params are:

Attach

Instead of appending the text, try to add it to the message as an attachment. The disposition will be inline, and the description will indicate that it is a signature. Attaching is only done if the message type is multipart; otherwise, we try to append the signature to the text itself. MIME-specific; new in this subclass.

File

Use the contents of this file as the signature. Fatal error if it can't be read. As per superclass method.

Force

Sign it even if the content-type isn't text/*. Useful for non-standard types like x-foobar, but be careful! MIME-specific; new in this subclass.

Remove

Normally, we attempt to strip out any existing signature. If true, this gives us the NLINES parameter of the remove_sig call. If zero but defined, tells us not to remove any existing signature. If undefined, removal is done with the default of 10 lines. New in this subclass.

Signature

Use this text as the signature. You can supply it as either a scalar, or as a ref to an array of newline-terminated scalars. As per superclass method.
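
The three cases of the Remove parameter described above (true, zero but defined, undefined) can be summarized in a small sketch. nlines_for_remove is a hypothetical helper used only to make the rule explicit; it is not part of the class.

```perl
use strict;
use warnings;

# Hypothetical helper: map the Remove parameter to the NLINES value that
# would be handed to remove_sig, or to undef when no removal is wanted.
sub nlines_for_remove {
    my (%params) = @_;
    my $remove = $params{Remove};
    return 10    unless defined $remove;   # not given: default of 10 lines
    return undef unless $remove;           # zero but defined: leave signature
    return $remove;                        # true: use the value as NLINES
}

for my $args ([], [Remove => 0], [Remove => 15]) {
    my $n = nlines_for_remove(@$args);
    print defined $n ? "remove within $n lines\n" : "no removal\n";
}
# remove within 10 lines
# no removal
# remove within 15 lines
```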

For MIME messages, this method is reasonably cautious: it will only attempt to sign a message with a content-type of text/*, unless Force is specified.

If you send this message to a multipart entity, it will relay it to the first part (the others usually being the "attachments").

Warning: currently slurps the whole message-part into core as an array of lines, so you probably don't want to use this on extremely long messages.

Returns true on success, false otherwise.

sync_headers OPTIONS

This method performs a variety of activities that ensure that the MIME headers of an entity "tree" are in sync with the body parts they describe. It can be as expensive an operation as printing if it involves pre-encoding the body parts; however, the aim is to produce fairly clean MIME. You will usually only need to invoke this if processing and re-sending MIME from an outside source.

The OPTIONS is a hash, which describes what is to be done.

Length

One of the "official unofficial" MIME fields is "Content-Length". Normally, one doesn't care a whit about this field; however, if you are preparing output destined for HTTP, you may. The value of this option dictates what will be done:

COMPUTE means to set a Content-Length field for every non-multipart part in the entity, and to blank that field out for every multipart part in the entity.

ERASE means that Content-Length fields will all be blanked out. This is fast, painless, and safe.

Any false value (the default) means to take no action.

Nonstandard

Any header field beginning with "Content-" is, according to the RFC, a MIME field. However, some are non-standard, and may cause problems with certain MIME readers which interpret them in different ways.

ERASE means that all such fields will be blanked out. This is done before the Length option (q.v.) is examined and acted upon.

Any false value (the default) means to take no action.

Returns a true value if everything went okay, a false value otherwise.

tidy_body

Instance method, override. Currently unimplemented for MIME messages. Does nothing, returns false.

NOTES

Under the hood

A MIME::Entity is composed of the following elements:

  • A head, which is a reference to a MIME::Head object containing the header information.

  • A bodyhandle, which is a reference to a MIME::Body object containing the decoded body data. (In pre-2.0 releases, this was accessed via body, which was a path to a file containing the decoded body. Integration with Mail::Internet has forced this to change.)

  • A list of zero or more parts, each of which is a MIME::Entity object. The number of parts will only be nonzero if the content-type is some subtype of "multipart".

    Note that, in 2.0+, a multipart entity does not have a body. Of course, any/all of its component parts can have bodies.
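
The shape of that tree can be pictured with plain hashes. This is only a sketch of the layout: the hash keys (type, body, parts) and the skeleton function are invented for illustration and are not the object's real internals or methods.

```perl
use strict;
use warnings;

# Plain hashes standing in for entities; the keys are illustrative only.
my $tree = {
    type  => 'multipart/mixed',
    body  => undef,                  # a multipart entity has no body
    parts => [
        { type => 'text/plain', body => "Hello\n", parts => [] },
        { type => 'image/gif',  body => "GIF89a",  parts => [] },
    ],
};

# Walk the tree depth-first, returning one indented line per entity.
sub skeleton {
    my ($ent, $depth) = @_;
    $depth ||= 0;
    my $line = ('  ' x $depth) . $ent->{type}
             . (defined $ent->{body} ? ' (has body)' : ' (no body)');
    return ($line, map { skeleton($_, $depth + 1) } @{ $ent->{parts} });
}

print "$_\n" for skeleton($tree);
# multipart/mixed (no body)
#   text/plain (has body)
#   image/gif (has body)
```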

Design issues

Some things just can't be ignored

In multipart messages, the "preamble" is the portion that precedes the first encapsulation boundary, and the "epilogue" is the portion that follows the last encapsulation boundary.

According to RFC-1521:

There appears to be room for additional information prior to the
first encapsulation boundary and following the final boundary.  These
areas should generally be left blank, and implementations must ignore
anything that appears before the first boundary or after the last one.

NOTE: These "preamble" and "epilogue" areas are generally not used
because of the lack of proper typing of these parts and the lack
of clear semantics for handling these areas at gateways,
particularly X.400 gateways.  However, rather than leaving the
preamble area blank, many MIME implementations have found this to
be a convenient place to insert an explanatory note for recipients
who read the message with pre-MIME software, since such notes will
be ignored by MIME-compliant software.

In the world of standards-and-practices, that's the standard. Now for the practice:

Some "MIME" mailers may incorrectly put a "part" in the preamble. Since we have to parse over the stuff anyway, in the future I may allow the parser option of creating special MIME::Entity objects for the preamble and epilogue, with bogus MIME::Head objects.

For now, though, we're MIME-compliant, so I probably won't change how we work.

AUTHOR

Copyright (c) 1996 by Eryq / eryq@rhine.gsfc.nasa.gov

All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

VERSION

$Revision: 3.204 $ $Date: 1997/01/22 08:38:36 $