NAME
MIME::Decoder - an object for decoding the body part of a MIME stream
SYNOPSIS
Decoding a data stream. Here's a simple filter program to read quoted-printable data from STDIN (until EOF) and write the decoded data to STDOUT:
use MIME::Decoder;
$decoder = new MIME::Decoder 'quoted-printable' or die "unsupported";
$decoder->decode(\*STDIN, \*STDOUT);
Encoding a data stream. Here's a simple filter program to read binary data from STDIN (until EOF) and write base64-encoded data to STDOUT:
use MIME::Decoder;
$decoder = new MIME::Decoder 'base64' or die "unsupported";
$decoder->encode(\*STDIN, \*STDOUT);
You can write and install your own decoders so that MIME::Decoder will know about them:
use MyBase64Decoder;
install MyBase64Decoder 'base64';
You can also test if an encoding is supported:
if (MIME::Decoder->supported('x-uuencode')) {
# we can uuencode!
}
DESCRIPTION
This abstract class, and its private concrete subclasses (see below), provide an OO front end to the actions of:
- Decoding a MIME-encoded stream.
- Encoding a raw data stream into a MIME-encoded stream.
The constructor for MIME::Decoder takes the name of an encoding (base64, 7bit, etc.), and returns an instance of a subclass of MIME::Decoder whose decode() method will perform the appropriate decoding action, and whose encode() method will perform the appropriate encoding action.
PUBLIC INTERFACE
Standard interface
If all you are doing is using this class, here's all you'll need...
- new ENCODING
-
Class method. Create and return a new decoder object which can handle the given ENCODING.
my $decoder = new MIME::Decoder "7bit";
Returns the undefined value if no known decoders are appropriate.
- decode INSTREAM,OUTSTREAM
-
Decode the document waiting in the input handle INSTREAM, writing the decoded information to the output handle OUTSTREAM.
Read the section in this document on I/O handles for more information about the arguments. Note that you can still supply old-style unblessed filehandles for INSTREAM and OUTSTREAM.
- encode INSTREAM,OUTSTREAM
-
Encode the document waiting in the input handle INSTREAM, writing the encoded information to the output handle OUTSTREAM.
Read the section in this document on I/O handles for more information about the arguments. Note that you can still supply old-style unblessed filehandles for INSTREAM and OUTSTREAM.
- encoding
-
Return the encoding that this object was created to handle, coerced to all lowercase (e.g., "base64").
- supported [ENCODING]
-
Class method. With one arg (an ENCODING name), returns truth if that encoding is currently handled, and falsity otherwise. The ENCODING will be automatically coerced to lowercase:
if (MIME::Decoder->supported('7BIT')) {
    # yes, we can handle it...
}
else {
    # drop back six and punt...
}
With no args, returns all the available decoders as a hash reference... where the key is the encoding name (all lowercase, like '7bit'), and the associated value is true (it happens to be the name of the class that handles the decoding, but you probably shouldn't rely on that). Hence:
my $supported = MIME::Decoder->supported;
if ($supported->{'7bit'}) {
    # yes, we can handle it...
}
elsif ($supported->{'8bit'}) {
    # yes, we can handle it...
}
You may safely modify this hash; it will not change the way the module performs its lookups. Only install can do that.
Thanks to Achim Bohnet for suggesting this method.
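The copy semantics described above can be sketched without the module itself. This is only an illustration under stated assumptions: the %registry hash, the Sketch:: class names, and this stand-alone supported() function are hypothetical and are not the module's internals.

```perl
use strict;
use warnings;

# Hypothetical registry: encoding name (lowercase) => handler class name.
my %registry = ('7bit' => 'Sketch::Xbit', 'base64' => 'Sketch::Base64');

# With an argument: true/false lookup after lowercasing the name.
# Without one: a reference to a fresh copy of the registry.
sub supported {
    my ($encoding) = @_;
    unless (defined $encoding) {
        my %copy = %registry;
        return \%copy;
    }
    return exists $registry{lc $encoding} ? 1 : 0;
}

my $snapshot = supported();
delete $snapshot->{'7bit'};             # changes only the caller's copy
my $still_there = supported('7BIT');    # the registry itself is untouched
my $unknown     = supported('x-foo');
```

Because the caller receives a copy, deleting a key from it has no effect on later lookups.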
Subclass interface
If you are writing (or installing) a new decoder subclass, there are some other methods you'll need to know about:
- decode_it INSTREAM,OUTSTREAM
-
Abstract instance method. The back-end of the decode method. It takes an input handle opened for reading (INSTREAM), and an output handle opened for writing (OUTSTREAM).
If you are writing your own decoder subclass, you must override this method in your class. Your method should read from the input handle via getline() or read(), decode this input, and print the decoded data to the output handle via print(). You may do this however you see fit, so long as the end result is the same.
Note that unblessed references and globrefs are automatically turned into I/O handles for you by decode(), so you don't need to worry about it.
Your method must return either undef (to indicate failure), or 1 (to indicate success).
- encode_it INSTREAM,OUTSTREAM
-
Abstract instance method. The back-end of the encode method. It takes an input handle opened for reading (INSTREAM), and an output handle opened for writing (OUTSTREAM).
If you are writing your own decoder subclass, you must override this method in your class. Your method should read from the input handle via getline() or read(), encode this input, and print the encoded data to the output handle via print(). You may do this however you see fit, so long as the end result is the same.
Note that unblessed references and globrefs are automatically turned into I/O handles for you by encode(), so you don't need to worry about it.
Your method must return either undef (to indicate failure), or 1 (to indicate success).
- init ARGS...
-
Instance method. Do any necessary initialization of the new instance, taking whatever arguments were given to new(). Should return the self object on success, undef on failure.
- install ENCODING
-
Class method. Install this class so that ENCODING is handled by it. You should not override this method.
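How install, new and init can fit together is sketched below. This is a guess at the general pattern, not the module's source: the packages Sketch::Decoder and Sketch::Rot13, the %handler_for hash, and the "x-rot13" encoding name are all hypothetical.

```perl
use strict;
use warnings;

package Sketch::Decoder;             # hypothetical stand-in for the base class
my %handler_for;                     # encoding name (lowercase) => class name

# Class method: register the invoking class as the handler for an encoding.
sub install {
    my ($class, $encoding) = @_;
    $handler_for{lc $encoding} = $class;
    return $class;
}

# Class method: build an object of whichever class handles the encoding.
sub new {
    my ($class, $encoding, @args) = @_;
    my $handler = $handler_for{lc $encoding} or return undef;
    my $self = bless { encoding => lc $encoding }, $handler;
    return $self->init(@args);       # init returns the object, or undef on failure
}

sub init     { my ($self) = @_; return $self }    # default: nothing to set up
sub encoding { my ($self) = @_; return $self->{encoding} }

package Sketch::Rot13;               # hypothetical subclass
our @ISA = ('Sketch::Decoder');

package main;

Sketch::Rot13->install('X-Rot13');
my $decoder = Sketch::Decoder->new('x-rot13');
my $missing = Sketch::Decoder->new('x-unknown');
```

In this sketch the subclass inherits install unchanged, which matches the advice above not to override it.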
BUILT-IN DECODER SUBCLASSES
You don't need to "use" any other Perl modules; the following are included as part of MIME::Decoder.
MIME::Decoder::Base64
The built-in decoder for the "base64" encoding.
The name was chosen to jibe with the pre-existing MIME::Base64 utility package, which this class actually uses to translate each line.
When decoding, the input is read one line at a time. The input accumulates in an internal buffer, which is decoded in multiple-of-4-sized chunks (plus a possible "leftover" input chunk, of course).
When encoding, the input is read 45 bytes at a time, which keeps the output lines short. We chose 45 because it is a multiple of 3, so each full chunk encodes without padding to exactly 60 characters, under the 76-character limit that RFC-1521 specifies.
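The chunk sizes described above can be checked with MIME::Base64 alone. The following is a minimal sketch rather than the class's actual code; the helper names encode_chunks and decode_lines are hypothetical, and the sample data is made up.

```perl
use strict;
use warnings;
use MIME::Base64 qw(encode_base64 decode_base64);

# Hypothetical helper: encode raw bytes 45 at a time, one output line per chunk.
sub encode_chunks {
    my ($raw) = @_;
    my @lines;
    for (my $pos = 0; $pos < length $raw; $pos += 45) {
        # 45 bytes = 15 groups of 3 bytes, which become 15 * 4 = 60 characters
        push @lines, encode_base64(substr($raw, $pos, 45), '') . "\n";
    }
    return @lines;
}

# Hypothetical helper: decode line by line, buffering input and decoding
# only multiple-of-4-sized pieces; any leftover is decoded at the end.
sub decode_lines {
    my (@lines) = @_;
    my ($buffer, $out) = ('', '');
    for my $line (@lines) {
        (my $clean = $line) =~ tr{A-Za-z0-9+/=}{}cd;    # drop newlines and other noise
        $buffer .= $clean;
        my $usable = length($buffer) - length($buffer) % 4;
        $out .= decode_base64(substr($buffer, 0, $usable, ''));
    }
    $out .= decode_base64($buffer) if length $buffer;   # possible leftover
    return $out;
}

my $raw     = join '', map { chr } 0 .. 99;    # 100 bytes of sample data
my @encoded = encode_chunks($raw);             # chunks of 45, 45 and 10 bytes
my $decoded = decode_lines(@encoded);
```

Only the final, shorter chunk can carry "=" padding, which is why reading in multiples of 3 keeps the stream decodable piece by piece.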
Thanks to Phil Abercrombie for locating one idiotic bug in this module, which led me to discover another.
MIME::Decoder::Binary
The built-in decoder for a "binary" encoding (in other words, no encoding).
The "binary" decoder is a special case, since it's ill-advised to read the input line-by-line: after all, an uncompressed image file might conceivably have loooooooooong stretches of bytes without a "\n" among them, and we don't want to risk blowing out our core. So, we read-and-write fixed-size chunks.
Both the encoder and decoder do a simple pass-through of the data from input to output.
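A fixed-size pass-through of this kind might look like the sketch below. The helper name copy_blocks and the 4096-byte default are assumptions; the document does not state the chunk size the class uses.

```perl
use strict;
use warnings;

# Hypothetical helper: copy $in to $out in fixed-size blocks, never by line.
# Returns 1 on success and undef on a read or write error.
sub copy_blocks {
    my ($in, $out, $block_size) = @_;
    $block_size ||= 4096;                      # assumed default size
    my $buf = '';
    while (1) {
        my $nread = read($in, $buf, $block_size);
        return undef unless defined $nread;    # read error
        last if $nread == 0;                   # end of input
        print {$out} $buf or return undef;     # write error
    }
    return 1;
}

my $data = "\0\xFF" x 5000;                    # 10,000 bytes with no "\n" anywhere
my $copy = '';
open my $in,  '<', \$data or die "cannot open in-memory input: $!";
open my $out, '>', \$copy or die "cannot open in-memory output: $!";
binmode $_ for $in, $out;
my $ok = copy_blocks($in, $out, 1024);
close $out;
```

Memory use is bounded by the block size, however long the input runs without a newline.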
MIME::Decoder::QuotedPrint
The built-in decoder for the "quoted-printable" encoding.
The name was chosen to jibe with the pre-existing MIME::QuotedPrint utility package, which this class actually uses to translate each line.
The decoder does a line-by-line translation from input to output.
The encoder does a line-by-line translation, breaking lines so that they fall under the standard 76-character limit for this encoding.
Note: just like MIME::QuotedPrint, we currently use the native "\n" for line breaks, and not CRLF. This may need to change in future versions.
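The line-length and line-break behavior can be observed directly with MIME::QuotedPrint, which this class relies on. This is a small sketch with made-up sample text, calling the utility package rather than the decoder class.

```perl
use strict;
use warnings;
use MIME::QuotedPrint qw(encode_qp decode_qp);

# One long line of Latin-1 bytes: "café " repeated 40 times, then a newline.
my $word    = 'caf' . chr(0xE9) . ' ';
my $long    = ($word x 40) . "\n";

my $encoded = encode_qp($long);                 # soft line breaks are inserted
my @lines   = split /\n/, $encoded;
my ($longest) = sort { $b <=> $a } map { length } @lines;
my $decoded = decode_qp($encoded);              # soft breaks are removed again
```

Each encoded line ends in the native "\n", and the soft line breaks (a trailing "=") disappear again on decoding.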
MIME::Decoder::Xbit
The built-in decoder for both "7bit" and "8bit" encodings, which guarantee short lines (a maximum of 1000 characters per line) of US-ASCII data compatible with RFC-821.
The decoder does a line-by-line pass-through from input to output, leaving the data unchanged except that an end-of-line sequence of CRLF is converted to a newline "\n".
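The end-of-line handling described above amounts to a one-line substitution. The helper name xbit_decode_line is hypothetical and the sketch ignores everything else the decoder may do.

```perl
use strict;
use warnings;

# Hypothetical helper: pass a line through unchanged, except that a
# trailing CRLF becomes a bare "\n".
sub xbit_decode_line {
    my ($line) = @_;
    $line =~ s/\r\n\z/\n/;
    return $line;
}

my @in  = ("Subject: test\r\n", "already native\n", "no newline at end");
my @out = map { xbit_decode_line($_) } @in;
```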
The encoder does a line-by-line pass-through from input to output, splitting long lines if necessary. If created as a 7-bit encoder, any 8-bit characters are mapped to zero or more 7-bit characters: note that this is a potentially lossy encoding if you hand it anything but 7-bit input: therefore, don't use it on binary files (GIFs) and the like; use it only when it "doesn't matter" if extra newlines are inserted and 8-bit characters are squished.
There are several possible ways to use this class to encode arbitrary 8-bit text as 7-bit text:
- Don't use this class.
-
Really. Use a more-appropriate encoding, like quoted-printable.
- APPROX
-
Approximate the appearance of the Latin-1 character via Internet conventions; e.g., "\c,", "\n~", etc. This is the default behavior of this class.
- CLEARBIT8
-
Just clear the 8th bit. Yuck. Sort of a sledgehammer approach. Not recommended at all.
- ENTITY
-
Output as an HTML-style entity; e.g., "&#189;". This sounds like a good idea, until you see some French text which is actually encoded this way... yuck. You're better off with quoted-printable.
- STRIP
-
Strip out any 8-bit characters. Nice if you're really sure that any such characters in your input are mistakes to be deleted, but it'll transform non-English documents into an abbreviated mess.
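To make the trade-offs concrete, three of the strategies listed above can be sketched as simple substitutions. These are illustrative stand-ins, not the class's implementation; the %strategy table is hypothetical, and APPROX is left out because its mapping table is not given here.

```perl
use strict;
use warnings;

# Each sub takes a byte string and returns 7-bit text.
my %strategy = (
    # Clear the 8th bit of every 8-bit byte.
    CLEARBIT8 => sub { my $s = shift; $s =~ s/([\x80-\xFF])/chr(ord($1) & 0x7F)/ge; $s },
    # Replace every 8-bit byte with a numeric HTML-style entity.
    ENTITY    => sub { my $s = shift; $s =~ s/([\x80-\xFF])/'&#' . ord($1) . ';'/ge; $s },
    # Delete every 8-bit byte.
    STRIP     => sub { my $s = shift; $s =~ tr/\x80-\xFF//d; $s },
);

my $latin1 = 'fa' . chr(0xE7) . 'ade';    # "façade" as Latin-1 bytes
my %result = map { $_ => $strategy{$_}->($latin1) } keys %strategy;

# $result{CLEARBIT8} is "fagade"      (0xE7 & 0x7F is 0x67, the letter "g")
# $result{ENTITY}    is "fa&#231;ade"
# $result{STRIP}     is "faade"
```

All three lose or disguise information, which is why the first item in the list recommends a different encoding altogether.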
NOTES
Input/Output handles
As of MIME-tools 2.0, this class has to play nice with the new MIME::Body class... which means that input and output routines cannot just assume that they are dealing with filehandles.
Therefore, all that MIME::Decoder and its subclasses require (and, thus, all that they can assume) is that INSTREAMs and OUTSTREAMs are objects which respond to the messages defined in MIME::IO (basically, a subset of those defined by IO::Handle).
For backwards compatibility, if you supply a scalar filehandle name (like "STDOUT") or an unblessed glob reference (like \*STDOUT) where an INSTREAM or OUTSTREAM is expected, this package will automatically wrap it in an object that fits the I/O handle criteria.
Thanks to Achim Bohnet for suggesting this more-generic I/O model.
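The practical consequence is that any object answering the right messages will do. The sketch below assumes only getline() on input and print() on output; the classes Sketch::LineSource and Sketch::StringSink and the routine upcase_stream are hypothetical and are not part of MIME::IO.

```perl
use strict;
use warnings;

# Hypothetical output object: collects everything printed to it.
package Sketch::StringSink;
sub new   { my ($class) = @_; my $text = ''; return bless \$text, $class }
sub print { my $self = shift; $$self .= join('', @_); return 1 }
sub text  { my $self = shift; return $$self }

# Hypothetical input object: hands out prepared lines one at a time,
# then undef once they run out.
package Sketch::LineSource;
sub new     { my ($class, @lines) = @_; return bless [@lines], $class }
sub getline { my $self = shift; return shift @$self }

package main;

# A routine written against the handle interface only, never a real filehandle.
sub upcase_stream {
    my ($in, $out) = @_;
    while (defined(my $line = $in->getline)) {
        $out->print(uc $line);
    }
    return 1;
}

my $in  = Sketch::LineSource->new("first\n", "second\n");
my $out = Sketch::StringSink->new;
my $ok  = upcase_stream($in, $out);
```

A decode_it or encode_it method written the same way works equally well with files, sockets, or in-memory bodies.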
Writing a decoder
If you're experimenting with your own encodings, you'll probably want to write a decoder. Here are the basics:
Create a module, like "MyDecoder::", for your decoder. Declare it to be a subclass of MIME::Decoder.
Create the following instance methods in your class, as described above:
decode_it
encode_it
init
In your application program, activate your decoder for one or more encodings like this:
require MyDecoder;
install MyDecoder "7bit";     # use MyDecoder to decode "7bit"
install MyDecoder "x-foo";    # also, use MyDecoder to decode "x-foo"
To illustrate, here's a custom decoder class for the quoted-printable encoding:
package MyQPDecoder;
use MIME::Decoder;
use MIME::QuotedPrint;
@ISA = qw(MIME::Decoder);

# decode_it - the private decoding method
sub decode_it {
    my ($self, $in, $out) = @_;
    # Quoted-printable can be decoded one line at a time.
    while (defined(my $line = $in->getline())) {
        my $decoded = decode_qp($line);
        $out->print($decoded);
    }
    1;    # success
}

# encode_it - the private encoding method
sub encode_it {
    my ($self, $in, $out) = @_;
    my $buf = '';
    # Read up to 60 bytes at a time and encode each block.
    while ($in->read($buf, 60)) {
        my $encoded = encode_qp($buf);
        $out->print($encoded);
    }
    1;    # success
}
That's it.
The task was pretty simple because the "quoted-printable" encoding can easily be converted line-by-line... as can even "7bit" and "8bit" (since all these encodings guarantee short lines, with a max of 1000 characters). The good news is: it is very likely that it will be similarly easy to write a MIME::Decoder for any future standard encodings.
The "binary" decoder, however, really required block reads and writes: see "MIME::Decoder::Binary" for details.
SEE ALSO
MIME::Decoder, MIME::Entity, MIME::Head, MIME::Parser.
AUTHOR
Copyright (c) 1996 by Eryq / eryq@rhine.gsfc.nasa.gov
All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
VERSION
$Revision: 2.9 $ $Date: 1997/01/03 21:06:09 $