NAME
Email::MIME::XPath - access MIME documents via XPath queries
VERSION
Version 0.001
SYNOPSIS
use Email::MIME;
use Email::MIME::XPath;
my $email = Email::MIME->new($data);
# find just the first text/plain node, no matter how many there are
my ($part) = $email->xpath_findnodes('//plain');
# find the only text/html node, and die if there is more than one
$part = $email->xpath_findnode('//html');
# look for a png by filename
$part = $email->xpath_findnode('//png[@filename="image.png"]');
# retrieve a part by previously-stored address
my $address = $part->xpath_address;
# ... later ...
$part = $email->xpath_findnode(qq{//*[\@address="$address"]});
DESCRIPTION
Dealing with MIME messages can be complicated. Frequently you want to display certain parts of a message, while alluding to (linking, summarizing, whatever) other parts in a way that makes them easy to get to later. Sometimes this can go several levels deep, if you're dealing with forwarded messages, bounces, or reports of some kind.
Referring back to sub-parts of an arbitrarily deep MIME message is especially tedious, and that is the task this module attempts to make easier.
Most of this module's functionality is provided by Tree::XPathEngine. Refer to its documentation for details. In particular, each of these methods is just a wrapper around the method of the same name with xpath_ removed:
xpath_findnodes
xpath_findnodes_as_string
xpath_findvalue
xpath_exists
xpath_matches
xpath_find
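The wrapper relationship described above can be pictured with a small sketch. This is not the module's actual source: Demo::Engine and Demo::Message are hypothetical stand-ins (for Tree::XPathEngine and Email::MIME respectively), and the stub engine only echoes the call it received. The argument lists are simplified and may not match the real engine's signature for every method.

```perl
use strict;
use warnings;

# Hypothetical stand-in for Tree::XPathEngine; each method just reports
# which call it received instead of evaluating anything.
package Demo::Engine;
sub new { return bless {}, shift }
for my $name (qw(findnodes findnodes_as_string findvalue exists matches find)) {
    no strict 'refs';
    *{"Demo::Engine::$name"} = sub {
        my ($self, $path, $context) = @_;
        return "$name($path)";
    };
}

# Hypothetical message class that gains one xpath_* wrapper per engine method.
package Demo::Message;
sub new { return bless { engine => Demo::Engine->new }, shift }
for my $name (qw(findnodes findnodes_as_string findvalue exists matches find)) {
    no strict 'refs';
    *{"Demo::Message::xpath_$name"} = sub {
        my ($self, $path) = @_;
        # Delegate to the engine, passing the message itself as the context.
        return $self->{engine}->$name($path, $self);
    };
}

package main;
my $msg = Demo::Message->new;
print $msg->xpath_exists('//html'), "\n";   # prints "exists(//html)"
```

The point of the sketch is only that xpath_exists forwards to exists, xpath_findvalue to findvalue, and so on; return values and error behavior are whatever Tree::XPathEngine documents for the underlying method.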
Two other useful methods are made available by Email::MIME::XPath:
xpath_findnode
This is a wrapper around xpath_findnodes that dies if more than one node is matched.
TODO: should this also die if no nodes are found?
xpath_address
This method returns a per-message unique address for a particular part. This address is also available as the 'address' attribute in XPath queries; see "Attributes".
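Since xpath_findnode only guards against too many matches, callers that also need to treat "no match" as an error can layer a small check on top of xpath_findnodes. The following is a minimal sketch under that assumption; find_exactly_one is a hypothetical helper name, not part of Email::MIME::XPath, and it relies only on the documented xpath_findnodes method.

```perl
use strict;
use warnings;
use Carp qw(croak);

# Hypothetical helper: return the single node matching $path,
# or die if the query matched zero nodes or more than one.
sub find_exactly_one {
    my ($email, $path) = @_;
    my @nodes = $email->xpath_findnodes($path);
    croak "no nodes matched '$path'" unless @nodes;
    croak "more than one node matched '$path'" if @nodes > 1;
    return $nodes[0];
}
```

With Email::MIME::XPath loaded, it would be called as my $html = find_exactly_one($email, '//html'); and wrapped in eval where a missing part should be tolerated.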
DOM
XPath expects to work on a tree that is DOM-like. MIME documents are trees, and this module fakes up enough structure to make XPath useful.
Elements (MIME parts) are given a name that corresponds to the second part of their Content-Type, e.g.
multipart/mixed = 'mixed'
text/plain = 'plain'
I am open to changing this. In particular, I would have just used the entire Content-Type, but using '/' in names would have been problematic and I didn't want to replace it with something else. Most names should be unique, anyway; I've never seen 'multipart/png' or 'image/html'. Feel free to enlighten me.
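The naming rule can be illustrated with a few lines of plain Perl. This is a sketch of the rule as described here, not the module's own code: element_name_for is a hypothetical function, and details such as lowercasing and parameter stripping are assumptions.

```perl
use strict;
use warnings;

# Hypothetical helper: derive an element name from a Content-Type value
# by keeping only the subtype (the part after the '/').
sub element_name_for {
    my ($content_type) = @_;
    # Drop parameters such as '; charset="utf-8"', then trim whitespace.
    (my $type = $content_type) =~ s/;.*//s;
    $type =~ s/^\s+|\s+$//g;
    my (undef, $subtype) = split m{/}, $type, 2;
    # Lowercasing is an assumption made for this sketch.
    return defined $subtype ? lc $subtype : undef;
}

for my $ct ('multipart/mixed', 'text/plain; charset="utf-8"') {
    my $name = element_name_for($ct);
    print "$ct => $name\n";
}
```

The loop prints 'mixed' for multipart/mixed and 'plain' for text/plain, which are the names the XPath queries in the synopsis rely on.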
Attributes
subject
from
to
cc
content_type
All of these attributes are pulled directly from the headers.
filename
For parts with a Content-Disposition header, the filename is pulled from it.
address
This attribute is assigned by Email::MIME::XPath as it crawls through the MIME structure (see "GUTS"). For any given top-level MIME document, the address attribute for each subpart will be stable over time. If you do your XPath queries from somewhere other than the top-level MIME part, the addresses will be different and probably not very useful.
Do not depend on any particular value for any particular address; it should only be used for temporary reference, not permanent storage. In particular, it may change between versions of Email::MIME::XPath, though such changes will be announced ahead of time. In the future, it may be possible to specify how addresses should be assigned on a per-application basis; presumably then they could be depended on.
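When an address is stored and later interpolated into a query, it is worth building the query string in one place. The sketch below assumes nothing about the address format; address_query is a hypothetical helper. It escapes the '@' so Perl does not try to interpolate an array, and it rejects addresses containing a double quote because XPath 1.0 string literals have no escape syntax.

```perl
use strict;
use warnings;
use Carp qw(croak);

# Hypothetical helper: build the XPath query that looks a part up by a
# previously stored address.
sub address_query {
    my ($address) = @_;
    croak "no address given" unless defined $address && length $address;
    # A double quote would terminate the XPath string literal early.
    croak "address contains a double quote" if $address =~ /"/;
    # \@ keeps Perl from interpolating an array named @address.
    return qq{//*[\@address="$address"]};
}
```

The result can be passed straight to xpath_findnode on the same top-level message that produced the address, e.g. $email->xpath_findnode(address_query($address)).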
GUTS
This module does a few odd things to work around unfriendly behavior in Email::MIME. For example, Email::MIME lets MIME parts be used in several larger MIME documents at once. Not only do individual parts not know what their parent is, they *can't* know, because a single part could be in multiple trees at once. Email::MIME::XPath tries to impose a tree structure on relevant MIME objects without getting in the way, but there are undoubtedly bugs and unexpected behavior that will arise.
TODO
Some of the XPath supported by Tree::XPathEngine doesn't work yet, in particular doing anything with siblings. Other syntax may work, but in general it is not yet thoroughly tested.
SEE ALSO
Tree::XPathEngine, Email::MIME
AUTHOR
Hans Dieter Pearcey, <hdp at cpan.org>
BUGS
Please report any bugs or feature requests to bug-email-mime-xpath at rt.cpan.org, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Email-MIME-XPath. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.
SUPPORT
You can find documentation for this module with the perldoc command.
perldoc Email::MIME::XPath
You can also look for information at:
AnnoCPAN: Annotated CPAN documentation
CPAN Ratings
RT: CPAN's request tracker
Search CPAN
ACKNOWLEDGEMENTS
Thanks to Listbox.com, who sponsored the development of this module.
COPYRIGHT & LICENSE
Copyright 2007 Hans Dieter Pearcey, all rights reserved.
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.