NAME

Email::Store - Framework for database-backed email storage

SYNOPSIS

use Email::Store 'dbi:mysql:mailstore';
Email::Store->setup; # Do this once

Email::Store::Mail->store( $rfc822 );
Email::Store::Mail->retrieve( $msgid );

...

DESCRIPTION

Email::Store is the ideal basis for any application which needs to deal with databases of email: archiving, searching, or even storing mail for implementing IMAP or POP3 servers.

Email::Store itself is a very lightweight framework, meaning it does not provide very much functionality itself; in effect, it is merely a Class::DBI interface to a database schema which is designed for storing email. Incidentally, if you don't know much about Class::DBI, you will need to learn it in order to get much out of this.

Despite its minimalist nature, Email::Store is incredibly powerful. Its power comes from its extensibility, through plugin modules and hooks which allow you to add new database tables and concepts to the system, and so access the mail store from a "different direction". In a sense, Email::Store is a blank canvas, onto which you can pick and choose (or even write!) the plugins which you want for your application.

For instance, the core Email::Store::Entity plugin module addresses the idea of "people" in the email universe, allowing you to search for mails to or from particular people (despite their changing names or email addresses); Email::Store::Thread interfaces Email::Store to Mail::Thread, allowing you to navigate mails by their position in a mail thread; the planned non-core Email::Store::Plucene module plugs into the indexing process and stores information about emails in a Plucene search index for quick retrieval later; and so on.

OPTIONS

The generic way to use Email::Store is

use Email::Store 'dbi:mysql:mailstore';

However, you can also pass a hash ref as the first argument.

This hash ref can contain arbitrary key/value pairs; however, the only keys currently supported are only and except, both of which take an array ref of plugin names. only means that Email::Store will load just those plugins; except means it will load everything except those.

use Email::Store { only => [ "Email::Store::Mail" ] }, 'dbi:mysql:mailstore';
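The except key is used in the same way. As a sketch, assuming your application has no use for attachment handling (whether the remaining plugins are happy without Email::Store::Attachment is not something this document states, so treat the choice of plugin as illustrative):

```perl
# Load every plugin that can be found, apart from the one named here.
# The plugin chosen for exclusion is only an illustration.
use Email::Store { except => [ "Email::Store::Attachment" ] }, 'dbi:mysql:mailstore';
```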

Core Email::Store modules

To get you started with a useful database, Email::Store provides a few core plugin modules which comprise the basics of a mailstore. Each of the modules provides one or more database tables, representing important concepts in the email world, and one or more relationships between these concepts and the other tables in the system. It's a little less complicated than that, as we'll see when we go through each module in turn. Here is a quick summary of what the core modules do:

Email::Store::Mail

This is, in a sense, the plugin of plugins. Email::Store::Mail encapsulates individual email messages. Its store method is the means by which emails are indexed and enter the mailstore. How this storing is done, however, is influenced by all the other plugins.

Email::Store::List

List is one of the easiest plugins to understand. To our concept of the mail, it adds the concept of a mailing list.

Email::Store::List hooks into the indexing process and examines a mail to see if it came via a mailing list; if so, it associates the mail with one or more lists. This means you can ask a mail object for its lists, and a list object for its posts. Because of this, instead of looking at messages by their message ID, you can start by looking for a mailing list you're interested in and then navigate to the messages you want.
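A rough sketch of that navigation follows. The accessor names lists and posts are assumptions taken from the wording above, and the SQLite connection string is only an example; check the Email::Store::List documentation for the real interface.

```perl
use strict;
use warnings;
use Email::Store 'dbi:SQLite:dbname=mailstore.db';

my $msgid = shift or die "usage: $0 message-id\n";
my $mail  = Email::Store::Mail->retrieve($msgid)
    or die "No such mail: $msgid\n";

# ASSUMPTION: a mail object offers lists(), and a list object offers posts()
for my $list ($mail->lists) {
    my @posts = $list->posts;
    printf "This mail arrived via a list holding %d stored post(s)\n",
        scalar @posts;
}
```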

Email::Store::Date

This adds the date method to a mail object, returning a Time::Piece representing the date of the email.
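Since the return value is an ordinary Time::Piece object, all the usual Time::Piece methods are available on it. The sketch below builds a stand-in object by hand so that it runs without a mail store; in real code the object would come from $mail->date instead, and the timestamp used here is invented for the example.

```perl
use strict;
use warnings;
use Time::Piece;

# Stand-in for $mail->date; the timestamp is made up for illustration.
my $date = Time::Piece->strptime('2004-06-15 09:30:00', '%Y-%m-%d %H:%M:%S');

print $date->ymd,  "\n";    # ISO-style date: 2004-06-15
print $date->year, "\n";    # 2004

# Time::Piece objects compare numerically, so mails can be ordered by date.
my $earlier = Time::Piece->strptime('2004-01-01 00:00:00', '%Y-%m-%d %H:%M:%S');
print "stand-in date is the later one\n" if $date > $earlier;
```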

Email::Store::Entity

Entity is the most fundamental of the plugins but (or perhaps, "thus") the most complex. This module adds the concept of an addressing, which abstracts out the From, To, Cc and Bcc headers of an email. A "To" header, for instance, says that the mail is addressed to a particular name and address, but Email::Store::Entity also provides the potential for associating different names and addresses with the concept of an entity, a unique individual. That is, not all mails addressed to the name "Simon Cozens" are to me (due to the existence of multiple Simon Cozenses in the world) but all mails to .*@simon-cozens.org are, despite there being multiple email addresses which match that pattern.

If that has you confused (and believe me, it has me confused), ignore the "entity" bit and know that you can navigate from names, addresses and the intersection of the two, to emails involving them. More details in Email::Store::Entity, as you'd expect.

Email::Store::Attachment

As you might be able to guess, this adds the concept of an attachment. It also ambushes the indexing process, and strips all the MIME attachments off an email, placing them in the attachments table. It then quietly slips the de-MIMEd email back into the mail table, and now you can ask a mail for its attachments.

All these modules have some degree of POD, so you should consult them for more details on the interface that they provide. Over time, there will be additional modules that you can install from CPAN.

USAGE

When you use Email::Store, you should pass a DBI connection string to its use statement:

use Email::Store 'dbi:SQLite:dbname=mailstore.db';

In order to create the tables used by the plugin modules, you should then say

Email::Store->setup;

You should do this on the initial set-up of your database, and then again on installing any additional plugin modules, to create the new tables they want to use. Note that this does not retroactively index existing mail with the new functions provided by the modules you've just installed; a reindex method is planned, but is not there yet.

This is all the functionality that Email::Store itself provides. See the documentation to the various plugins for their public interface, chiefly Email::Store::Mail.

THE PLUGIN SYSTEM

If you want to write your own plugins, you will need to know how the plugin system works.

The first thing to note is that when Email::Store indexes a mail, whether for the first time or when it re-indexes a mail it's seen before, it loads up all the modules it can find under the Email::Store::* hierarchy. Additionally, when Email::Store->setup is called, all the Email::Store::* modules are required. So, to register your new plugin, all you need to do is call it Email::Store::something and put it in Perl's include path in the usual way.

Each plugin module should be a self-contained description of some concepts, the database schema that encapsulates them, their relationship to the rest of the system, and any hooks or additional functionality provided.

Let us write a very simple plugin as a first example. This will introduce the concept of a mail annotation, an open-ended space where we can store "sticky notes" which relate to a particular email. We'll call the plugin Email::Store::Annotation, and we start by putting the following in Email/Store/Annotation.pm:

package Email::Store::Annotation;
use base 'Email::Store::DBI';

This makes us a Class::DBI-based package. Next we need to do the usual Class::DBI thing and declare our table and columns:

Email::Store::Annotation->table("mail_annotation");
Email::Store::Annotation->columns(All => qw/id mail content/);

Next we declare how this fits into the rest of the world: an Email::Store::Mail has many annotations:

Email::Store::Mail->has_many(annotations => "Email::Store::Annotation");

Annotations are something that the utility which uses Email::Store is going to create, modify and delete manually; we can hardly auto-generate a user-defined annotation when a mail is indexed, so we don't need to define any hooks into the indexing process. In fact, this is all the code we need to write, so we end the package in the usual way:

1;

If we did need to hook into a different part of Email::Store, we'd have to use Module::Pluggable::Ordered's plugin mechanism. See Email::Store::Mail for the hooks provided and how to hook into them.
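To give a feel for the convention, here is a self-contained toy. It is not the real Module::Pluggable::Ordered code: the call_hooks dispatcher and the Toy::Plugin packages are stand-ins written for this example, and the hook name on_store is an assumption. What it illustrates is the ordering idea: a plugin supplies a hook method plus a matching _order method, and hooks with lower order values run first.

```perl
use strict;
use warnings;

# Two toy plugins. Each offers an on_store hook and an on_store_order
# method saying where in the queue it wants to run.
package Toy::Plugin::Late;
sub on_store_order { 90 }
sub on_store { my ($class, $mail) = @_; push @{ $mail->{seen_by} }, 'Late' }

package Toy::Plugin::Early;
sub on_store_order { 10 }
sub on_store { my ($class, $mail) = @_; push @{ $mail->{seen_by} }, 'Early' }

package main;

# Stand-in dispatcher (hypothetical, for illustration only): call the
# named hook on every plugin that implements it, lowest order first.
sub call_hooks {
    my ($hook, @args) = @_;
    my $order   = "${hook}_order";
    my @plugins = grep { $_->can($hook) }
                  qw(Toy::Plugin::Late Toy::Plugin::Early);
    for my $plugin (sort { $a->$order <=> $b->$order } @plugins) {
        $plugin->$hook(@args);
    }
}

my $mail = { seen_by => [] };    # a plain hash standing in for a mail object
call_hooks(on_store => $mail);
print join(', ', @{ $mail->{seen_by} }), "\n";    # Early, Late
```

Although Toy::Plugin::Late is listed first, Toy::Plugin::Early runs first because its order value is lower.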

But where does this mail_annotation table come from? How does Email::Store know how to create it? The answer comes when we put the schema into the __DATA__ section: Email::Store->setup reads all the DATA sections for the plugins that it finds, and executes them as SQL in the database. As pretty much every database's SQL is subtly different, the schema should be written in MySQL's SQL and Email::Store will magically translate it for the database in use:

__DATA__
CREATE TABLE IF NOT EXISTS mail_annotation (
    id INTEGER auto_increment NOT NULL PRIMARY KEY,
    mail INTEGER,
    content TEXT
);
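Assembled in one place, the whole of Email/Store/Annotation.pm would look something like the following. Nothing here goes beyond the fragments already shown; only the comments are new.

```perl
package Email::Store::Annotation;
use base 'Email::Store::DBI';

# The table and columns that hold the annotations
Email::Store::Annotation->table("mail_annotation");
Email::Store::Annotation->columns(All => qw/id mail content/);

# Relationship to the rest of the system: a mail has many annotations
Email::Store::Mail->has_many(annotations => "Email::Store::Annotation");

1;

__DATA__
CREATE TABLE IF NOT EXISTS mail_annotation (
    id INTEGER auto_increment NOT NULL PRIMARY KEY,
    mail INTEGER,
    content TEXT
);
```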

With this module complete and installed, an Email::Store user can now say:

my $mail = Email::Store::Mail->retrieve( $msg_id );
$mail->add_to_annotations({ content => "I like this mail" });
print "Things I know about this mail:\n";
print $_->content, "\n" for $mail->annotations;

The really big advantage of this architecture is that everything about a concept and its relationship to the mailstore is encapsulated in a single file and can be dropped in and out at will, without disturbing the rest of the code. This is fantastic extensibility. Email::Store does not need to define a schema of every single table you might possibly need up front, but everything is modularised.

The really big disadvantage is that the interface of one part of the system, such as Email::Store::Mail, isn't collected in one place, but gets added to by pretty much every other plugin that gets loaded up. If you look in the Email::Store::Mail POD you'll see nothing about the add_to_annotations method that we've just called.

However, since every plugin should document its interface thoroughly and its relationship to other parts of the system, this should not really be a problem for end-users.

SEE ALSO

Understanding Class::DBI is fundamental to using Email::Store.

The core modules: Email::Store::Mail, Email::Store::List, Email::Store::Date, Email::Store::Entity, Email::Store::Thread, Email::Store::Attachment. Please do read through their documentation to see the whole of the Email::Store API.

Any other Email::Store::* modules you find on CPAN.

Module::Pluggable::Ordered is the pluggable hooks system used throughout Email::Store. Those developing additional modules might want to look at its documentation to understand how to hook into the indexing, reindexing and other processes.

AUTHOR

Simon Cozens, <simon@cpan.org>

CREDITS

Many of the ideas (although none of the code) for this package were taken from my work on a project called Twingle, for a company called Kasei. While I was at Kasei, I did a lot of thinking about handling email, storing it, analyzing it and presenting it on the web; I left the company with all that knowledge in my head, and wrote Email::Store with the knowledge and tools that I acquired. Thanks to Kasei for the experience that I gained there and the good grace they've shown as I release Email::Store.

COPYRIGHT AND LICENSE

Copyright 2004 by Simon Cozens

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.