NAME
Mail::Box::Threads - maintain threads within a folder
SYNOPSIS
my Mail::Box $folder = ...;
foreach my $thread ($folder->threads)
{ $thread->printThread;
}
DESCRIPTION
Read Mail::Box::Manager and Mail::Box first. This manual page also describes the package Mail::Box::Thread, which represents a single thread.
A (message-)thread is a message together with the messages which were sent in reply to it, the replies to those replies, and so on. Some threads consist of only one message (never replied to), some threads are very long.
What can we do?
This module implements thread-detection on a folder. Messages created by the better mailers will include In-Reply-To and References lines, which are used to figure out how messages are related. If you prefer better thread detection, you can ask for it, but there may be a serious performance hit (depending on the type of folder used).
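As a rough illustration of how these two header lines can relate messages, the following self-contained sketch links plain hashes by message-id. It does not use Mail::Box: the sub name link_threads and the hash keys (id, in_reply_to, references) are assumptions made for this example only, and the real module may weigh the headers differently.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Mail::Box: derive parent relations
# from In-Reply-To and References values.
# Each message is a hash with the keys: id, in_reply_to (optional) and
# references (optional array-ref, oldest reference first).
sub link_threads {
    my @messages = @_;
    my %parent;                       # child id => [ parent id, quality ]

    foreach my $msg (@messages) {
        if (defined $msg->{in_reply_to}) {
            # A direct In-Reply-To is the most reliable relation.
            $parent{ $msg->{id} } = [ $msg->{in_reply_to}, 'REPLY' ];
        }
        elsif (my @refs = @{ $msg->{references} || [] }) {
            # Otherwise fall back to the last (most recent) reference.
            $parent{ $msg->{id} } = [ $refs[-1], 'REFERENCE' ];
        }
    }

    # Follow the parents upwards: an id without a parent starts a thread.
    # A start id which matches no message is a hole (a dummy would fill it).
    my %known = map { $_->{id} => 1 } @messages;
    my %starts;
    foreach my $msg (@messages) {
        my $id = $msg->{id};
        my %seen;
        while (my $up = $parent{$id}) {
            last if $seen{$id}++;     # guard against reference loops
            $id = $up->[0];
        }
        $starts{$id} = $known{$id} ? 'message' : 'dummy';
    }

    return (\%parent, \%starts);
}
```

Called with four messages where one replies to an id that is not in the list, the second returned hash contains one start of type 'message' and one of type 'dummy'.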
How to use it?
With threads you get the start-messages of each thread of this folder. When that message was not found in the folder (not saved or already removed), you get a message of the dummy-type. These thread descriptions are in perfect state: all messages of the folder are included somewhere, and each missing message of the threads (`holes') is filled by a dummy.
However, to be able to detect all threads it is required to have the headers of all messages, which is very slow for some types of folders, especially MH and IMAP folders.
For interactive mail-readers, it is preferred to detect threads only on messages which are in the viewport of the user. This may be sloppy in some situations, but anything is preferable to reading an MH mailbox with 10k e-mails only to see the most recent messages.
In this object, we take special care not to cause unnecessary parsing (loading) of messages. Threads will only be detected on command, and by default only the message headers are used.
How is it implemented?
The user of the folder signals that a message has to be included in a thread within the thread-list, by calling
$folder->inThread($message); #or $message->inThread;
This only takes the information from this message, and stores it in a thread-structure. You can also ask directly for the thread which contains the message:
my $thread = $message->thread;
When the message was not put in a thread yet, that is done now. In addition, more work is done to return the best thread: based on various parameters, which were specified when the folder was created, the method walks through the folder to fill the holes in this thread.
Walking from back to front (the latest messages are usually at the back of the folder), message after message is triggered to be indexed in its thread. The search ends when the whole thread of the requested message is found, when a certain maximum number of messages was tried without success (search window bound reached), or when the messages within the folder are getting too old. At that point the search to complete the thread stops, although more messages of the thread could be in the folder.
Finally, for each message whose head is known, for instance all messages in mbox-folders, the correct thread is determined immediately. Messages whose head gets loaded later are included automatically.
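The back-to-front walk with its three stop conditions could look roughly like the sketch below. This is not the code of Mail::Box: the sub name fill_holes, the data layout, and the choice to measure age relative to the requested message are all assumptions for illustration.

```perl
use strict;
use warnings;

# Hypothetical sketch of the bounded search for missing thread members.
# $messages : array-ref of { id => ..., time => epoch seconds }, oldest first
# $missing  : array-ref of message-ids which are holes in the thread
# $start    : index of the message whose thread is requested
# $window   : maximum number of messages to try, or 'ALL'
# $timespan : maximum age difference in seconds, or 'EVER'
sub fill_holes {
    my ($messages, $missing, $start, $window, $timespan) = @_;

    my %wanted = map { $_ => 1 } @$missing;
    my $oldest = $timespan eq 'EVER'
               ? undef
               : $messages->[$start]{time} - $timespan;

    my $tried  = 0;
    my @found;
    my $reason = 'start of folder reached';

    # Walk from the requested message towards the front of the folder.
    for (my $i = $start - 1; $i >= 0; $i--) {
        if (!%wanted) {
            $reason = 'thread complete';
            last;
        }
        if ($window ne 'ALL' && $tried >= $window) {
            $reason = 'window bound reached';
            last;
        }

        my $msg = $messages->[$i];
        if (defined $oldest && $msg->{time} < $oldest) {
            $reason = 'messages too old';
            last;
        }

        $tried++;
        push @found, $msg->{id} if delete $wanted{ $msg->{id} };
    }

    # The very last message tried may have filled the final hole.
    $reason = 'thread complete' if !%wanted;
    return ($reason, @found);
}
```

The first returned value names the reason why the search ended; the remaining values are the ids of the holes which could be filled.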
PUBLIC INTERFACE
- new ARGS
Mail::Box::Threads is sub-classed by Mail::Box itself. This object is not meant to be instantiated itself: do not call new on it (you'll see it even fails because there is no new()!).
The construction of thread administration accepts the following options:
dummy_type => CLASS
Of which class are dummy messages? Usually, this needs to be the message_type of the folder with ::Dummy appended. This will also be the default.
thread_window => INTEGER|'ALL'
The thread-window describes how many messages should be checked at maximum to fill `holes' in threads for folders which use delay-loading of message headers. The default value is 10.
The constant 'ALL' will cause thread-detection not to stop trying to fill holes, but to continue looking until the first message of the folder is reached. This gives the best quality results, but may perform badly.
thread_timespan => TIME|'EVER'
Specify how fast threads usually work: the amount of time between a message and its reply. This is used in combination with the thread_window option to determine when to give up filling the holes in threads.
TIME is a string, which starts with a float, followed by one of the words 'hour', 'hours', 'day', 'days', 'week', or 'weeks'. For instance:
thread_timespan => '1 hour'
thread_timespan => '4 weeks'
The default is '3 days'. TIME may also be the string 'EVER', which will effectively remove this limit.
thread_body => BOOL
May thread-detection be based on the content of a message? This has a serious performance implication when there are many messages without In-Reply-To and References headers in the folder, because it will cause many messages to be parsed.
NOT USED YET. Defaults to FALSE.
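The TIME strings accepted by thread_timespan can be turned into seconds along the following lines. The sub name timespan_to_seconds is hypothetical, and the module itself may accept or reject other spellings than this sketch does.

```perl
use strict;
use warnings;

# Hypothetical helper: convert a thread_timespan value into seconds.
# Returns undef for 'EVER' (no limit) and dies on unrecognised input.
sub timespan_to_seconds {
    my $spec = shift;
    return undef if $spec eq 'EVER';

    my %unit = (
        hour => 3600,
        day  => 24 * 3600,
        week => 7 * 24 * 3600,
    );

    # A float, followed by a unit word with an optional plural 's'.
    my ($amount, $word) =
        $spec =~ /^\s*(\d+(?:\.\d+)?|\.\d+)\s*(hour|day|week)s?\s*$/
        or die "Invalid timespan '$spec'\n";

    return $amount * $unit{$word};
}
```

With this helper, the default of '3 days' corresponds to 259200 seconds.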
toBeThreaded MESSAGE [, ...]
toBeUnthreaded MESSAGE-ID [, ...]
Register a message to be put in (withdrawn from) a thread when the user is asking for threads. If no-one ever asks for threads, then no work is done on them.
createDummy MESSAGE-ID
Create a dummy message for this folder. The dummy is a place-holder in a thread description to represent a message which is not found in the folder (yet).
processDelayedThreading
Parse all messages which were detected in the folder, but were not processed into a thread yet.
thread MESSAGE
Based on a message, and facts from previously detected threads, try to build solid knowledge about the thread which contains this message.
inThread MESSAGE
Collect the thread-information of one message. The `In-Reply-To' and `References' header-fields are processed. If this method is called on a message whose header was not read yet (as usual for MH-folders, for instance) the reading of that header will be triggered here.
Examples:
$folder->inThread($message);
$message->inThread; #same
outThread MESSAGE-ID
Remove the message, which is represented by its message-id, from the thread-infrastructure. A message is replaced by a dummy, if it has follow-ups.
registerThread MESSAGE|MESSAGE-ID
Register the message as start of a thread.
threads
Returns all messages which start a thread. The list may contain dummy messages, and messages which are scheduled for deletion.
To be able to return all threads, thread construction on each message is performed first, which may be slow for some folder-types because it will enforce parsing of message-bodies.
knownThreads
Returns the list of all messages which are known to be the start of a thread. Threads containing messages which were not read from their folder (as often happens with MH-folder messages) are not yet known, and hence will not be returned.
The list may contain dummy messages, and messages which are scheduled for deletion. Threads are detected based on explicitly calling inThread and thread with a message from the folder.
Be warned that, each time a message's header is read from the folder, the return of the method can change.
Mail::Box::Thread
A thread implements a list of messages which are related. The main object described in this manual-page is the thread-manager, which is part of a Mail::Box. Mail::Box::Thread is sub-classed by Mail::Box::Message; each message is part of a thread.
- new ARGS
The instantiation of a thread is done by its subclasses. You will not call this method yourself (it is not even implemented).
In the current implementation, there are no options added to the Mail::Box::Message's object creation.
- thread
Returns the first message in the thread of which this message is part. This may be this message itself, or any other message in the folder. Even a dummy message can be returned, when the first message in the thread was not stored in the folder.
Example: my $start = $folder->message(42)->thread;
- inThread
Include the message in a thread. If the message was not known to the thread-administration yet, it will be added to those structures.
- repliedTo
Returns the message to which this one is a reply. In SCALAR context, this will return the MESSAGE which was replied to by this one. This message object may be a dummy message. In case the message seems to be the first message of a thread, the value undef is returned.
In LIST context, this method also returns how sure it is that the messages are related. When extended thread discovery is enabled, some magic is applied to relate messages. In LIST context, the first returned argument is a MESSAGE, and the second a STRING constant. Values for the STRING may be:
REPLY
This relation was directly derived from an `in-reply-to' message header field. The relation is very sure.
REFERENCE
This relation is based on information found in a `References' message header field. One message may reference a list of messages which precede it in the thread. Let's hope they are stored in the right order.
GUESS
The relation is a big guess, of undetermined type.
More constants may be added later.
Examples:
my $question = $answer->repliedTo;
my ($question, $quality) = $answer->repliedTo;
if($question && $quality eq 'REPLY') { ... };
- follows MESSAGE|MESSAGE-ID, STRING
Register that the specified MESSAGE (or MESSAGE-ID) is a reply to this message, where the quality of the relation is specified by the constant STRING.
The relation may be specified more than once, but only one relation is kept. Once a reply (STRING equals REPLY) is detected, that value will be kept.
- followedBy [MESSAGE-ID|MESSAGE, ...]
Register that the MESSAGEs (or MESSAGE-IDs) are follow-ups to this message. There may be more than one of these follow-ups which are not related to each-other in any other way than sharing the same parent.
If the same relation is defined more than once, this will not cause duplication of information.
- followUps
Returns the list of follow-ups to this message. This list contains parsed, not-parsed, and dummy messages. followUps returns MESSAGE-objects, while followUpIDs returns the IDs only.
- threadFilled [BOOL]
Returns (after setting) a flag which tells whether the thread (of which this message is the start) is fully processed in finding holes. If this is set to TRUE, then any dummies still in this thread could not be found within the limits of thread_window and thread_timespan.
Actions on whole threads
Some convenience methods are added to threads, to simplify retrieving knowledge from them.
- recurseThread CODE-REF
Execute a function for all sub-threads. If the subroutine returns true, sub-threads are visited, too. Otherwise, this branch is aborted.
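The traversal rule described here (call the code on a node, and only descend when it returns true) can be sketched on a plain nested data structure. The sub name recurse_thread and the followups key are assumptions for this example; real thread objects are message objects, not hashes.

```perl
use strict;
use warnings;

# Hypothetical sketch of the traversal rule on plain hashes of the form
# { id => ..., size => ..., followups => [ more such hashes ] }.
sub recurse_thread {
    my ($node, $code) = @_;

    # Call the code on this node; a false result aborts this branch.
    return unless $code->($node);

    # Otherwise visit each sub-thread in turn.
    recurse_thread($_, $code) foreach @{ $node->{followups} || [] };
    return;
}
```

A callback which always returns true visits every message, which is enough to sum sizes or collect ids. A callback which returns false for one message skips everything below it while the rest of the thread is still visited.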
- totalSize
Sum the size of all the messages in the thread.
- nrMessages
Number of messages in this thread.
- ids
Collect all the ids in this thread.
Examples:
$newfolder->addMessages($folder->ids($thread->ids));
$folder->delete($thread->ids);
- folded [BOOL]
Returns whether this (part of the) thread has to be shown folded or not. This is simply done by a label, which means that most folder-types can store this.
- threadToString
Translate a thread into a string. The string will contain at least one line for each message which was found, but tries to fold dummies. This is useful for debugging, but most message-readers will prefer to implement their own thread printer.
Example:
print $message->threadToString;
may result in
Subject of this message
|- Re: Subject of this message
|-*- Re: Re: Subject of this message
|  |- Re(2) Subject of this message
|  |- [3] Re(2) Subject of this message
|  `- Re: Subject of this message (reply)
`- Re: Subject of this message
The `*' represents a missing message. The `[3]' represents a folded thread with three messages.
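Since most message-readers will write their own printer, a simplified one is sketched below on plain hashes. It is not the module's implementation: the sub names and hash keys are hypothetical, a dummy is printed on its own line as `*' instead of being merged with its first follow-up, and the `[N]' count of a folded branch is assumed to include the folded message itself.

```perl
use strict;
use warnings;

# Hypothetical node layout:
# { subject => ..., dummy => BOOL, folded => BOOL, followups => [ nodes ] }

# Count the real (non-dummy) messages in a branch.
sub count_nodes {
    my $node  = shift;
    my $count = $node->{dummy} ? 0 : 1;
    $count += count_nodes($_) foreach @{ $node->{followups} || [] };
    return $count;
}

# Simplified thread printer. $first is the prefix of this node's own
# line, $rest the prefix for every line below it.
sub thread_to_string {
    my ($node, $first, $rest) = @_;
    $first = '' unless defined $first;
    $rest  = '' unless defined $rest;

    my $label = $node->{dummy} ? '*' : $node->{subject};
    my @kids  = @{ $node->{followups} || [] };

    if ($node->{folded}) {
        # A folded branch is shown as one line with a message count.
        $label = '[' . count_nodes($node) . '] ' . $label;
        @kids  = ();
    }

    my $out = "$first$label\n";
    while (my $kid = shift @kids) {
        my $last = !@kids;            # the last follow-up closes the branch
        $out .= thread_to_string(
            $kid,
            $rest . ($last ? '`- ' : '|- '),
            $rest . ($last ? '   ' : '|  '),
        );
    }
    return $out;
}
```

Passing the two prefixes down separately keeps the vertical bars of open branches aligned without any global state.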
AUTHOR
Mark Overmeer (Mark@Overmeer.net). All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
VERSION
This code is alpha, version 0.92