=head1 NAME
Mail::Box::Thread::Manager - maintain threads within a set of folders
=head1 INHERITANCE
Mail::Box::Thread::Manager
is a Mail::Reporter
=head1 SYNOPSIS
my
$mgr
= Mail::Box::Thread::Manager->new;
my
$folder
=
$mgr
->
open
(
folder
=>
'/tmp/inbox'
);
my
$threads
=
$mgr
->threads(
folder
=>
$folder
);
my
$threads
=
$mgr
->threads(
$folder
);
foreach
my
$thread
(
$threads
->all) {
$thread
->
print
;
}
$threads
->includeFolder(
$folder
);
$threads
->removeFolder(
$folder
);
=head1 DESCRIPTION
A (message-)I<thread> is a message
with
links to messages which followed in
reply of that message. And then the messages
with
replied to the messages,
which replied the original message. And so on. Some threads are only
one message long (never replied to), some threads are very long.
The C<Mail::Box::Thread::Manager> is very powerful. Not only is it able to
do
a descent job on MH-like folders (makes a trade-off between perfection
and speed), it also can maintain threads from messages residing in different
opened folders. Both facilities are rare
for
mail-agents. The manager
creates flexible trees
with
L<Mail::Box::Thread::Node|Mail::Box::Thread::Node> objects.
=head1 METHODS
=head2 Constructors
Mail::Box::Thread::Manager-E<gt>B<new>(OPTIONS)
=over 4
A C<Mail::Box::Thread::Manager> object is usually created by a
L<Mail::Box::Manager|Mail::Box::Manager>. One manager can produce more than one of these
objects. One thread manager can combine messages from a set of folders,
which may be partially overlapping
with
other objects of the same type.
Option --Defined in --Default
dummy_type Mail::Message::Dummy
folder [ ]
folders [ ]
log
Mail::Reporter
'WARNINGS'
thread_body <false>
thread_type Mail::Box::Thread::Node
timespan
'3 days'
trace Mail::Reporter
'WARNINGS'
window 10
.
dummy_type
=> CLASS
=over 4
The type of dummy messages. Dummy messages are used to fill holes in
detected threads: referred to by messages found in the folder, but itself
not in the folder.
=back
.
folder
=> FOLDER | REF-ARRAY-FOLDERS
=over 4
Specifies which folders are to be covered by the threads. You can
specify one or more
open
folders. When you
close
a folder, the
manager will automatically remove the messages of that folder from
your threads.
=back
.
folders
=> FOLDER | REF-ARRAY-FOLDERS
=over 4
Equivalent to the C<folder> option.
=back
.
log
=> LEVEL
.
thread_body
=> BOOLEAN
=over 4
May thread-detection be based on the content of a message? This
has
a serious performance implication
when
there are many messages without
C<In-Reply-To> and C<References> headers in the folder, because it
will cause many messages to be parsed. NOT IMPLEMENTED YET.
=back
.
thread_type
=> CLASS
=over 4
Type of the thread nodes.
=back
.
timespan
=> TIME |
'EVER'
=over 4
Specify how fast threads usually work: the amount of
time
between an
answer and a reply. This is used in combination
with
the C<window>
option to determine
when
to give-up filling the holes in threads.
See Mail::Box::timespan2seconds()
for
the possibilities
for
TIME.
With
'EVER'
, the search
for
messages in a thread
will only be limited by the window-size.
=back
.
trace
=> LEVEL
.
window
=> INTEGER|
'ALL'
=over 4
The thread-window describes how many messages should be checked at
maximum to fill `holes' in threads
for
folder which
use
delay-loading
of message headers.
The constant
'ALL'
will cause thread-detection not to stop trying
to fill holes, but
continue
looking
until
the first message of the folder
is reached. Gives the best quality results, but may perform bad.
=back
I<Example:>
my
$mgr
= new Mail::Box::Manager;
my
$inbox
=
$mgr
->
open
(
folder
=>
$ENV
{MAIL});
my
$read
=
$mgr
->
open
(
folder
=>
'Mail/read'
);
my
$threads
=
$mgr
->threads(
folders
=> [
$inbox
,
$read
]);
my
$threads
=
$mgr
->threads;
$threads
->includeFolder(
$inbox
);
$threads
->includeFolder(
$read
);
=back
=head2 Grouping Folders
$obj
-E<gt>B<folders>
=over 4
Returns the folders as managed by this threader.
=back
$obj
-E<gt>B<includeFolder>(FOLDERS)
=over 4
Add one or more folders to the list of folders whose messages are
organized in the threads maintained by this object. Duplicated
inclusions will not cause any problems.
From the folders, the messages which have their header lines parsed
(see L<Mail::Box|Mail::Box> about lazy extracting) will be immediately scanned.
Messages of which the header is known only later will have to report this
(see L<toBeThreaded()|Mail::Box::Thread::Manager/
"Internals"
>).
I<Example:>
$threads
->includeFolder(
$inbox
,
$draft
);
=back
$obj
-E<gt>B<removeFolder>(FOLDERS)
=over 4
Remove one or more folders from the list of folders whose messages are
organized in the threads maintained by this object.
I<Example:>
$threads
->removeFolder(
$draft
);
=back
=head2 The Threads
$obj
-E<gt>B<all>
=over 4
Returns all messages which start a thread. The list may contain dummy
messages and messages which are scheduled
for
deletion.
To be able to
return
all threads, thread construction on
each
message is performed first, which may be slow
for
some folder-types
because is will enforce parsing of message-bodies.
=back
$obj
-E<gt>B<known>
=over 4
Returns the list of all messages which are known to be the start of
a thread. Threads containing messages which where not
read
from their
folder (like often happens MH-folder messages) are not yet known, and
hence will not be returned.
The list may contain dummy messages, and messages which are scheduled
for
deletion. Threads are detected based on explicitly calling
L<inThread()|Mail::Box::Thread::Manager/
"Internals"
> and L<thread()|Mail::Box::Thread::Manager/
"The Threads"
>
with
a messages from the folder.
Be warned that,
each
time
a message's header is
read
from the folder,
the
return
of the method can change.
=back
$obj
-E<gt>B<sortedAll>([PREPARE [COMPARE]])
=over 4
Returns L<all()|Mail::Box::Thread::Manager/
"The Threads"
> the threads by
default
, but sorted on timestamp.
=back
$obj
-E<gt>B<sortedKnown>([PREPARE [,COMPARE]])
=over 4
Returns all L<known()|Mail::Box::Thread::Manager/
"The Threads"
> threads, in sorted order. By
default
, the threads
will be sorted on timestamp, But a different COMPARE method can be
specified.
=back
$obj
-E<gt>B<thread>(MESSAGE)
=over 4
Returns the thread where this MESSAGE is the start of. However, there
is a possibility that this message is a reply itself.
Usually, all messages which are in reply of this message are dated later
than the specified one. All headers of messages later than this one are
getting parsed first,
for
each
folder in this threads-object.
I<Example:>
my
$threads
=
$mgr
->threads(
folder
=>
$inbox
);
my
$thread
=
$threads
->thread(
$inbox
->message(3));
print
$thread
->string;
=back
$obj
-E<gt>B<threadStart>(MESSAGE)
=over 4
Based on a message, and facts from previously detected threads,
try
to build solid knowledge about the thread where this message is in.
=back
=head2 Internals
$obj
-E<gt>B<createDummy>(MESSAGE-ID)
=over 4
Get a replacement message to be used in threads. Be warned that a
dummy is not a member of any folder, so the program working
with
threads must test
with
L<Mail::Message::isDummy()|Mail::Message/
"The message"
>
before
trying things only
available to real messages.
=back
$obj
-E<gt>B<inThread>(MESSAGE)
=over 4
Collect the thread-information of one message. The `In-Reply-To' and
`Reference' header-fields are processed. If this method is called on
a message whose header was not
read
yet (as usual
for
MH-folders,
for
instance) the reading of that header will be triggered here.
=back
$obj
-E<gt>B<outThread>(MESSAGE)
=over 4
Remove the message from the thread-infrastructure. A message is
replaced by a dummy.
=back
$obj
-E<gt>B<toBeThreaded>(FOLDER, MESSAGES)
=over 4
Include the specified messages in/from the threads managed by
this object,
if
this folder is maintained by this thread-manager.
=back
$obj
-E<gt>B<toBeUnthreaded>(FOLDER, MESSAGES)
=over 4
Remove the specified messages in/from the threads managed by
this object,
if
this folder is maintained by this thread-manager.
=back
=head2 Error handling
$obj
-E<gt>B<AUTOLOAD>
=over 4
See L<Mail::Reporter/
"Error handling"
>
=back
$obj
-E<gt>B<addReport>(OBJECT)
=over 4
See L<Mail::Reporter/
"Error handling"
>
=back
$obj
-E<gt>B<defaultTrace>([LEVEL]|[LOGLEVEL, TRACELEVEL]|[LEVEL, CALLBACK])
Mail::Box::Thread::Manager-E<gt>B<defaultTrace>([LEVEL]|[LOGLEVEL, TRACELEVEL]|[LEVEL, CALLBACK])
=over 4
See L<Mail::Reporter/
"Error handling"
>
=back
$obj
-E<gt>B<errors>
=over 4
See L<Mail::Reporter/
"Error handling"
>
=back
$obj
-E<gt>B<
log
>([LEVEL [,STRINGS]])
Mail::Box::Thread::Manager-E<gt>B<
log
>([LEVEL [,STRINGS]])
=over 4
See L<Mail::Reporter/
"Error handling"
>
=back
$obj
-E<gt>B<logPriority>(LEVEL)
Mail::Box::Thread::Manager-E<gt>B<logPriority>(LEVEL)
=over 4
See L<Mail::Reporter/
"Error handling"
>
=back
$obj
-E<gt>B<logSettings>
=over 4
See L<Mail::Reporter/
"Error handling"
>
=back
$obj
-E<gt>B<notImplemented>
=over 4
See L<Mail::Reporter/
"Error handling"
>
=back
$obj
-E<gt>B<report>([LEVEL])
=over 4
See L<Mail::Reporter/
"Error handling"
>
=back
$obj
-E<gt>B<reportAll>([LEVEL])
=over 4
See L<Mail::Reporter/
"Error handling"
>
=back
$obj
-E<gt>B<trace>([LEVEL])
=over 4
See L<Mail::Reporter/
"Error handling"
>
=back
$obj
-E<gt>B<warnings>
=over 4
See L<Mail::Reporter/
"Error handling"
>
=back
=head2 Cleanup
$obj
-E<gt>B<DESTROY>
=over 4
See L<Mail::Reporter/
"Cleanup"
>
=back
$obj
-E<gt>B<inGlobalDestruction>
=over 4
See L<Mail::Reporter/
"Cleanup"
>
=back
=head1 DETAILS
This module implements thread-detection on a folder. Messages created
by the better mailers will include C<In-Reply-To> and C<References>
lines, which are used to figure out how messages are related. If you
prefer a better thread detection, they are implementable, but there
may be a serious performance hit (depends on the type of folder used).
=head2 Maintaining threads
A C<Mail::Box::Thread::Manager> object is created by the
L<Mail::Box::Manager|Mail::Box::Manager>, using L<Mail::Box::Manager::threads()|Mail::Box::Manager/
"Manage message threads"
>.
Each object can monitor the thread-relations between messages in one
or more folders. When more than one folder is specified, the messages
are merged
while
reading the threads, although nothing changes in the
folder-structure. Adding and removing folders which have to be maintained
is permitted at any moment, although may be quite costly in performance.
An example of the maintained structure is shown below. The
L<Mail::Box::Manager|Mail::Box::Manager>
has
two
open
folders, and a thread-builder which
monitors them both. The combined folders have two threads, the second
is two long (msg3 is a reply on msg2). Msg2 is in two folders at once.
manager
| \
| `----------- threads
| | |
| thread thread---thread
| | /| /
| | // /
+---- folder1 | // /
| | / // /
| `-----msg1 // /
| `-----msg2-'/ /
| / /
`-----folder2 / /
| / /
`-----msg2 /
`-----msg3------'
=head2 Delayed thread detection
With L<all()|Mail::Box::Thread::Manager/
"The Threads"
> you get the start-messages of
each
thread of this folder.
When that message was not found in the folder (not saved or already
removed), you get a message of the dummy-type. These thread descriptions
are in perfect state: all messages of the folder are included somewhere,
and
each
missing message of the threads (I<holes>) are filled by dummies.
However, to be able to detect all threads it is required to have the
headers of all messages, which is very slow
for
some types of folders,
especially MH and IMAP folders.
For interactive mail-readers, it is preferred to detect threads only
on messages which are in the viewport of the user. This may be sloppy
in some situations, but everything is preferable over reading an MH
mailbox
with
10k e-mails to
read
only the see most recent messages.
In this object, we take special care not to cause unnecessary parsing
(loading) of messages. Threads will only be detected on command, and
by
default
only the message headers are used.
The following reports the L<Mail::Box::Thread::Node|Mail::Box::Thread::Node> which is
related to a message:
my
$thread
=
$message
->thread;
When the message was not put in a thread yet, it is done now. But, more
work is done to
return
the best thread. Based on various parameters,
which where specified
when
the folder was created, the method walks
through the folder to fill the holes which are in this thread.
Walking from back to front (recently arrived messages are usually in the back
of the folder), message
after
message are triggered to be included in their
thread. At a certain moment, the whole thread of the requested method
is found, a certain maximum number of messages was tried, but that
didn't help (search window bound reached), or the messages within the
folder are getting too old. Then the search to complete the thread will
end, although more messages of them might have been in the folder: we
don't scan the whole folder
for
performance reasons.
Finally,
for
each
message where the head is known,
for
instance
for
all messages in mbox-folders, the correct thread is determined
immediately. Also, all messages where the head get loaded later, are
automatically included.
=head1 DIAGNOSTICS
I<Error:> Package
$package
does not implement
$method
.
Fatal error: the specific
package
(or one of its superclasses) does not
implement this method where it should. This message means that some other
related classes
do
implement this method however the class at hand does
not. Probably you should investigate this and probably inform the author
of the
package
.
=head1 SEE ALSO
This module is part of Mail-Box distribution version 2.072,
=head1 LICENSE
Copyrights 2001-2007 by Mark Overmeer. For other contributors see ChangeLog.
This program is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.