=head1 NAME Mail::Box::Thread::Manager - maintain threads within a set of folders =head1 INHERITANCE Mail::Box::Thread::Manager is a Mail::Reporter =head1 SYNOPSIS my $mgr = Mail::Box::Thread::Manager->new; my $folder = $mgr->open(folder => '/tmp/inbox'); my $threads = $mgr->threads(folder => $folder); my $threads = $mgr->threads($folder); # same foreach my $thread ($threads->all) { $thread->print; } $threads->includeFolder($folder); $threads->removeFolder($folder); =head1 DESCRIPTION A (message-)I<thread> is a message with links to messages which followed in reply of that message. And then the messages with replied to the messages, which replied the original message. And so on. Some threads are only one message long (never replied to), some threads are very long. The C<Mail::Box::Thread::Manager> is very powerful. Not only is it able to do a descent job on MH-like folders (makes a trade-off between perfection and speed), it also can maintain threads from messages residing in different opened folders. Both facilities are rare for mail-agents. The manager creates flexible trees with L<Mail::Box::Thread::Node|Mail::Box::Thread::Node> objects. =head1 METHODS =head2 Constructors =over 4 =item Mail::Box::Thread::Manager-E<gt>B<new>(OPTIONS) A C<Mail::Box::Thread::Manager> object is usually created by a L<Mail::Box::Manager|Mail::Box::Manager>. One manager can produce more than one of these objects. One thread manager can combine messages from a set of folders, which may be partially overlapping with other objects of the same type. -Option --Defined in --Default dummy_type Mail::Message::Dummy folder [ ] folders [ ] log Mail::Reporter 'WARNINGS' thread_body <false> thread_type Mail::Box::Thread::Node timespan '3 days' trace Mail::Reporter 'WARNINGS' window 10 =over 2 =item dummy_type => CLASS The type of dummy messages. Dummy messages are used to fill holes in detected threads: referred to by messages found in the folder, but itself not in the folder. =item folder => FOLDER | REF-ARRAY-FOLDERS Specifies which folders are to be covered by the threads. You can specify one or more open folders. When you close a folder, the manager will automatically remove the messages of that folder from your threads. =item folders => FOLDER | REF-ARRAY-FOLDERS Equivalent to the C<folder> option. =item log => LEVEL =item thread_body => BOOLEAN May thread-detection be based on the content of a message? This has a serious performance implication when there are many messages without C<In-Reply-To> and C<References> headers in the folder, because it will cause many messages to be parsed. NOT IMPLEMENTED YET. =item thread_type => CLASS Type of the thread nodes. =item timespan => TIME | 'EVER' Specify how fast threads usually work: the amount of time between an answer and a reply. This is used in combination with the C<window> option to determine when to give-up filling the holes in threads. See Mail::Box::timespan2seconds() for the possibilities for TIME. With 'EVER', the search for messages in a thread will only be limited by the window-size. =item trace => LEVEL =item window => INTEGER|'ALL' The thread-window describes how many messages should be checked at maximum to fill `holes' in threads for folder which use delay-loading of message headers. The constant 'ALL' will cause thread-detection not to stop trying to fill holes, but continue looking until the first message of the folder is reached. Gives the best quality results, but may perform bad. =back example: use Mail::Box::Manager; my $mgr = new Mail::Box::Manager; my $inbox = $mgr->open(folder => $ENV{MAIL}); my $read = $mgr->open(folder => 'Mail/read'); my $threads = $mgr->threads(folders => [$inbox, $read]); # longer alternative for last line: my $threads = $mgr->threads; $threads->includeFolder($inbox); $threads->includeFolder($read); =back =head2 Grouping Folders =over 4 =item $obj-E<gt>B<folders> Returns the folders as managed by this threader. =item $obj-E<gt>B<includeFolder>(FOLDERS) Add one or more folders to the list of folders whose messages are organized in the threads maintained by this object. Duplicated inclusions will not cause any problems. From the folders, the messages which have their header lines parsed (see L<Mail::Box|Mail::Box> about lazy extracting) will be immediately scanned. Messages of which the header is known only later will have to report this (see L<toBeThreaded()|Mail::Box::Thread::Manager/"Internals">). example: $threads->includeFolder($inbox, $draft); =item $obj-E<gt>B<removeFolder>(FOLDERS) Remove one or more folders from the list of folders whose messages are organized in the threads maintained by this object. example: $threads->removeFolder($draft); =back =head2 The Threads =over 4 =item $obj-E<gt>B<all> Returns all messages which start a thread. The list may contain dummy messages and messages which are scheduled for deletion. To be able to return all threads, thread construction on each message is performed first, which may be slow for some folder-types because is will enforce parsing of message-bodies. =item $obj-E<gt>B<known> Returns the list of all messages which are known to be the start of a thread. Threads containing messages which where not read from their folder (like often happens MH-folder messages) are not yet known, and hence will not be returned. The list may contain dummy messages, and messages which are scheduled for deletion. Threads are detected based on explicitly calling L<inThread()|Mail::Box::Thread::Manager/"Internals"> and L<thread()|Mail::Box::Thread::Manager/"The Threads"> with a messages from the folder. Be warned that, each time a message's header is read from the folder, the return of the method can change. =item $obj-E<gt>B<sortedAll>([PREPARE [COMPARE]]) Returns L<all()|Mail::Box::Thread::Manager/"The Threads"> the threads by default, but sorted on timestamp. =item $obj-E<gt>B<sortedKnown>([PREPARE [,COMPARE]]) Returns all L<known()|Mail::Box::Thread::Manager/"The Threads"> threads, in sorted order. By default, the threads will be sorted on timestamp, But a different COMPARE method can be specified. =item $obj-E<gt>B<thread>(MESSAGE) Returns the thread where this MESSAGE is the start of. However, there is a possibility that this message is a reply itself. Usually, all messages which are in reply of this message are dated later than the specified one. All headers of messages later than this one are getting parsed first, for each folder in this threads-object. example: my $threads = $mgr->threads(folder => $inbox); my $thread = $threads->thread($inbox->message(3)); print $thread->string; =item $obj-E<gt>B<threadStart>(MESSAGE) Based on a message, and facts from previously detected threads, try to build solid knowledge about the thread where this message is in. =back =head2 Internals =over 4 =item $obj-E<gt>B<createDummy>(MESSAGE-ID) Get a replacement message to be used in threads. Be warned that a dummy is not a member of any folder, so the program working with threads must test with L<Mail::Message::isDummy()|Mail::Message/"The message"> before trying things only available to real messages. =item $obj-E<gt>B<inThread>(MESSAGE) Collect the thread-information of one message. The `In-Reply-To' and `Reference' header-fields are processed. If this method is called on a message whose header was not read yet (as usual for MH-folders, for instance) the reading of that header will be triggered here. =item $obj-E<gt>B<outThread>(MESSAGE) Remove the message from the thread-infrastructure. A message is replaced by a dummy. =item $obj-E<gt>B<toBeThreaded>(FOLDER, MESSAGES) Include the specified messages in/from the threads managed by this object, if this folder is maintained by this thread-manager. =item $obj-E<gt>B<toBeUnthreaded>(FOLDER, MESSAGES) Remove the specified messages in/from the threads managed by this object, if this folder is maintained by this thread-manager. =back =head2 Error handling =over 4 =item $obj-E<gt>B<AUTOLOAD> See L<Mail::Reporter/"Error handling"> =item $obj-E<gt>B<addReport>(OBJECT) See L<Mail::Reporter/"Error handling"> =item $obj-E<gt>B<defaultTrace>([LEVEL]|[LOGLEVEL, TRACELEVEL]|[LEVEL, CALLBACK]) =item Mail::Box::Thread::Manager-E<gt>B<defaultTrace>([LEVEL]|[LOGLEVEL, TRACELEVEL]|[LEVEL, CALLBACK]) See L<Mail::Reporter/"Error handling"> =item $obj-E<gt>B<errors> See L<Mail::Reporter/"Error handling"> =item $obj-E<gt>B<log>([LEVEL [,STRINGS]]) =item Mail::Box::Thread::Manager-E<gt>B<log>([LEVEL [,STRINGS]]) See L<Mail::Reporter/"Error handling"> =item $obj-E<gt>B<logPriority>(LEVEL) =item Mail::Box::Thread::Manager-E<gt>B<logPriority>(LEVEL) See L<Mail::Reporter/"Error handling"> =item $obj-E<gt>B<logSettings> See L<Mail::Reporter/"Error handling"> =item $obj-E<gt>B<notImplemented> See L<Mail::Reporter/"Error handling"> =item $obj-E<gt>B<report>([LEVEL]) See L<Mail::Reporter/"Error handling"> =item $obj-E<gt>B<reportAll>([LEVEL]) See L<Mail::Reporter/"Error handling"> =item $obj-E<gt>B<trace>([LEVEL]) See L<Mail::Reporter/"Error handling"> =item $obj-E<gt>B<warnings> See L<Mail::Reporter/"Error handling"> =back =head2 Cleanup =over 4 =item $obj-E<gt>B<DESTROY> See L<Mail::Reporter/"Cleanup"> =item $obj-E<gt>B<inGlobalDestruction> See L<Mail::Reporter/"Cleanup"> =back =head1 DETAILS =head2 Maintaining threads A C<Mail::Box::Thread::Manager> object is created by the L<Mail::Box::Manager|Mail::Box::Manager>, using L<Mail::Box::Manager::threads()|Mail::Box::Manager/"Manage message threads">. Each object can monitor the thread-relations between messages in one or more folders. When more than one folder is specified, the messages are merged while reading the threads, although nothing changes in the folder-structure. Adding and removing folders which have to be maintained is permitted at any moment, although may be quite costly in performance. An example of the maintained structure is shown below. The L<Mail::Box::Manager|Mail::Box::Manager> has two open folders, and a thread-builder which monitors them both. The combined folders have two threads, the second is two long (msg3 is a reply on msg2). Msg2 is in two folders at once. manager | \ | `----------- threads | | | | thread thread---thread | | /| / | | // / +---- folder1 | // / | | / // / | `-----msg1 // / | `-----msg2-'/ / | / / `-----folder2 / / | / / `-----msg2 / `-----msg3------' =head2 Delayed thread detection With L<all()|Mail::Box::Thread::Manager/"The Threads"> you get the start-messages of each thread of this folder. When that message was not found in the folder (not saved or already removed), you get a message of the dummy-type. These thread descriptions are in perfect state: all messages of the folder are included somewhere, and each missing message of the threads (I<holes>) are filled by dummies. However, to be able to detect all threads it is required to have the headers of all messages, which is very slow for some types of folders, especially MH and IMAP folders. For interactive mail-readers, it is preferred to detect threads only on messages which are in the viewport of the user. This may be sloppy in some situations, but everything is preferable over reading an MH mailbox with 10k e-mails to read only the see most recent messages. In this object, we take special care not to cause unnecessary parsing (loading) of messages. Threads will only be detected on command, and by default only the message headers are used. The following reports the L<Mail::Box::Thread::Node|Mail::Box::Thread::Node> which is related to a message: my $thread = $message->thread; When the message was not put in a thread yet, it is done now. But, more work is done to return the best thread. Based on various parameters, which where specified when the folder was created, the method walks through the folder to fill the holes which are in this thread. Walking from back to front (recently arrived messages are usually in the back of the folder), message after message are triggered to be included in their thread. At a certain moment, the whole thread of the requested method is found, a certain maximum number of messages was tried, but that didn't help (search window bound reached), or the messages within the folder are getting too old. Then the search to complete the thread will end, although more messages of them might have been in the folder: we don't scan the whole folder for performance reasons. Finally, for each message where the head is known, for instance for all messages in mbox-folders, the correct thread is determined immediately. Also, all messages where the head get loaded later, are automatically included. This module implements thread-detection on a folder. Messages created by the better mailers will include C<In-Reply-To> and C<References> lines, which are used to figure out how messages are related. If you prefer a better thread detection, they are implementable, but there may be a serious performance hit (depends on the type of folder used). =head1 DIAGNOSTICS =over 4 =item Error: Package $package does not implement $method. Fatal error: the specific package (or one of its superclasses) does not implement this method where it should. This message means that some other related classes do implement this method however the class at hand does not. Probably you should investigate this and probably inform the author of the package. =back =head1 SEE ALSO This module is part of Mail-Box distribution version 2.099, built on July 07, 2011. Website: F<http://perl.overmeer.net/mailbox/> =head1 LICENSE Copyrights 2001-2011 by Mark Overmeer. For other contributors see ChangeLog. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself. See F<http://www.perl.com/perl/misc/Artistic.html>