use strict;
package Mail::Box::MH;
use base 'Mail::Box::Dir';

our $VERSION = 2.011;

use Mail::Box::MH::Index;
use Mail::Box::MH::Message;
use Mail::Box::MH::Labels;

use Carp;
use FileHandle;
use File::Copy;
use File::Spec;
use File::Basename;

=head1 NAME

Mail::Box::MH - handle MH folders

=head1 CLASS HIERARCHY

 Mail::Box::MH
 is a Mail::Box::Dir
 is a Mail::Box
 is a Mail::Reporter

=head1 SYNOPSIS

 use Mail::Box::MH;
 my $folder = new Mail::Box::MH folder => $ENV{MAIL}, ...;

=head1 DESCRIPTION

This documentation describes how MH mailboxes work, and what you
can do with the MH folder object C<Mail::Box::MH>.
Please read C<Mail::Box-Overview> and C<Mail::Box> first.

L<The internal organization and details|/"IMPLEMENTATION"> are found
at the bottom of this manual-page.

=head1 METHOD INDEX

Methods prefixed with an abbreviation are described in
L<Mail::Box> (MB), L<Mail::Reporter> (MR), L<Mail::Box::Dir> (MBD).

The general methods for C<Mail::Box::MH> objects:

   MB AUTOLOAD                          MR log [LEVEL [,STRINGS]]
   MB addMessage  MESSAGE               MB message INDEX [,MESSAGE]
   MB addMessages MESSAGE [, MESS...    MB messageId MESSAGE-ID [,MESS...
   MB allMessageIds                     MB messages
   MB close OPTIONS                     MB modified [BOOLEAN]
   MB create FOLDERNAME [, OPTIONS]     MB name
   MB current [NUMBER|MESSAGE|MES...       new OPTIONS
   MB delete                            MB openSubFolder NAME [,OPTIONS]
  MBD directory                         MR report [LEVEL]
   MR errors                            MR reportAll [LEVEL]
   MB find MESSAGE-ID                   MR trace [LEVEL]
   MB listSubFolders OPTIONS            MR warnings
   MB locker                            MB writeable

The extra methods for extension writers:

   MR AUTOLOAD                          MR notImplemented
   MB DESTROY                           MB organization
   MB appendMessages OPTIONS            MB read OPTIONS
   MB clone OPTIONS                    MBD readAllHeaders
   MB coerce MESSAGE                   MBD readMessageFilenames DIRECTORY
   MB determineBodyType MESSAGE, ...    MB readMessages OPTIONS
  MBD folderToDirectory FOLDERNAM...    MB scanForMessages MESSAGE, ME...
   MB folderdir [DIR]                   MB sort PREPARE, COMPARE, LIST
   MB foundIn [FOLDERNAME], OPTIONS     MB storeMessage MESSAGE
      highestMessageNumber              MB timespan2seconds TIME
   MR inGlobalDestruction               MB toBeThreaded MESSAGES
      index                             MB toBeUnthreaded MESSAGES
      labels                            MB update OPTIONS
   MB lineSeparator [STRING|'CR'|...    MB updateMessages OPTIONS
   MR logPriority LEVEL                 MB write OPTIONS
   MR logSettings                          writeMessages [OPTIONS]

=head1 METHODS

=over 4

=cut

#-------------------------------------------

=item new OPTIONS

Create a new folder.  The are many options which are taken from other
objects.  For some, different options are set.  For MH-specific options
see below, but first the full list.

 OPTION            DEFINED BY         DEFAULT
 access            Mail::Box          'r'
 create            Mail::Box          0
 folder            Mail::Box          $ENV{MAIL}
 folderdir         Mail::Box          <no default>
 head_wrap         Mail::Box          72
 index_filename    Mail::Box::MH      foldername.'/.index'
 keep_dups         Mail::Box          0
 keep_index        Mail::Box::MH      0
 labels_filename   Mail::Box::MH      foldername.'/.mh_sequence'
 extract           Mail::Box          10kB
 lock_type         Mail::Box          'DOTLOCK'
 lock_file         Mail::Box          foldername.'/.lock'
 lock_timeout      Mail::Box          3600    (1 hour)
 lock_wait         Mail::Box          10      (seconds)
 log               Mail::Reporter     'WARNINGS'
 remove_when_empty Mail::Box          1
 save_on_exit      Mail::Box          1
 trace             Mail::Reporter     'WARNINGS'
 trusted           Mail::Box          <depends on folder location>

Only useful to write extension to C<Mail::Box::MH>.  Common users of
folders you will not specify these:

 OPTION            DEFINED BY         DEFAULT
 body_type         Mail::Box::Dir     <see Mail::Box::Dir>
 body_delayed_type Mail::Box          'Mail::Message::Body::Delayed'
 coerce_options    Mail::Box          []
 field_type        Mail::Box          undef
 head_type         Mail::Box          'Mail::Message::Head::Complete'
 head_delayed_type Mail::Box          'Mail::Message::Head::Delayed'
 index             Mail::Box::MH      undef
 index_type        Mail::Box::MH      'Mail::Box::MH::Index'
 labels            Mail::Box::MH      undef
 labels_type       Mail::Box::MH      'Mail::Box::MH::Labels'
 locker            Mail::Box          undef
 multipart_type    Mail::Box          'Mail::Message::Body::Multipart'
 manager           Mail::Box          undef
 message_type      Mail::Box          'Mail::Box::MH::Message'
 realhead_type     Mail::Box          'Mail::Message::Head'

MH specific options:

=over 4

=item * keep_index =E<gt> BOOL

Keep an index file of the specified mailbox, one file per directory.
Using an index file will speed up things considerably, because it avoids
reading all the message files the moment that you open the folder.  When
you open a folder, you can use the index file to retrieve information such
as the subject of each message, instead of having to read possibly
thousands of messages.

=item * index_filename =E<gt> FILENAME

The FILENAME which is used in each directory to store the headers of all
mails. The filename shall not contain a directory path. (e.g. Do not use
C</usr/people/jan/.index>, nor C<subdir/.index>, but say C<.index>.)

=item * index =E<gt> OBJECT

You may specify an OBJECT of a type which extends C<Mail::Box::MH::Index>
(at least implements the C<get()> method), as alternative for an index file
reader as created by C<Mail::Box::MH>.

=item * labels_filename =E<gt> FILENAME

In MH-folders, messages can be labeled, for instance based on the
sender or whether it is read or not.  This status is kept in a
file which is usually called C<.mh_sequences>, but that name can
be overruled with this flag.

=item * labels =E<gt> OBJECT

You may specify an OBJECT of a type which extends C<Mail::Box::MH::Labels>
(at least implements the C<get()> method), as alternative for labels file
reader as created by C<Mail::Box::MH>.

=back

=cut

my $default_folder_dir = exists $ENV{HOME} ? "$ENV{HOME}/.mh" : '.';

sub init($)
{   my ($self, $args) = @_;

    $args->{folderdir}     ||= $default_folder_dir;

    $self->SUPER::init($args);

    # About the index

    my $directory            = $self->directory;
    my $folderdir            = $self->folderdir;

    $self->{MBM_keep_index}  = $args->{keep_index} || 0;
    $self->{MBM_index}       = $args->{index};
    $self->{MBM_index_type}  = $args->{index_type} || 'Mail::Box::MH::Index';
    for($args->{index_filename})
    {  $self->{MBM_index_filename}
          = !defined $_ ? File::Spec->catfile($directory, '.index') # default
          : File::Spec->file_name_is_absolute($_) ? $_              # absolute
          :               File::Spec->catfile($directory, $_);      # relative
    }

    # About labels

    $self->{MBM_labels}      = $args->{labels};
    $self->{MBM_labels_type} = $args->{labels_type} || 'Mail::Box::MH::Labels';
    for($args->{labels_filename})
    {   $self->{MBM_labels_filename}
          = !defined $_ ? File::Spec->catfile($directory, '.mh_sequences')
          : File::Spec->file_name_is_absolute($_) ? $_               # absolute
          :               File::Spec->catfile($directory, $_);       # relative
    }

    $self;
}

#-------------------------------------------

sub create($@)
{   my ($class, $name, %args) = @_;
    my $folderdir = $args{folderdir} || $default_folder_dir;
    my $directory = $class->folderToDirectory($name, $folderdir);

    return $class if -d $directory;
    unless(mkdir $directory, 0700)
    {   warn "Cannot create directory $directory: $!\n";
        return;
    }

    $class;
}

#-------------------------------------------

sub listSubFolders(@)
{   my ($class, %args) = @_;
    my $dir;
    if(ref $class)
    {   $dir   = $class->directory;
        $class = ref $class;
    }
    else
    {   my $folder    = $args{folder}    || '=';
        my $folderdir = $args{folderdir} || $default_folder_dir;
        $dir   = $class->folderToDirectory($folder, $folderdir);
    }

    $args{skip_empty} ||= 0;
    $args{check}      ||= 0;

    # Read the directories from the directory, to find all folders
    # stored here.  Some directories have to be removed because they
    # are created by all kinds of programs, but are no folders.

    return () unless -d $dir && opendir DIR, $dir;

    my @dirs = grep { !/^\d+$|^\./ && -d File::Spec->catfile($dir,$_) && -r _ }
                   readdir DIR;

    closedir DIR;

    # Skip empty folders.  If a folder has sub-folders, then it is not
    # empty.
    if($args{skip_empty})
    {    my @not_empty;

         foreach my $subdir (@dirs)
         {   if(-f File::Spec->catfile($dir,$subdir, "1"))
             {   # Fast found: the first message of a filled folder.
                 push @not_empty, $subdir;
                 next;
             }

             opendir DIR, File::Spec->catfile($dir,$subdir) or next;
             my @entities = grep !/^\./, readdir DIR;
             closedir DIR;

             if(grep /^\d+$/, @entities)   # message 1 was not there, but
             {   push @not_empty, $subdir; # other message-numbers exist.
                 next;
             }

             foreach (@entities)
             {   next unless -d File::Spec->catfile($dir,$subdir,$_);
                 push @not_empty, $subdir;
                 last;
             }

         }

         @dirs = @not_empty;
    }

    # Check if the files we want to return are really folders.

    return @dirs unless $args{check};

    grep { $class->foundIn(File::Spec->catfile($dir,$_)) } @dirs;
}

#-------------------------------------------

sub openSubFolder($@)
{   my ($self, $name) = (shift, shift);
    my $dir = $self->directory . '/' . $name;

    unless(-d $dir || mkdir $dir, 0755)
    {   warn "Cannot create subfolder $name for $self: $!\n";
        return;
    }

    $self->openRelatedFolder(@_, folder => $dir);
}

#-------------------------------------------

=back

=head1 METHODS for extension writers

=over 4

=cut

#-------------------------------------------

sub foundIn($@)
{   my $class = shift;
    my $name  = @_ % 2 ? shift : undef;
    my %args  = @_;
    my $folderdir = $args{folderdir} || $default_folder_dir;
    my $directory = $class->folderToDirectory($name, $folderdir);

    return 0 unless -d $directory;
    return 1 if -f File::Spec->catfile($directory, "1");

    # More thorough search required in case some numbered messages
    # disappeared (lost at fsck or copy?)

    return unless opendir DIR, $directory;
    foreach (readdir DIR)
    {   next unless m/^\d+$/;   # Look for filename which is a number.
        closedir DIR;
        return 1;
    }

    closedir DIR;
    0;
}

#-------------------------------------------

=item highestMessageNumber

Returns the highest number which is used in the folder to store a file.  This
method may be called when the folder is read (then this number can be
derived without file-system access), but also when the folder is not
read (yet).

=cut

sub highestMessageNumber()
{   my $self = shift;

    return $self->{MBM_highest_msgnr}
        if exists $self->{MBM_highest_msgnr};

    my $directory    = $self->directory;

    opendir DIR, $directory or return;
    my @messages = sort {$a <=> $b} grep /^\d+$/, readdir DIR;
    closedir DIR;

    $messages[-1];
}

#-------------------------------------------

=item index

Create a index reader/writer object.

=cut

sub index()
{   my $self  = shift;
    return () unless $self->{MBM_keep_index};
    return $self->{MBM_index} if defined $self->{MBM_index};

    $self->{MBM_index} = $self->{MBM_index_type}->new
     ( filename  => $self->{MBM_index_filename}
     , head_wrap => $self->{MB_head_wrap}
     , $self->logSettings
     )

}

#-------------------------------------------

=item labels

Create a label reader/writer object.

=cut

sub labels()
{   my $self   = shift;
    return $self->{MBM_labels} if defined $self->{MBM_labels};

    $self->{MBM_labels} = $self->{MBM_labels_type}->new
      ( filename => $self->{MBM_labels_filename}
      , $self->logSettings
      )
}

#-------------------------------------------

sub readMessageFilenames
{   my ($self, $dirname) = @_;

    opendir DIR, $dirname or return;
    my @msgnrs
       = sort {$a <=> $b}
            grep { /^\d+$/ && -f File::Spec->catfile($dirname,$_) }
               readdir DIR;

    closedir DIR;

    @msgnrs;
}

#-------------------------------------------

sub readMessages(@)
{   my ($self, %args) = @_;

    my $directory = $self->directory;
    return unless -d $directory;

    my $locker = $self->locker;
    $locker->lock or return;

    my @msgnrs = $self->readMessageFilenames($directory);

    my $index  = $self->{MBM_index};
    unless($index)
    {   $index = $self->index;
        $index->read if $index;
    }

    my $labels = $self->{MBM_labels};
    unless($labels)
    {    $labels = $self->labels;
         $labels->read if $labels;
    }

    my @log    = $self->logSettings;
    foreach my $msgnr (@msgnrs)
    {
        my $msgfile = File::Spec->catfile($directory, $msgnr);

        my $head;
        $head       = $index->get($msgfile) if $index;
        $head     ||= $args{head_delayed_type}->new(@log);

        my $message = $args{message_type}->new
         ( head      => $head
         , filename  => $msgfile
         , folder    => $self
         );

        my $labref  = $labels ? $labels->get($msgnr) : ();
        $message->label(seen => 1, $labref ? @$labref : ());

        my $body    = $args{body_delayed_type}->new(@log, message => $message);
        $message->storeBody($body);

        $self->storeMessage($message);
    }

    $self->{MBM_highest_msgnr}  = $msgnrs[-1];
    $self;
}
 
#-------------------------------------------

=item writeMessages [OPTIONS]

Write all messages to the folder-file.

 OPTION            DEFINED BY         DEFAULT
 force             Mail::Box          <true>
 head_wrap         Mail::Box          72
 keep_deleted      Mail::Box          <false>
 renumber          Mail::Box::MH      <true>
 save_deleted      Mail::Box          <false>

MH specific options:

=over 4

=item * renumber =E<gt> BOOL

Permit renumbering of message.  Bij default this is true, but for some
unknown reason, you may be thinking that messages should not be renumbered.

=back

=cut

sub writeMessages($)
{   my ($self, $args) = @_;

    # Write each message.  Two things complicate life:
    #   1 - we may have a huge folder, which should not be on disk twice
    #   2 - we may have to replace a message, but it is unacceptable
    #       to remove the original before we are sure that the new version
    #       is on disk.

    my $writer    = 0;

    my $locker    = $self->locker;
    unless($locker->lock)
    {   $self->log(ERROR => "Couldn't write without lock.");
        return;
    }

    my $renumber  = exists $args->{renumber} ? $args->{renumber} : 1;
    my $directory = $self->directory;
    my @messages  = @{$args->{messages}};

    foreach my $message (@messages)
    {
        my $filename = $message->filename;

        my $newfile;
        if($renumber || !$filename)
        {    $newfile = $directory . '/' . ++$writer;
        }
        else
        {    $newfile = $filename;
             $writer  = basename $filename;
        }

        if(!$filename)
        {   # New message for this folder.  Messages are only
            # added to the back, so shouldn't cause a problem.

            my $new = FileHandle->new($newfile, 'w') or die;
            $message->print($new);
            $new->close;
            $message->filename($newfile);
        }
        elsif($message->modified)
        {   # Write modified messages.
            my $oldtmp   = $filename . '.old';
            move $filename, $oldtmp;

            my $new = FileHandle->new($newfile, 'w') or die;
            $message->print($new);
            $new->close;

            unlink $oldtmp;
            $message->filename($newfile);
        }
        elsif($filename eq $newfile)
        {   # Nothing changed: content nor message-number.
        }
        else
        {   # Unmodified messages, but name changed.
            move $filename, $newfile;
            $message->filename($newfile);
        }
    }

    # Write the labels- and the index-file.

    my $labels = $self->labels;
    $labels->write(@messages) if $labels;

    my $index  = $self->index;
    $index->write(@messages) if $index;

    $locker->unlock;

    # Remove an empty folder.  This is done last, because the code before
    # in this method will have cleared the contents of the directory.

    if(!@messages && $self->{MB_remove_empty})
    {   # If something is still in the directory, this will fail, but I
        # don't mind.
        rmdir $directory;
    }

    $self;
}

#-------------------------------------------

sub appendMessages(@)
{   my $class  = shift;
    my %args   = @_;

    my @messages = exists $args{message} ? $args{message}
                 : exists $args{messages} ? @{$args{messages}}
                 : return ();

    my $self     = $class->new(@_, access => 'a');
    my $directory= $self->directory;
    return unless -d $directory;

    my $locker   = $self->locker;
    unless($locker->lock)
    {   $self->log(ERROR => "Cannot append message after $self without lock.");
        return;
    }

    my $msgnr    = $self->highestMessageNumber +1;

    foreach my $message (@messages)
    {
        my $filename = File::Spec->catfile($directory,$msgnr);

        if(my $new = FileHandle->new($filename, 'w'))
        {   $message->print($new);
            $message->filename($filename);
            $new->close;
        }
        else
        {   $self->log(ERROR =>
                "Unable to write message $msgnr to $filename: $!\n");
        }

        $msgnr++;
    }
 
    my $labels   = $self->labels->append(@messages);

    $locker->unlock;
    $self->close;

    @messages;
}

#-------------------------------------------

=back

=head1 IMPLEMENTATION

=head2 How MH-folders work

MH-type folders use a directory to store the messages of one folder.  Each
message is stored in a separate file.  This seems useful, because changes
in a folder change only a few of these small files, in contrast with
file-based folders where changes in a folder cause rewrites of huge
folder files.

However, MH-based folders perform very bad if you need header information
of all messages.  For instance, if you want to have full knowledge about
all message-threads (see C<Mail::Box::Thread::Manager>) in the folder, it
requires to read all header lines in all message files.  And usually, reading
your messages in threads is desired.

So, each message is written in a separate file.  The filenames are
numbers, which count from C<1>.  Next to these message files, a
directory may contain a file named C<.mh_sequences>, storing labels which
relate to the messages.  Furthermore, a folder-directory may contain
sub-directories, which are seen as sub-folders.

=head2 Labels

User actions on a message are flagged with a label.  When the folder is
opened, these flags are read from the C<.mh_sequences> file.  When the
folder is closed that file gets updated.  C<Status> and C<X-Status> lines
in the message headers -as used by Mbox folders- are only looked at when
new messages are added to the folder.  These lines are only updated when a
MH message has to be written to a folder for some reason.

=head2 This implementation

This implementation supports the C<.mh-sequences> file and sub-folders.
Next to this, considerable effort it made to avoid reading each message-file.
This should boost performance of the C<Mail::Box> module over other
Perl-modules which are able to read folders.

Folder-types which store their messages each in one file, together in
one directory, are bad for performance.  Consider that you want to know
the subjects of all messages, while browser through a folder with your
mail-reading client.  This would cause all message-files to be read.

C<Mail::Box::MH> has two ways to try improve performance.  You can use
an index-file, and use on delay-loading.  The combination performs even
better.  Both are explained in the next sections.

=head2 An index-file

If you specify C<keep_index> as option to the folder creation method
C<new()>, then all header-lines of all messages from the folder which
have been read once, will also be written into one dedicated index-file
(one file per folder).  The default filename is C<.index>

However, index-files are not supported by any other reader which supports
MH (as far as I know).  If you read the folders with such I client, it
will not cause unrecoverable conflicts with this index-file, but at most
be bad for performance.

If you do not (want to) use an index-file, then delay-loading may
save your day.

=head1 SEE ALSO

L<Mail::Box-Overview>

=head1 AUTHOR

Mark Overmeer (F<mailbox@overmeer.net>).
All rights reserved.  This program is free software; you can redistribute
it and/or modify it under the same terms as Perl itself.

=head1 VERSION

This code is beta, version 2.011.

Copyright (c) 2001 Mark Overmeer. All rights reserved.
This program is free software; you can redistribute it and/or modify
it under the same terms as Perl itself.

=cut

1;