NAME

Text::MeCab - Alternate Interface To libmecab

SYNOPSIS

use Text::MeCab;
my $mecab = Text::MeCab->new({
  rcfile             => $rcfile,
  dicdir             => $dicdir,
  userdic            => $userdic,
  lattice_level      => $lattice_level,
  all_morphs         => $all_morphs,
  output_format_type => $output_format_type,
  partial            => $partial,
  node_format        => $node_format,
  unk_format         => $unk_format,
  bos_format         => $bos_format,
  eos_format         => $eos_format,
  input_buffer_size  => $input_buffer_size,
  allocate_sentence  => $allocate_sentence,
  nbest              => $nbest,
  theta              => $theta,
});

for (my $node = $mecab->parse($text); $node; $node = $node->next) {
   # See perldoc for Text::MeCab::Node for the list of methods
   print $node->surface, "\n";
}

# use constants
use Text::MeCab qw(:all);
use Text::MeCab qw(MECAB_NOR_NODE);

# want to use command line arguments?
my $mecab = Text::MeCab->new("--userdic=/foo/bar/baz", "-P");

# check what mecab version we compiled against?
print "Compiled with ", &Text::MeCab::MECAB_VERSION, "\n";

DESCRIPTION

libmecab (http://mecab.sourceforge.ne.jp) already has a perl interface built with it, so why a new module? It is a subtle difference, but I just feel that exposing the perl interface through a tied hash is just... weird.

So Text::MeCab gives you a more natural, Perl-ish way to access libmecab!

METHODS

new HASHREF | LIST

Creates a new Text::MeCab instance.

You can either specify a hashref and use named parameters, or you can use the exact command line arguments that the mecab command accepts.

Below is the list of accepted named options. See the man page for mecab for details about each option.

rcfile
dicdir
lattice_level
all_morphs
output_format_type
partial
node_format
unk_format
bos_format
eos_format
input_buffer_size
allocate_sentence
nbest
theta
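
To make the relationship between the two calling styles concrete, here is a minimal sketch of how named options could be turned into mecab-style long options. It assumes that each named option corresponds to the mecab long option of the same name with underscores replaced by hyphens, and that all_morphs, partial and allocate_sentence are valueless switches. The helper options_to_args is hypothetical and is not part of Text::MeCab; it only illustrates the correspondence.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT part of Text::MeCab: build mecab-style long
# options from a hashref of named options.
# Assumption: these three options are switches that take no value.
my %IS_FLAG = map { $_ => 1 } qw(all_morphs partial allocate_sentence);

sub options_to_args {
    my ($opts) = @_;
    my @args;
    for my $name (sort keys %$opts) {
        my $value = $opts->{$name};
        next unless defined $value;          # skip options left unset
        (my $flag = $name) =~ s/_/-/g;       # lattice_level -> lattice-level
        if ($IS_FLAG{$name}) {
            push @args, "--$flag" if $value; # emit a switch only when true
        }
        else {
            push @args, "--$flag=$value";
        }
    }
    return @args;
}

my @args = options_to_args({
    dicdir        => "/path/to/dic",
    lattice_level => 1,
    all_morphs    => 1,
    partial       => 0,
});
print join(" ", @args), "\n";
# prints: --all-morphs --dicdir=/path/to/dic --lattice-level=1
```

A list built this way has the same shape as the LIST form of new, for example Text::MeCab->new(@args).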

parse SCALAR

Parses the given text via mecab, and returns a Text::MeCab::Node object.

NOTES ABOUT PARSED STRUCTURE

Please note that Text::MeCab::parse() creates Text::MeCab::Node objects that are detached from the libmecab Tagger. This is to allow these Perl-ish idioms:

my $node;
{
   my $mecab = Text::MeCab->new;
   $node = $mecab->parse($text);
   # $mecab goes out of scope
}

for(; $node; $node = $node->next) {
   print $node->surface, "\n";
}

and,

my $mecab  = Text::MeCab->new;
my $node_A = $mecab->parse($text_A);
my $node_B = $mecab->parse($text_B);

If we were to use the mecab nodes directly, we would have to carefully control the scope of the mecab tagger object relative to the nodes. Since this is Perl, I chose simplicity over efficiency. Let me know if there are problems with this approach.
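
The trade-off can be illustrated with a toy model in pure Perl. This is only a sketch: ToyTagger and its methods are hypothetical stand-ins, not Text::MeCab's implementation, and it assumes a tagger that reuses one internal buffer for every parse. Results that point into that buffer change under the caller's feet, while copied results stay valid even after the tagger is gone.

```perl
use strict;
use warnings;

# Toy stand-in for a tagger that reuses one internal buffer per parse.
# All names here are hypothetical; this is not Text::MeCab's code.
package ToyTagger;

sub new {
    my ($class) = @_;
    return bless { buffer => [] }, $class;
}

# "Attached" result: a reference into the tagger's own buffer.
sub parse_attached {
    my ($self, $text) = @_;
    @{ $self->{buffer} } = split ' ', $text;   # overwrites the previous parse
    return $self->{buffer};
}

# "Detached" result: an independent copy owned by the caller.
sub parse_detached {
    my ($self, $text) = @_;
    return [ @{ $self->parse_attached($text) } ];
}

package main;

my $tagger = ToyTagger->new;

# Both results share one buffer, so the first is clobbered by the second.
my $attached_a = $tagger->parse_attached("a b c");
my $attached_b = $tagger->parse_attached("x y");
print "attached A: @$attached_a\n";   # prints: attached A: x y

# Copies are independent of each other and of the tagger's lifetime.
my $detached_a = $tagger->parse_detached("a b c");
my $detached_b = $tagger->parse_detached("x y");
undef $tagger;
print "detached A: @$detached_a\n";   # prints: detached A: a b c
print "detached B: @$detached_b\n";   # prints: detached B: x y
```

The copy costs extra time and memory on every parse, which is the efficiency given up in exchange for not having to think about scope.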

AUTHOR
