NAME
Lingua::RU::Antimat - Perl module for removing Russian slang from chats, guestbooks, etc.
SYNOPSIS
use POSIX qw(locale_h);
use Lingua::RU::Antimat;
use locale;
#use a Russian locale that matches the text encoding
setlocale(LC_CTYPE, "ru_RU.CP1251");
$dirty_text = 'text with slang';
$mat = Lingua::RU::Antimat->new;
#load a dictionary with additional words
$mat->load_dict('/home/www/badwords');
#set the string that replaces bad words
$mat->set_bip('Sorry!');
$clean_text = $mat->remove_slang($dirty_text);
RUSSIAN DOCUMENTATION
Detailed documentation and a tutorial in Russian are available at http://www.tcen.ru/antimat
DESCRIPTION
This module removes Russian slang from a string. 'Mat' is the Russian name for such bad words, which is why the module is called Antimat.
- $mat=Lingua::RU::Antimat->new($codepage);
Creates and returns a new object. If new() is called without arguments, the module uses templates for text encoded in win-1251. If your text is encoded in KOI8-R, set $codepage to 'koi8'.
Examples:
$mat=Lingua::RU::Antimat->new; #for text in win-1251
$mat=Lingua::RU::Antimat->new('koi8'); #for text in KOI8-R
- $clean_text=$mat->remove_slang($dirty_text);
remove_slang takes a string and returns a string in which every bad word is replaced with the Russian analog of 'bip', or with the string you set through set_bip (described below).
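As a minimal sketch of applying remove_slang to several strings at once (for example, a page of guestbook entries), the helper below maps the method over a list. The name clean_entries is hypothetical and not part of the module; $mat is assumed to be an object created with Lingua::RU::Antimat->new.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Lingua::RU::Antimat).
# $mat is assumed to be an object that provides remove_slang,
# for example one returned by Lingua::RU::Antimat->new.
# Returns the cleaned entries in their original order.
sub clean_entries {
    my ($mat, @entries) = @_;
    return map { $mat->remove_slang($_) } @entries;
}
```

It could be called as @clean = clean_entries($mat, @guestbook_entries), where @guestbook_entries is whatever list of strings the site already holds.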
- $badwords=$mat->detect_slang($dirty_text);
detect_slang takes a string and returns a boolean value: 1 if the string contains a bad word, 0 otherwise.
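The boolean result lends itself to rejecting a message outright instead of rewriting it. The following is only a sketch under that assumption: check_message and the rejection text are hypothetical names chosen for illustration, and $mat is assumed to be an object created with Lingua::RU::Antimat->new.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Lingua::RU::Antimat).
# $mat is assumed to be an object that provides detect_slang,
# for example one returned by Lingua::RU::Antimat->new.
# Returns a two-element list: (accepted flag, text to show the user).
sub check_message {
    my ($mat, $text) = @_;

    # detect_slang is documented to return 1 on a match and 0 otherwise
    if ($mat->detect_slang($text)) {
        return (0, 'Your message was rejected.');
    }
    return (1, $text);
}
```

A caller might write ($ok, $result) = check_message($mat, $dirty_text) and store $result only when $ok is true.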
- $mat->set_bip($bip);
Sets the string (usually a single word) that remove_slang substitutes for bad words.
Examples:
$mat->set_bip(''); #strip slang out entirely
$mat->set_bip('I am sorry!'); #longer, but also valid
- $mat->load_dict($file);
Loads a dictionary of additional bad words. Each string in the dictionary should be a word or a regular expression. $file may be a relative or an absolute path to the dictionary.
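A dictionary file could be prepared along these lines. This is a sketch only: write_dict is a hypothetical helper, and the one-entry-per-line layout is an assumption drawn from the description above, not a confirmed file format. The entries should presumably be in the same encoding as the text being filtered.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Lingua::RU::Antimat).
# Writes one dictionary entry per line; the one-per-line layout is an
# assumption, not a documented format.
# Returns the number of entries written.
sub write_dict {
    my ($path, @entries) = @_;
    open(my $fh, '>', $path) or die "Cannot write $path: $!";
    print {$fh} "$_\n" for @entries;
    close($fh) or die "Cannot close $path: $!";
    return scalar @entries;
}
```

After write_dict('badwords', @entries), where each entry is a plain word or a regular expression, the file would be loaded with $mat->load_dict('badwords').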
SEE ALSO
Detailed documentation in Russian at http://www.tcen.ru/antimat
perllocale manpage
CREDITS
Andrey Skorohod, marlenus@marlenus.com, for his bug reports.
Vladimir Zhdanov, vovka@lg.kamaz.net, for his bug report.
Andrey Sharapov, Sharapov@tut.by, for his suggestions.
Yury Voloshin, xtc@norilsk.net, for his bug report and suggestions.
Thanks!
AUTHOR
Ilya Soldatkin, arc@tcen.ru
Drop me a line if you deploy this module on your site. Think of it as a small contribution to my effort in writing and supporting this module. I cannot keep improving the module if I believe that no one uses it.
COPYRIGHT
Copyright 2001-2003 Ilya Soldatkin. All rights reserved.
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.