NAME

HTML::Feature - extracts feature sentences from HTML

SYNOPSIS

    use strict;
    use HTML::Feature;

    my $f = HTML::Feature->new(
        enc_type => 'utf-8',
        ret_num => 10,
        max_bytes => 5000,
        min_bytes => 1
    );
    my $data = $f->extract( url => 'http://www.perl.com' );

    # print result data

    print $data->{title}, "\n";
    print $data->{description}, "\n";

    for(@{$data->{block}}){
        print $_->{score}, "\n";
        print $_->{contents}, "\n";
    }

DESCRIPTION

This module extracts feature blocks from an HTML document. It does not use general techniques such as "morphological analysis"; instead, it picks out the feature blocks with simpler statistical processing. As a result, it should be easy to apply to documents written in any language.
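The scoring statistic itself is not documented here. Purely as an illustration of how a language-independent statistic can rank blocks without any linguistic analysis, the sketch below scores each text block by how far its length lies above the mean, in standard deviations. The function name rank_blocks and the choice of a length z-score are assumptions made for this example, not the module's actual algorithm.

```perl
use strict;
use warnings;

# Illustration only: rank text blocks by the z-score of their length.
# rank_blocks() is a hypothetical helper, not part of HTML::Feature,
# and the statistic is an assumption chosen for this example.
sub rank_blocks {
    my @blocks = @_;
    return () unless @blocks;

    # Length of every block, then mean and (population) standard deviation.
    my @len  = map { length $_ } @blocks;
    my $mean = 0;
    $mean += $_ for @len;
    $mean /= @len;

    my $var = 0;
    $var += ( $_ - $mean )**2 for @len;
    $var /= @len;
    my $sd = sqrt $var;

    # Pair each block with its score; all scores are 0 when lengths are equal.
    my @scored = map {
        +{
            contents => $blocks[$_],
            score    => $sd ? ( $len[$_] - $mean ) / $sd : 0,
        }
    } 0 .. $#blocks;

    # Highest score first.
    return sort { $b->{score} <=> $a->{score} } @scored;
}
```

Because the score depends only on lengths, short navigation labels fall below the mean and a long body paragraph rises to the top, whatever language the text is written in.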

METHODS

new([options])

Creates a new object using the given options.

extract(url => $url | string => $string)

Takes either a URL or an HTML string and returns the feature blocks (as references), together with the TITLE and DESCRIPTION of the document.
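A minimal sketch of the string form of extract(), assuming the returned hash has the layout shown in SYNOPSIS (title, description, and a block list of score/contents pairs). The helper format_result and the sample HTML are hypothetical and exist only for this example; the call to the module is skipped when HTML::Feature is not installed.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of HTML::Feature): renders the hash
# returned by extract() as plain text, assuming the SYNOPSIS layout.
sub format_result {
    my ($data) = @_;
    my @lines;
    push @lines, 'title: '
      . ( defined $data->{title} ? $data->{title} : '' );
    push @lines, 'description: '
      . ( defined $data->{description} ? $data->{description} : '' );
    for my $block ( @{ $data->{block} || [] } ) {
        push @lines, sprintf( '[%s] %s', $block->{score}, $block->{contents} );
    }
    return join( "\n", @lines ) . "\n";
}

# Only call the module when it is installed, so the sketch runs either way.
if ( eval { require HTML::Feature; 1 } ) {
    # Sample document, made up for this example.
    my $html = '<html><head><title>Example</title></head>'
             . '<body><div>Some body text to analyze.</div></body></html>';

    my $f    = HTML::Feature->new( ret_num => 1 );
    my $data = $f->extract( string => $html );
    print format_result($data);
}
```

Passing string => is convenient when the HTML has already been fetched or generated elsewhere, since no network request is needed.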

OPTIONS

    # these values can be passed to the constructor
    my $f = HTML::Feature->new(

        ret_num => 1,
        # number of blocks to return (default is '1').

        max_bytes => '5000',
        # upper limit on the number of bytes in a node to analyze (default is '').

        min_bytes => '10',
        # lower limit on the number of bytes in a node to analyze (default is '').

        enc_type => 'euc-jp',
        # any character encoding; if none is specified, the result is
        # returned as a string with the UTF-8 flag set (default is '').
    );

SEE ALSO

HTML::TreeBuilder, Statistics::Lite, Encode::Detect

AUTHOR

Takeshi Miki <miki@cpan.org>

COPYRIGHT AND LICENSE

Copyright (C) 2007 Takeshi Miki

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.8 or, at your option, any later version of Perl 5 you may have available.
