NAME
Twitter::Text - Perl implementation of the twitter-text parsing library
SYNOPSIS
use Twitter::Text;
my $result = parse_tweet('Hello world こんにちは世界');
print $result->{valid} ? 'valid tweet' : 'invalid tweet';
DESCRIPTION
Twitter::Text is a Perl implementation of the twitter-text parsing library.
WARNING
This library does not implement auto-linking and hit highlighting.
Please refer to Implementation status for the latest status.
FUNCTIONS
All functions below are exported by default.
Extraction
extract_hashtags
my \@hashtags = extract_hashtags($text);
extract_hashtags_with_indices
my \@hashtags_with_indices = extract_hashtags_with_indices($text, [\%options]);
extract_mentioned_screen_names
my \@screen_names = extract_mentioned_screen_names($text);
extract_mentioned_screen_names_with_indices
my \@screen_names_with_indices = extract_mentioned_screen_names_with_indices($text);
extract_mentions_or_lists_with_indices
my \@mentions_or_lists_with_indices = extract_mentions_or_lists_with_indices($text);
extract_urls
my \@urls = extract_urls($text);
extract_urls_with_indices
my \@urls = extract_urls_with_indices($text, [\%options]);
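A minimal usage sketch for the plain (non-indexed) extractors follows. It assumes that each function returns an array reference, as the `\@` notation in the signatures above suggests; the sample text, screen name, and URL are made up for illustration.

```perl
use strict;
use warnings;
use Twitter::Text;    # extract_* functions are exported by default

# Illustrative input only; the screen name and URL are placeholders.
my $text = 'Reading about #perl with @example_user at https://example.com/docs';

# Assumption: each extractor returns an array reference.
my $hashtags     = extract_hashtags($text);
my $screen_names = extract_mentioned_screen_names($text);
my $urls         = extract_urls($text);

print 'hashtags: ',     join(', ', @$hashtags),     "\n";
print 'screen names: ', join(', ', @$screen_names), "\n";
print 'urls: ',         join(', ', @$urls),         "\n";
```

The *_with_indices variants are not shown because the exact shape of their return values is not described in this document; dumping the result with Data::Dumper is a safe way to inspect it.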
Validation
parse_tweet
my \%parse_result = parse_tweet($text, [\%options]);
The parse_tweet function takes a $text string and an optional \%options parameter, and returns a hash reference with the following values:
weighted_length - The overall length of the tweet with code points weighted per the ranges defined in the configuration file.
permillage - Indicates the proportion (per thousand) of the weighted length in comparison to the max weighted length. A value > 1000 indicates input text that is longer than the allowable maximum.
valid - Indicates if input text length corresponds to a valid result.
display_range_start, display_range_end - An array of two Unicode code point indices identifying the inclusive start and exclusive end of the displayable content of the Tweet.
valid_range_start, valid_range_end - An array of two Unicode code point indices identifying the inclusive start and exclusive end of the valid content of the Tweet.
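The weighting described for weighted_length can be sketched in plain Perl. This is not the library's code: the helper name sketch_weighted_length is hypothetical, the ranges and weights are assumptions modeled on the upstream twitter-text v3 configuration, and URL and emoji handling are deliberately left out.

```perl
use strict;
use warnings;
use utf8;                   # the string literal below contains non-ASCII characters
use Encode qw(encode);

# Assumed configuration values (modeled on the upstream twitter-text v3 config).
my %config = (
    scale          => 100,
    default_weight => 200,
    ranges         => [
        { start => 0,    end => 4351, weight => 100 },
        { start => 8192, end => 8205, weight => 100 },
        { start => 8208, end => 8223, weight => 100 },
        { start => 8242, end => 8247, weight => 100 },
    ],
);

# Hypothetical helper: sums the weight of every code point, then divides by the scale.
# It ignores URLs and emoji, which the real parser treats specially.
sub sketch_weighted_length {
    my ($text) = @_;
    my $total = 0;
    for my $char (split //, $text) {
        my $code_point = ord $char;
        my $weight     = $config{default_weight};
        for my $range (@{ $config{ranges} }) {
            if ($code_point >= $range->{start} && $code_point <= $range->{end}) {
                $weight = $range->{weight};
                last;
            }
        }
        $total += $weight;
    }
    return int($total / $config{scale});
}

my $text = 'Hello world こんにちは世界';

# Decoded characters: 12 ASCII characters at weight 100 plus 7 CJK characters at weight 200.
print sketch_weighted_length($text), "\n";                     # 26

# The same text as 33 UTF-8 bytes, each falling in the first range.
print sketch_weighted_length(encode('UTF-8', $text)), "\n";    # 33
```

Under these assumptions the result depends on whether the input is a decoded character string or raw UTF-8 bytes, so it is worth checking how the text was read before comparing lengths.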
EXAMPLES
use Data::Dumper;
use Twitter::Text;
my $result = parse_tweet('Hello world こんにちは世界');
print Dumper($result);
# $VAR1 = {
# 'weighted_length' => 33,
# 'permillage' => 117,
# 'valid' => 1,
# 'display_range_start' => 0,
# 'display_range_end' => 32,
# 'valid_range_start' => 0,
# 'valid_range_end' => 32,
# };
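The permillage value in the output above can be checked by hand. The sketch below assumes a maximum weighted length of 280, the upstream twitter-text default; this document does not state that number, and the variable names are illustrative.

```perl
use strict;
use warnings;

my $max_weighted_length = 280;    # assumption: upstream default maximum
my $weighted_length     = 33;     # taken from the Dumper output above

# Proportion per thousand, rounded down.
my $permillage = int($weighted_length * 1000 / $max_weighted_length);

# A value above 1000 means the text exceeds the maximum.
my $within_limit = $permillage <= 1000 ? 1 : 0;

print "permillage=$permillage within_limit=$within_limit\n";
# prints: permillage=117 within_limit=1
```

Note that within_limit only covers the length check; the valid flag returned by parse_tweet may depend on further conditions that are not modeled here.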
is_valid_hashtag
my $valid = is_valid_hashtag($hashtag);
is_valid_list
my $valid = is_valid_list($username_list);
is_valid_url
my $valid = is_valid_url($url, [unicode_domains => 1, require_protocol => 1]);
is_valid_username
my $valid = is_valid_username($username);
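A short usage sketch for the validators follows. The inputs are illustrative, and it assumes, following the upstream twitter-text convention, that hashtags and usernames are passed with their leading `#` and `@`. No expected output is shown because it has not been verified against this library.

```perl
use strict;
use warnings;
use Twitter::Text;    # is_valid_* functions are exported by default

# Assign in scalar context so a false result is kept as a single value.
my $hashtag_ok  = is_valid_hashtag('#perl');
my $username_ok = is_valid_username('@example_user');
my $url_ok      = is_valid_url('https://example.com/');

for my $check (
    [ hashtag  => $hashtag_ok ],
    [ username => $username_ok ],
    [ url      => $url_ok ],
) {
    my ($label, $ok) = @$check;
    printf "%-8s %s\n", $label, $ok ? 'valid' : 'invalid';
}
```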
SEE ALSO
twitter-text. The implementation of Twitter::Text (this library) is heavily based on the Ruby implementation of twitter-text.
https://developer.twitter.com/en/docs/counting-characters
COPYRIGHT & LICENSE
Copyright (C) Twitter, Inc and other contributors
Copyright (C) utgwkk.
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
AUTHOR
utgwkk <utagawakiki@gmail.com>