NAME
freqtable - Print frequency table of lines/words/characters/bytes/numbers
VERSION
This document describes version 0.008 of freqtable (from Perl distribution App-freqtable), released on 2023-12-28.
SYNOPSIS
% freqtable [OPTIONS] < INPUT
Sample input:
% cat input-lines.txt
one
one
two
three
four
five
five
five
six
seven
eight
eight
nine
% cat input-words.txt
one one two three four five five five six seven eight eight nine
% cat input-nums.txt
9.99 cents
9.99 dollars
9 cents
Modes
Display frequency table (by default: lines):
% freqtable input-lines.txt
3 five
2 eight
2 one
1 four
1 nine
1 seven
1 six
1 three
1 two
Display frequency table (words):
% freqtable -w input-words.txt
3 five
2 eight
2 one
1 four
1 nine
1 seven
1 six
1 three
1 two
Display frequency table (characters):
% freqtable -c input-words.txt
12
12 e
7 i
5 n
4 f
4 o
4 t
4 v
3 h
2 g
2 r
2 s
1
1 u
1 w
1 x
Display frequency table (numbers):
% freqtable -n input-nums.txt
2 9.99
1 9
Display frequency table (integers):
% freqtable -i input-nums.txt
3 9
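The modes above differ only in how the input is split into items before counting. The following Python sketch illustrates that idea; it is not freqtable's implementation. The function name tokenize, the whitespace-based word splitting, and the leading-number pattern are all assumptions, chosen so that the results agree with the sample outputs shown above.

```python
import re
from collections import Counter

# Assumed pattern: optional sign, digits, optional fractional part at the
# start of the line (hypothetical; chosen to match the samples above).
_LEADING_NUMBER = re.compile(r"\s*([+-]?\d+(?:\.\d+)?)")


def tokenize(text, mode="lines"):
    """Split text into the items to be counted for the given mode."""
    if mode == "lines":
        return text.splitlines()
    if mode == "words":
        return text.split()  # assumption: words are whitespace-separated
    if mode == "chars":
        return list(text)  # includes spaces and newlines
    if mode in ("number", "integer"):
        items = []
        for line in text.splitlines():
            match = _LEADING_NUMBER.match(line)
            if match is None:
                continue  # assumption: lines without a leading number are skipped
            number = match.group(1)
            if mode == "integer":
                number = str(int(float(number)))  # truncate toward zero
            items.append(number)
        return items
    raise ValueError(f"unknown mode: {mode}")


nums = "9.99 cents\n9.99 dollars\n9 cents\n"
print(Counter(tokenize(nums, "number")))   # Counter({'9.99': 2, '9': 1})
print(Counter(tokenize(nums, "integer")))  # Counter({'9': 3})
```

Keeping the splitting step separate from the counting step means every mode can share the same Counter-based tally.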
-F option
Don't display the frequencies:
% freqtable -F input-lines.txt
five
eight
one
four
nine
seven
six
three
two
Filter by frequencies
Only display lines that appear three times:
% freqtable -F input-lines.txt --freq 3
3 five
Only display lines that appear more than once:
% freqtable -F input-lines.txt --freq 2-
3 five
2 eight
2 one
Only display lines that appear fewer than three times:
% freqtable -F input-lines.txt --freq -2
2 eight
2 one
1 four
1 nine
1 seven
1 six
1 three
1 two
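The --freq range syntax used above (N, M-N, M-, -N) can be read as an inclusive lower and upper bound on the count. The sketch below shows one way to parse and apply it in Python. The names parse_freq_spec and filter_by_freq are hypothetical helpers for illustration, not part of freqtable.

```python
def parse_freq_spec(spec):
    """Return an inclusive (low, high) pair; None means unbounded."""
    if "-" not in spec:
        n = int(spec)
        return n, n
    low, _, high = spec.partition("-")
    return (int(low) if low else None, int(high) if high else None)


def filter_by_freq(counts, spec):
    """Keep only the items whose count falls within the parsed range."""
    low, high = parse_freq_spec(spec)
    return {
        key: n
        for key, n in counts.items()
        if (low is None or n >= low) and (high is None or n <= high)
    }


counts = {"five": 3, "eight": 2, "one": 2, "four": 1}
print(filter_by_freq(counts, "3"))   # {'five': 3}
print(filter_by_freq(counts, "2-"))  # {'five': 3, 'eight': 2, 'one': 2}
print(filter_by_freq(counts, "-2"))  # {'eight': 2, 'one': 2, 'four': 1}
```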
Sorting
By default, the output is sorted by frequency in descending order. If you specify --sort-sub (and optionally one or more --sort-arg), you can instead sort by key using one of the Sort::Sub::* subroutines. Examples:
# sort by keys, asciibetically
% freqtable -F input-lines.txt --sort-sub asciibetically
2 eight
3 five
1 four
1 nine
2 one
1 seven
1 six
1 three
1 two
# sort by keys, asciibetically (descending order)
% freqtable -F input-lines.txt --sort-sub 'asciibetically<r>'
1 two
1 three
1 six
1 seven
2 one
1 nine
1 four
3 five
2 eight
# sort by keys randomly, using Perl code (essentially, shuffling)
% freqtable -F input-lines.txt --sort-sub 'by_perl_code' --sort-arg 'code=int(rand()*3)-1'
3 five
1 three
2 eight
1 seven
2 one
1 six
1 nine
1 two
1 four
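Sorting by key instead of by frequency, with the optional <i> and <r> suffixes, can be approximated in Python as follows. This only illustrates the resulting order; it does not use or reproduce Sort::Sub, it handles only the asciibetically routine, and parse_sort_spec and sort_by_key are hypothetical names.

```python
import re


def parse_sort_spec(spec):
    """Split e.g. 'asciibetically<ir>' into (name, ignore_case, reverse)."""
    match = re.fullmatch(r"(\w+)(?:<(i|r|ir)>)?", spec)
    if match is None:
        raise ValueError(f"invalid sort spec: {spec!r}")
    name, flags = match.group(1), match.group(2) or ""
    return name, "i" in flags, "r" in flags


def sort_by_key(counts, spec):
    """Return (key, count) pairs ordered by key, compared by code point."""
    name, ignore_case, reverse = parse_sort_spec(spec)
    if name != "asciibetically":
        raise ValueError("this sketch only handles 'asciibetically'")
    if ignore_case:
        key_func = lambda pair: pair[0].lower()
    else:
        key_func = lambda pair: pair[0]
    return sorted(counts.items(), key=key_func, reverse=reverse)


counts = {"one": 2, "two": 1, "three": 1, "four": 1, "five": 3,
          "six": 1, "seven": 1, "eight": 2, "nine": 1}
for key, n in sort_by_key(counts, "asciibetically<r>"):
    print(n, key)
```

With the sample counts, the loop prints the keys from two down to eight, the same order as the descending example above.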
DESCRIPTION
This utility counts the occurrences of lines (or words/characters) in the input, then displays each unique line along with its number of occurrences. You can also instruct it to show only lines that have a specified number of occurrences.
You can use the following Unix command to count occurrences of lines:
% sort input-lines.txt | uniq -c | sort -nr
and with a bit more work you can also use a combination of existing Unix commands to count occurrences of words/characters, as well as filter items that have a specified number of occurrences; freqtable basically offers convenience.
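For comparison, the same counting can be sketched in a few lines of Python. This illustrates the idea rather than freqtable's implementation; freq_table is a hypothetical helper, and breaking ties by key is an assumption made to match the sample output in the SYNOPSIS.

```python
from collections import Counter


def freq_table(lines):
    """Return (count, line) pairs, most frequent first, ties ordered by line."""
    counts = Counter(lines)
    return sorted(((n, line) for line, n in counts.items()),
                  key=lambda pair: (-pair[0], pair[1]))


sample = ["one", "one", "two", "three", "four", "five", "five", "five",
          "six", "seven", "eight", "eight", "nine"]
for count, line in freq_table(sample):
    print(count, line)
```

Run on the sample lines from the SYNOPSIS, this prints 3 five first, then 2 eight and 2 one, followed by the six items that occur once.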
EXIT CODES
0 on success.
255 on I/O error.
99 on command-line options error.
OPTIONS
--bytes, -c
--chars, -m
--words, -w
--lines, -l
--number, -n
Treat each line as a number. A line like this:
9.99 cents
will be regarded as:
9.99
--integer, -i
Treat each line as an integer. A line like this:
9.99 cents
will be regarded as:
9
--ignore-case, -f
--no-print-freq, -F
Will not print the frequencies.
--freq=s
Filter by frequencies.
N (e.g. --freq 5) means only display items that occur N times.
M-N (e.g. --freq 5-10) means only display items that occur between M and N times.
M- (e.g. --freq 5-) means only display items that occur at least M times.
-N (e.g. --freq -10) means only display items that occur at most N times.
--sort-sub=s
This will cause freqtable to sort by key name instead of by frequencies. You pass this option to specify a Sort::Sub routine, which is the name of a Sort::Sub::* module without the Sort::Sub:: prefix, e.g. asciibetically. The name can optionally be followed by <i>, <r>, or <ir> to mean case-insensitive sorting, reverse order, and reverse order case-insensitive sorting, respectively. When you use one of these suffixes on the command line, remember to quote it, since < and > can be interpreted by the shell.
Examples:
asciibetically
asciibetically<i>
by_length<r>
--sort-arg=ARGNAME=ARGVALUE
Pass argument(s) to the sort subroutine. Can be specified multiple times, once for every argument.
-a
Shortcut for --sort=asciibetically.
--percent, -p
Show frequencies as percentages.
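As a rough illustration of --percent, the Python sketch below converts counts into percentages of the total number of items. The exact figures and formatting that freqtable prints are not specified here, so taking the total over all items, rounding to two decimals, and the name as_percentages are assumptions.

```python
def as_percentages(counts):
    """Map each key to its share of the total count, in percent."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {key: round(100 * n / total, 2) for key, n in counts.items()}


print(as_percentages({"9.99": 2, "9": 1}))  # {'9.99': 66.67, '9': 33.33}
```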
FAQ
HOMEPAGE
Please visit the project's homepage at https://metacpan.org/release/App-freqtable.
SOURCE
Source repository is at https://github.com/perlancar/perl-App-freqtable.
SEE ALSO
Unix commands wc, sort, uniq
wordstat from App::wordstat
csv-freqtable from App::CSVUtils
AUTHOR
perlancar <perlancar@cpan.org>
CONTRIBUTING
To contribute, you can send patches by email/via RT, or send pull requests on GitHub.
Most of the time, you don't need to build the distribution yourself. You can simply modify the code, then test via:
% prove -l
If you want to build the distribution (e.g. to try to install it locally on your system), you can install Dist::Zilla, Dist::Zilla::PluginBundle::Author::PERLANCAR, Pod::Weaver::PluginBundle::Author::PERLANCAR, and sometimes one or two other Dist::Zilla- and/or Pod::Weaver plugins. Any additional steps required beyond that are considered a bug and can be reported to me.
COPYRIGHT AND LICENSE
This software is copyright (c) 2023, 2022, 2018 by perlancar <perlancar@cpan.org>.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.
BUGS
Please report any bugs or feature requests on the bugtracker website https://rt.cpan.org/Public/Dist/Display.html?Name=App-freqtable
When submitting a bug or request, please include a test-file or a patch to an existing test-file that illustrates the bug or desired feature.