NAME
freqtable - Print frequency table of lines/words/characters/bytes/numbers
VERSION
This document describes version 0.010 of freqtable (from Perl distribution App-freqtable), released on 2025-08-03.
SYNOPSIS
% freqtable [OPTIONS] < INPUT
Sample input:
% cat input-lines.txt
one
one
two
three
four
five
five
five
six
seven
eight
eight
nine
% cat input-words.txt
one one two three four five five five six seven eight eight nine
% cat input-nums.txt
9.99 cents
9.99 dollars
9 cents
Modes
Display frequency table (by default: lines):
% freqtable input-lines.txt
3 five
2 eight
2 one
1 four
1 nine
1 seven
1 six
1 three
1 two
Display frequency table (words):
% freqtable -w input-words.txt
3 five
2 eight
2 one
1 four
1 nine
1 seven
1 six
1 three
1 two
Display frequency table (bytes, `-c`; use `-m` to count characters instead):
% freqtable -c input-words.txt
12
12 e
7 i
5 n
4 f
4 o
4 t
4 v
3 h
2 g
2 r
2 s
1
1 u
1 w
1 x
Display frequency table (numbers):
% freqtable -n input-nums.txt
2 9.99
1 9
Display frequency table (integers):
% freqtable -i input-nums.txt
3 9
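The `-n` and `-i` modes above can be roughly approximated with standard tools. This is only a sketch: it assumes the number is the first whitespace-delimited field of each line, and the helper names `as_numbers` and `as_integers` are hypothetical. How `freqtable` itself extracts the number may differ.

```shell
# Approximate -n and -i with awk, ASSUMING the number is the first field.
# as_numbers and as_integers are illustrative names, not part of freqtable.
as_numbers()  { awk '{ print $1 + 0 }'; }    # "9.99 cents" becomes 9.99
as_integers() { awk '{ print int($1) }'; }   # "9.99 cents" becomes 9

# Count the extracted values the same way as lines.
printf '9.99 cents\n9.99 dollars\n9 cents\n' | as_numbers  | sort | uniq -c | sort -nr
printf '9.99 cents\n9.99 dollars\n9 cents\n' | as_integers | sort | uniq -c | sort -nr
```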
Formatting the output line: omitting the frequency (-F option)
Don't display the frequencies:
% freqtable -F input-lines.txt
five
eight
one
four
nine
seven
six
three
two
Formatting the output line: showing the percentages (`--percent`, `-p` option)
The default is to show frequencies as numbers:
% freqtable input-lines.txt
3 five
...
You can display frequencies as percent instead:
% freqtable -p input-lines.txt
23.08% five
...
Specify another `-p` if you want to display frequencies as integers as well as percent:
% freqtable -pp input-lines.txt
3 23.08% five
...
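For comparison, the percentage column can be reproduced from `uniq -c` output with a short awk filter. This is a sketch under assumptions: `with_percent` is a hypothetical helper name, and it only handles single-word items because it reads the item from the second field.

```shell
# Add a percentage column to "COUNT ITEM" rows (e.g. from uniq -c).
# with_percent is an illustrative name; only the first word of each item is kept.
with_percent() {
    awk '{ count[NR] = $1; item[NR] = $2; total += $1 }
         END {
             for (i = 1; i <= NR; i++)
                 printf "%d %.2f%% %s\n", count[i], 100 * count[i] / total, item[i]
         }'
}

# Same 13 lines as input-lines.txt, one word per line.
printf '%s\n' one one two three four five five five six seven eight eight nine |
    sort | uniq -c | sort -k1,1nr -k2 | with_percent
```

With this input the first row reads `3 23.08% five` (3 out of 13 lines), which matches the `-pp` output above.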
Formatting the output line: custom formatting (`--format` option)
% freqtable --format '%04d: %s' input-lines.txt
0003: five
Filter by rank
Only display the top 3 ranks:
% freqtable input-lines.txt -r -3
% freqtable input-lines.txt -r 1-3
3 five
2 eight
2 one
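A rank range resembles selecting a range of rows from the sorted table. The sketch below assumes that "rank" simply means the row position in the output; the documentation does not spell out how ties are ranked. `rank_range` is a hypothetical helper, not part of `freqtable`.

```shell
# Keep rows M..N of an already-sorted table.
# ASSUMPTION: rank == row position. rank_range is an illustrative name.
rank_range() { sed -n "${1},${2}p"; }

printf '%s\n' one one two three four five five five six seven eight eight nine |
    sort | uniq -c | sort -k1,1nr -k2 | rank_range 1 3
```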
Sorting
Instead of the default sorting by frequency (descending order), if you specify `--sort-sub` (and optionally one or more `--sort-arg`) you can sort by the keys using one of the Sort::Sub::* subroutines. Examples:
# sort by keys, asciibetically
% freqtable input-lines.txt --sort-sub asciibetically
2 eight
3 five
1 four
1 nine
2 one
1 seven
1 six
1 three
1 two
# sort by keys, asciibetically (descending order)
% freqtable input-lines.txt --sort-sub 'asciibetically<r>'
1 two
1 three
1 six
1 seven
2 one
1 nine
1 four
3 five
2 eight
# sort by keys, randomly using Perl code (essentially, shuffling)
% freqtable input-lines.txt --sort-sub 'by_perl_code' --sort-arg 'code=int(rand()*3)-1'
3 five
1 three
2 eight
1 seven
2 one
1 six
1 nine
1 two
1 four
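Sorting by key in byte order is what `sort | uniq -c` produces when you skip the final numeric sort. The sketch below uses `LC_ALL=C` to force byte-wise comparison. `by_key` and `by_key_desc` are hypothetical helper names, and the result is only assumed to match `asciibetically` for plain ASCII keys.

```shell
# Count lines and leave them ordered by key instead of by frequency.
# LC_ALL=C forces byte-wise ("asciibetical") comparison regardless of locale.
# by_key and by_key_desc are illustrative names, not freqtable options.
by_key()      { LC_ALL=C sort    | uniq -c; }
by_key_desc() { LC_ALL=C sort -r | uniq -c; }

printf '%s\n' one one two three four five five five six seven eight eight nine | by_key
```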
Running table (`--output-every` option)
If you have streaming input, you can instruct `freqtable` to print the result periodically after a number of input lines/words/characters/bytes. You can also have it clear the terminal screen before every output (`--clear-before-output`).
% perl -MArray::Sample::WeightedRandom=sample_weighted_random_with_replacement \
-E'say sample_weighted_random_with_replacement(
[ ["a", 1], ["b", 2], ["c", 3], ["d",5] ], 1) while 1' | \
freqtable --output-every 10000 --clear --percent
Sample output:
45.43% d
27.28% c
18.20% b
9.10% a
DESCRIPTION
This utility counts the occurrences of lines (or words/characters) in the input, then displays each unique line along with its number of occurrences. You can also instruct it to only show lines that have a specified number of occurrences.
You can use the following Unix command to count occurrences of lines:
% sort input-lines.txt | uniq -c | sort -nr
and with a bit more work you can also use a combination of existing Unix commands to count occurrences of words/characters, as well as filter items that have a specified number of occurrences; freqtable basically offers convenience.
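As an illustration of the "bit more work" for words, here is a minimal sketch that uses only standard Unix tools. `count_words` is a hypothetical helper name, and splitting on whitespace is an assumption about what counts as a word.

```shell
# Count word occurrences with standard Unix tools only (no freqtable).
# count_words is an illustrative name; words are assumed to be whitespace-separated.
count_words() {
    tr -s '[:space:]' '\n' |   # put one word on each line
        sed '/^$/d' |          # drop the empty line that leading whitespace produces
        sort | uniq -c |       # group identical words and count them
        sort -k1,1nr -k2       # highest count first, ties ordered by word
}

printf 'one one two three four five five five six seven eight eight nine\n' | count_words
```

Counting characters works the same way if each character is first placed on its own line, for example with `fold -w 1`.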
EXIT CODES
0 on success.
255 on I/O error.
99 on command-line options error.
OPTIONS
--bytes, -c
--chars, -m
--words, -w
--lines, -l
--number, -n
Treat each line as a number. A line like this:
9.99 cents
will be regarded as:
9.99
--integer, -i
Treat each line as an integer. A line like this:
9.99 cents
will be regarded as:
9
--ignore-case, -f
--no-print-freq, -F
Do not print the frequencies.
--print-total, -t
Print the total line at the bottom.
--no-print-total, -T
Do not print the total line at the bottom (the default).
--rank=s, -r
Filter by rank. There are several ways you can do this:
`-N` to only display the top N ranks.
`N` to only display the N'th rank.
`M-N` to only display the M'th to N'th rank.
`M-` to only display the M'th rank and lower items.
--sort-sub=s
This will cause `freqtable` to sort by key name instead of by frequency. You pass this option to specify a Sort::Sub routine, which is the name of a `Sort::Sub::*` module without the `Sort::Sub::` prefix, e.g. `asciibetically`. The name can optionally be followed by `<i>`, `<r>`, or `<ir>` to mean case-insensitive sorting, reverse order, and reverse order case-insensitive sorting, respectively. When you use one of these suffixes on the command line, remember to quote it, since `<` and `>` can be interpreted by the shell.
Examples:
asciibetically
asciibetically<i>
by_length<r>
--sort-arg=ARGNAME=ARGVALUE
Pass argument(s) to the sort subroutine. Can be specified multiple times, once for every argument.
-a
Shortcut for `--sort-sub=asciibetically`.
--percent, -p
Show frequencies as percentages instead of integers. If you specify this option a second time, frequencies are shown as integers as well as percentages.
--format=s
Format frequency line using `sprintf()` template. `freqtable` will supply these arguments after the template: frequency integer, item string, and frequency as percent. For example:
%04d: %s # sample output: 0004: five
If you want to display the item first, you can use something like:
%2$-12s: %d
# sample output:
# five        : 3
# eight       : 2
--output-every=i
If set, output the "running" (current) frequency table after every specified number of input items (bytes/characters/words/lines).
--clear-before-output
Emit the ANSI escape code "\033[2J" before each output to clear the screen.
FAQ
HOMEPAGE
Please visit the project's homepage at https://metacpan.org/release/App-freqtable.
SOURCE
Source repository is at https://github.com/perlancar/perl-App-freqtable.
SEE ALSO
Unix commands wc, sort, uniq
wordstat from App::wordstat
csv-freqtable from App::CSVUtils
AUTHOR
perlancar <perlancar@cpan.org>
CONTRIBUTING
To contribute, you can send patches by email/via RT, or send pull requests on GitHub.
Most of the time, you don't need to build the distribution yourself. You can simply modify the code, then test via:
% prove -l
If you want to build the distribution (e.g. to try to install it locally on your system), you can install Dist::Zilla, Dist::Zilla::PluginBundle::Author::PERLANCAR, Pod::Weaver::PluginBundle::Author::PERLANCAR, and sometimes one or two other Dist::Zilla- and/or Pod::Weaver plugins. Any additional steps required beyond that are considered a bug and can be reported to me.
COPYRIGHT AND LICENSE
This software is copyright (c) 2025 by perlancar <perlancar@cpan.org>.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.
BUGS
Please report any bugs or feature requests on the bugtracker website https://rt.cpan.org/Public/Dist/Display.html?Name=App-freqtable
When submitting a bug or request, please include a test-file or a patch to an existing test-file that illustrates the bug or desired feature.