NAME
count - Counting utility for files consisting of a fixed number of fields, like CSV
VERSION
version v0.0.2
SYNOPSIS
count -h
count --help
count [-g|--group <columns>] [-c|--count] [-s|--sum <columns>] [--min <columns>] [--max <columns>] [--avg|--ave <columns>] [-m|--map <map>] [-M|--map-file <filename>] [-t|--delimiter <delimiter>] files...
# show brief instruction
count -h
# show POD
count --help
# count the number of records, grouped by columns 1 and 2
# Column numbers are 1-origin
count -g 1,2 file
# compute the sum of column 3, grouped by columns 1 and 2
# the field delimiter is ','
count -g 1 -g 2 -s 3 -t ',' file
# Output the min, max, and average of column 2 and column 3, grouped by column 1
count -g 1 --min 2 --max 2 --avg 2 --min 3 --max 3 --avg 3
DESCRIPTION
I have written one-liners like the following over and over again to compute simple statistics.
perl -e 'while(<>) { @t = split /\t/; ++$c{$t[0]}; } foreach my $k (keys %c) { print "$k,$c{$k}\n" }'
Yes, this can be shortened as follows by making use of command-line options.
perl -an -F"\t" -e '++$c{$F[0]}; END { foreach my $k (keys %c) { print "$k,$c{$k}\n" } }'
This is still verbose compared with what it actually does. With this script, you can write the following instead. Please NOTE that column numbers are 1-origin.
count -g 1 -t "\t"
Conforming to the Unix philosophy, this script does NOT have configurable sort functionality. If you need sorted output, pipe it to the sort command.
count -g 1 -t "\t" | sort -k 1,1n
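To make the grouping concrete, here is a minimal Python sketch of what counting records per group key amounts to. It is an illustration only, not the script's implementation: the function name `count_by_group` and the sorted output order are assumptions made for this example.

```python
from collections import Counter

def count_by_group(lines, group_columns, delimiter="\t"):
    """Count records per group key. Column numbers are 1-origin, as in count."""
    counts = Counter()
    for line in lines:
        fields = line.rstrip("\n").split(delimiter)
        # Convert 1-origin column numbers to 0-based list indices.
        key = tuple(fields[c - 1] for c in group_columns)
        counts[key] += 1
    return counts

# Hypothetical sample records: three tab-separated fields each.
records = ["a\tx\t1\n", "b\ty\t2\n", "a\tz\t3\n"]
for key, n in sorted(count_by_group(records, [1]).items()):
    # Write the result with the same delimiter that was used for input.
    print("\t".join(key + (str(n),)))
```

Sorting is done here only to make the sample output stable; the script itself leaves ordering to an external sort.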
OPTIONS
-h
Show brief instruction.
--help
Show this POD.
-g|--group <columns>
Specify the group columns, like GROUP BY in SQL. This option can be given multiple times and/or as comma-separated numbers.
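The two spellings shown in the SYNOPSIS, `-g 1,2` and `-g 1 -g 2`, name the same columns. A minimal Python sketch of how repeated, comma-separated specifications could be flattened into one list follows; `parse_columns` is a hypothetical helper for illustration, not a function of the script.

```python
def parse_columns(specs):
    """Flatten repeated, comma-separated column specs into a list of ints."""
    columns = []
    for spec in specs:
        # Each occurrence of the option may itself hold several numbers.
        columns.extend(int(part) for part in spec.split(","))
    return columns

# "-g 1,2" and "-g 1 -g 2" would arrive as these two argument lists.
print(parse_columns(["1,2"]))
print(parse_columns(["1", "2"]))
```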
-c|--count
Output the number of records. If no other output option is specified, the script behaves as if this option were given.
-s|--sum <columns>
Output the sum of the specified columns. This option can be given multiple times and/or as comma-separated numbers.
--min <columns>
Output the minimum value of the specified columns. This option can be given multiple times and/or as comma-separated numbers.
--max <columns>
Output the maximum value of the specified columns. This option can be given multiple times and/or as comma-separated numbers.
--avg|--ave <columns>
Output the average of the specified columns. This option can be given multiple times and/or as comma-separated numbers.
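The sum, minimum, maximum, and average are all computed per group. The following Python sketch shows that idea for a single value column. It is an illustration under assumptions, not the script's code: the name `aggregate` is hypothetical, and values are assumed to be numeric and are parsed as floats.

```python
from collections import defaultdict

def aggregate(rows, group_columns, value_column):
    """Return {group key: (sum, min, max, average)} for one value column."""
    values = defaultdict(list)
    for fields in rows:
        # Column numbers are 1-origin, so subtract 1 for list indexing.
        key = tuple(fields[c - 1] for c in group_columns)
        # Assumption: every value in the column parses as a number.
        values[key].append(float(fields[value_column - 1]))
    return {
        key: (sum(v), min(v), max(v), sum(v) / len(v))
        for key, v in values.items()
    }

# Hypothetical records, already split into fields.
rows = [["a", "1"], ["a", "3"], ["b", "10"]]
for key, stats in aggregate(rows, [1], 2).items():
    print(key, stats)
```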
-m|--map <map>
Output the mapped value of the specified column, looked up with the specified mapping key. The argument is a list of keys and columns, like the following.
-m 0,class,1,subclass
-M|--map-file <filename>
Specify the map file used by the -m option. The map file is a YAML file with the following structure.
<key1>:
  <number11>: <value11>
  <number12>: <value12>
<key2>:
  <number21>: <value21>
  <number22>: <value22>
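The two-level layout above corresponds to a nested dictionary: a mapping key selects an inner table, and a number selects a value in it. The Python sketch below only illustrates that lookup. The map contents are invented sample data, `lookup` is a hypothetical helper, and falling back to the number itself for an unmapped entry is an assumption, not documented behavior. Reading a real map file would additionally require a YAML parser, which is omitted here.

```python
# Hypothetical map contents, mirroring the <key>: <number>: <value> layout.
mapping = {
    "class": {1: "mammal", 2: "bird"},
    "subclass": {1: "land", 2: "sea"},
}

def lookup(mapping, key, number):
    """Return the mapped value, or the number as text if it is unmapped."""
    # Assumption: unmapped numbers are passed through unchanged.
    return mapping.get(key, {}).get(number, str(number))

print(lookup(mapping, "class", 2))
print(lookup(mapping, "subclass", 3))
```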
-t|--delimiter <delimiter>
Specify the field separator character. The same character is used for both input and output.
files...
Input files. If no files are specified, input is read from STDIN.
AUTHOR
Yasutaka ATARASHI <yakex@cpan.org>
COPYRIGHT AND LICENSE
This software is copyright (c) 2013 by Yasutaka ATARASHI.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.