NAME
count - Counting utility for files consisting of a fixed number of fields, such as CSV
VERSION
version v0.0.2
SYNOPSIS
count -h
count --help
count [-g|--group <columns>] [-c|--count] [-s|--sum <columns>] [--min <columns>] [--max <columns>] [--avg|--ave <columns>] [-m|--map <map>] [-M|--map-file <filename>] [-t|--delimiter <delimiter>] files...
# show brief instruction
count -h
# show POD
count --help
# count the number of records, grouping by columns 1 and 2
# column numbers are 1-origin
count -g 1,2 file
# compute the sum of column 3, grouping by columns 1 and 2
# the field delimiter is ','
count -g 1 -g 2 -s 3 -t ',' file
# output the min, max and average of column 2 and column 3, grouping by column 1
count -g 1 --min 2 --max 2 --avg 2 --min 3 --max 3 --avg 3
DESCRIPTION
I have written one-liners like the following over and over again to gather some statistics.
perl -e 'while(<>) { @t = split /\t/; ++$c{$t[0]}; } foreach my $k (keys %c) { print "$k,$c{$k}\n" }'
Yes, we can shorten it as follows by making use of command-line options.
perl -an -F"\t" -e '++$c{$F[0]}; END { foreach my $k (keys %c) { print "$k,$c{$k}\n" } }'
This is still verbose for what it does. With this script, you can write the following instead. Please NOTE that column numbers are 1-origin.
count -g 1 -t "\t"
Conforming to the Unix philosophy, this script does NOT have configurable sort functionality. If you want sorted output, pipe the result through the sort command.
count -g 1 -t "\t" | sort -k 1n
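For readers who want to see what the grouping amounts to, the following is a minimal Python sketch of the same counting logic as count -g 1 -t "\t". It is not the script's implementation: the function name count_groups and the sample records are hypothetical, and the output order (order of first appearance) is an assumption, since the script itself makes no promise about ordering.

```python
from collections import Counter

def count_groups(lines, group_columns, delimiter="\t"):
    """Count records per group key; group_columns are 1-origin column numbers."""
    counts = Counter()
    for line in lines:
        fields = line.rstrip("\n").split(delimiter)
        # Convert 1-origin column numbers to 0-origin list indexes.
        key = tuple(fields[c - 1] for c in group_columns)
        counts[key] += 1
    return counts

# Hypothetical sample input: three tab-separated records.
records = ["a\tx\t1\n", "b\ty\t2\n", "a\tz\t3\n"]
for key, n in count_groups(records, [1]).items():
    # Print the group key fields followed by the record count.
    print("\t".join(key + (str(n),)))
```

With the sample records, group "a" is counted twice and group "b" once. Passing [1, 2] as group_columns would correspond to -g 1,2.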
OPTIONS
-h
Show brief instruction.
--help
Show this POD.
-g|--group <columns>
Specify the group columns, like GROUP BY in SQL. This option can be given multiple times and/or as comma-separated numbers.
-c|--count
Output the number of records. If no other output option is specified, the script behaves as if this option were specified.
-s|--sum <columns>
Output the sum of the specified column. This option can be given multiple times and/or as comma-separated numbers.
--min <columns>
Output the minimum value of the specified column. This option can be given multiple times and/or as comma-separated numbers.
--max <columns>
Output the maximum value of the specified column. This option can be given multiple times and/or as comma-separated numbers.
--avg|--ave <columns>
Output the average of the specified column. This option can be given multiple times and/or as comma-separated numbers.
-m|--map <map>
Output the mapped value of the specified column, using the specified mapping key. The argument is a list of keys and columns, like:
-m 0,class,1,subclass
-M|--map-file <filename>
Specify the map file used by the -m option. The map file is a YAML file with the following structure.
<key1>:
  <number11>: <value11>
  <number12>: <value12>
<key2>:
  <number21>: <value21>
  <number22>: <value22>
-t|--delimiter <delimiter>
Specify the field separator character. The character is used for both input and output.
files...
Input files. If no files are specified, read from STDIN.
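The column options above combine much like aggregate functions with GROUP BY in SQL. The Python sketch below is one way to picture how repeated and comma-separated column arguments and the --sum, --min, --max and --avg outputs fit together. It is an illustration under assumptions, not the script's code: parse_columns and aggregate are hypothetical names, values are treated as floats, and the shape of the returned result is invented for the example.

```python
from collections import defaultdict

def parse_columns(specs):
    """Flatten specs such as ["1,2", "3"] into [1, 2, 3] (1-origin numbers)."""
    return [int(n) for spec in specs for n in spec.split(",")]

def aggregate(lines, group_specs, value_specs, delimiter=","):
    """Return {group key: {column: (sum, min, max, avg)}} for the value columns."""
    group_columns = parse_columns(group_specs)
    value_columns = parse_columns(value_specs)
    # values[group key][column] collects every value seen for that column.
    values = defaultdict(lambda: defaultdict(list))
    for line in lines:
        fields = line.rstrip("\n").split(delimiter)
        key = tuple(fields[c - 1] for c in group_columns)
        for c in value_columns:
            values[key][c].append(float(fields[c - 1]))
    return {
        key: {c: (sum(v), min(v), max(v), sum(v) / len(v)) for c, v in cols.items()}
        for key, cols in values.items()
    }

# Hypothetical sample input, comparable to:
#   count -g 1 -s 3 --min 3 --max 3 --avg 3 -t ','
records = ["a,x,1\n", "b,y,2\n", "a,z,3\n"]
for key, cols in aggregate(records, ["1"], ["3"]).items():
    print(key, cols)
```

Here ["1,2"] and ["1", "2"] parse to the same column list, which mirrors the statement that a column option can be given multiple times and/or as comma-separated numbers.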
AUTHOR
Yasutaka ATARASHI <yakex@cpan.org>
COPYRIGHT AND LICENSE
This software is copyright (c) 2013 by Yasutaka ATARASHI.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.