NAME

count - Counting utility for files with a fixed number of fields, such as CSV

VERSION

version v0.0.1

SYNOPSIS

count -h

count [-g|--group <columns>] [-c|--count] [-s|--sum <columns>] [--min <columns>] [--max <columns>] [--avg|--ave <columns>] [-m|--map <map>] [-M|--map-file <filename>] [-t|--delimiter <delimiter>] files...

# show POD
count -h

# count the number of records, grouped by columns 1 and 2
# column numbers are 1-based
count -g 1,2 file

# sum column 3, grouped by columns 1 and 2
# the field delimiter is ','
count -g 1 -g 2 -s 3 -t ',' file

# output the min, max, and average of columns 2 and 3, grouped by column 1
count -g 1 --min 2 --max 2 --avg 2 --min 3 --max 3 --avg 3

DESCRIPTION

I have written one-liners like the following over and over again to compute simple statistics.

perl -e 'while(<>) { @t = split /\t/; ++$c{$t[0]}; } foreach my $k (keys %c) { print "$k,$c{$k}\n" }'

Admittedly, this can be shortened by using Perl's command-line options.

perl -an -F"\t" -e '++$c{$F[0]}; END { foreach my $k (keys %c) { print "$k,$c{$k}\n" } }'

This is still verbose for what it does. With this script, you can write the following instead. Please NOTE that column numbers are 1-based.

count -g 1 -t "\t"

In keeping with the Unix philosophy, this script does NOT provide configurable sorting. If you need sorted output, pipe it to the sort command.

count -g 1 -t "\t" | sort -k 1,1n

OPTIONS

-g|--group <columns>

Specify the columns to group by, like GROUP BY in SQL. This option can be given multiple times and/or with comma-separated column numbers.

-c|--count

Output the number of records. If no other output option is specified, the script behaves as if this option were given.

-s|--sum <columns>

Output the sum of each specified column. This option can be given multiple times and/or with comma-separated column numbers.

--min <columns>

Output the minimum value of each specified column. This option can be given multiple times and/or with comma-separated column numbers.

--max <columns>

Output the maximum value of each specified column. This option can be given multiple times and/or with comma-separated column numbers.

--avg|--ave <columns>

Output the average of each specified column. This option can be given multiple times and/or with comma-separated column numbers.

-m|--map <map>

Output the mapped value of the specified column, looked up under the specified mapping key. The argument is a comma-separated list of key and column pairs, as in the following example.

-m 0,class,1,subclass

-M|--map-file <filename>

Specify the map file used by the -m option. The map file is a YAML file with the following structure.

<key1>:
  <number11>: <value11>
  <number12>: <value12>
<key2>:
  <number21>: <value21>
  <number22>: <value22>
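As a concrete instance of that structure, the sketch below writes a small map file from the shell. The keys class and subclass are borrowed from the -m example. The numbers, the values, and the variable name map_file are invented for illustration.

```shell
# Write a hypothetical map file; all numbers and values below are invented.
map_file=$(mktemp)
cat > "$map_file" <<'EOF'
class:
  1: mammal
  2: bird
subclass:
  1: primate
  2: raptor
EOF
cat "$map_file"
# The file could then be passed with -M, for example (not run here):
#   count -g 1 -m 0,class,1,subclass -M "$map_file" file
```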

-t|--delimiter <delimiter>

Specify the field separator character. It is used for both input and output.

files...

Input files. If no files are specified, read from STDIN.

AUTHOR

Yasutaka ATARASHI <yakex@cpan.org>

COPYRIGHT AND LICENSE

This software is copyright (c) 2013 by Yasutaka ATARASHI.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.