NAME
csvgrep - search for patterns in a CSV and display results in a table
SYNOPSIS
csvgrep <pattern> <file>
csvgrep -d <directory> <pattern>
DESCRIPTION
csvgrep searches a CSV file for a pattern and displays the matching rows in a text table. The first line of the CSV is assumed to be a header row.
The simplest usage is to look for a word in a CSV:
% csvgrep Murakami books.csv
+-------------------+-----------------+-------+------+
| Book              | Author          | Pages | Date |
+-------------------+-----------------+-------+------+
| Norwegian Wood    | Haruki Murakami | 400   | 1987 |
| Men without Women | Haruki Murakami | 228   | 2017 |
+-------------------+-----------------+-------+------+
As with regular grep, you can use the -i switch to make it case-insensitive:
% csvgrep -i wood books.csv
+-----------------------+-----------------+-------+------+
| Book                  | Author          | Pages | Date |
+-----------------------+-----------------+-------+------+
| Norwegian Wood        | Haruki Murakami | 400   | 1987 |
| A Walk in the Woods   | Bill Bryson     | 276   | 1997 |
| Death Walks the Woods | Cyril Hare      | 222   | 1954 |
+-----------------------+-----------------+-------+------+
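The whole-line matching shown so far can be sketched in a few lines of Python. This is only an illustration of the behaviour described above, not csvgrep's own (Perl) code: the names grep_csv, render_table and SAMPLE are made up for the example, the sample rows are taken from the tables in this document, and Python's regular expressions differ from Perl's in some details.

```python
import csv
import re

# Sample rows copied from the example tables in this document.
SAMPLE = """\
Book,Author,Pages,Date
Norwegian Wood,Haruki Murakami,400,1987
A Walk in the Woods,Bill Bryson,276,1997
Death Walks the Woods,Cyril Hare,222,1954
Men without Women,Haruki Murakami,228,2017
Frankenstein,Mary Shelley,280,1818
"""


def grep_csv(text, pattern, ignore_case=False):
    """Return (header, rows), where rows are the data lines matching pattern."""
    regex = re.compile(pattern, re.IGNORECASE if ignore_case else 0)
    lines = text.splitlines()
    # The first line is treated as the header row and is never searched.
    header = next(csv.reader(lines[:1]))
    # Test each raw line, then parse only the lines that matched.
    # (Simplification: quoted fields spanning several lines are not handled.)
    matched = [line for line in lines[1:] if regex.search(line)]
    return header, list(csv.reader(matched))


def render_table(header, rows):
    """Format header and rows as a bordered, left-aligned text table."""
    widths = [max(len(cell) for cell in column) for column in zip(header, *rows)]
    border = "+" + "+".join("-" * (w + 2) for w in widths) + "+"

    def fmt(row):
        return "| " + " | ".join(c.ljust(w) for c, w in zip(row, widths)) + " |"

    return "\n".join([border, fmt(header), border, *map(fmt, rows), border])


if __name__ == "__main__":
    # Roughly what "csvgrep -i wood books.csv" does.
    print(render_table(*grep_csv(SAMPLE, "wood", ignore_case=True)))
```

Matching the raw line before parsing it mirrors the "whole line" default, and keeps the header out of the search entirely.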
You can specify a subset of the columns to display with the -c option, which takes a comma-separated list of column numbers:
% csvgrep -c 0,1,3 -i mary books.csv
+--------------+--------------+------+
| Book         | Author       | Date |
+--------------+--------------+------+
| Mary Poppins | PL Travers   | 1934 |
| Frankenstein | Mary Shelley | 1818 |
+--------------+--------------+------+
You can also refer to columns by their titles with the -c option:
% csvgrep -c book,date -i mary books.csv
+--------------+------+
| Book         | Date |
+--------------+------+
| Mary Poppins | 1934 |
| Frankenstein | 1818 |
+--------------+------+
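As a rough illustration of how a column spec like the ones above might be turned into column positions, here is a small Python sketch. The function names resolve_columns and select are hypothetical, and the case-insensitive title matching is an assumption inferred from "book" selecting the "Book" column in the example; csvgrep's real rules may differ.

```python
def resolve_columns(spec, header):
    """Turn a spec such as "0,1,3" or "book,date" into a list of column indices."""
    lowered = [name.lower() for name in header]
    indices = []
    for item in spec.split(","):
        item = item.strip()
        if item.isdecimal():
            index = int(item)  # numeric entries count from 0
        else:
            # Assumption: titles are compared case-insensitively.
            index = lowered.index(item.lower())  # ValueError if no such title
        if index >= len(header):
            raise ValueError(f"no column {index}")
        indices.append(index)
    return indices


def select(row, indices):
    """Keep only the requested columns of one row, in the requested order."""
    return [row[i] for i in indices]
```

Resolving the spec once against the header, and then applying the resulting indices to every row, means names and numbers can be mixed freely in this sketch.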
By default the pattern will be matched against the whole line, but you can use --match-column or -mc to specify that the pattern should only be matched against a specific column:
% csvgrep -mc 0 -c 0,1,3 -i mary books.csv
+--------------+--------------+------+
| Book         | Author       | Date |
+--------------+--------------+------+
| Mary Poppins | PL Travers   | 1934 |
+--------------+--------------+------+
The number of the match column refers to the numbering of the full set of columns, regardless of whether you've used the -c option. This means you can match against a column that you're not displaying.
You can also use the column header with the -mc option:
% csvgrep -mc author -i mary books.csv
+--------------+--------------+-------+------+
| Book         | Author       | Pages | Date |
+--------------+--------------+-------+------+
| Frankenstein | Mary Shelley | 280   | 1818 |
+--------------+--------------+-------+------+
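The difference between whole-line matching and matching a single column can be sketched as follows. This is an illustrative Python model, not csvgrep's implementation: grep_column and its parameters are hypothetical names, and the sample is a three-column cut (Book, Author, Date) of the rows shown above, so the Date column is index 2 here rather than 3.

```python
import csv
import re

# Three-column cut of the books table used in the examples above.
SAMPLE = """\
Book,Author,Date
Mary Poppins,PL Travers,1934
Frankenstein,Mary Shelley,1818
"""


def grep_column(text, pattern, match_column=None, show=None, ignore_case=False):
    """Return (header, rows) for the rows whose line, or chosen column, matches."""
    regex = re.compile(pattern, re.IGNORECASE if ignore_case else 0)
    lines = text.splitlines()
    header = next(csv.reader(lines[:1]))
    show = list(range(len(header))) if show is None else show
    hits = []
    for line in lines[1:]:
        if not line.strip():
            continue  # skip blank lines
        row = next(csv.reader([line]))
        # match_column indexes the full row, before display columns are picked,
        # so a column can be searched without being displayed.
        target = line if match_column is None else row[match_column]
        if regex.search(target):
            hits.append([row[i] for i in show])
    return [header[i] for i in show], hits
```

With this sketch, grep_column(SAMPLE, "mary", match_column=1, show=[0, 2], ignore_case=True) searches the Author column while displaying only Book and Date.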
The pattern can be a Perl regexp, but you'll probably need to quote it to protect it from your shell:
% csvgrep -i 'walk.*wood' books.csv
+-----------------------+-------------+-------+------+
| Book                  | Author      | Pages | Date |
+-----------------------+-------------+-------+------+
| A Walk in the Woods   | Bill Bryson | 276   | 1997 |
| Death Walks the Woods | Cyril Hare  | 222   | 1954 |
+-----------------------+-------------+-------+------+
At work we have a number of situations where a directory contains multiple versions of a particular CSV file, for example a feed from a customer. With the -d option, csvgrep will look at the most recent file in the specified directory, only considering files with a .csv or .tsv extension:
% csvgrep -d /usr/local/feeds/users -i smith
If you want to look at 2 files back, you can use the --back 2 option, or the shorthand version, -2:
% csvgrep -d /usr/local/feeds/users -2 -i smith
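The file selection behind -d and --back can be modelled with a short Python sketch. It is a guess at the behaviour from the description, not csvgrep's code: pick_feed_file is a hypothetical name, and it assumes that files are ordered by modification time and that "N back" means skipping the N newest candidates (so 0 is the newest file).

```python
import os


def pick_feed_file(directory, back=0):
    """Return the path of the .csv/.tsv file that is `back` places behind the newest."""
    candidates = [
        entry.path
        for entry in os.scandir(directory)
        if entry.is_file() and entry.name.endswith((".csv", ".tsv"))
    ]
    # Newest first, by modification time.
    candidates.sort(key=os.path.getmtime, reverse=True)
    # Assumption: back=0 is the newest file, back=1 the one before it, and so on.
    if not 0 <= back < len(candidates):
        raise FileNotFoundError(f"no file {back} back in {directory}")
    return candidates[back]
```

Files with other extensions are ignored even if they are newer, which matches the restriction to .csv and .tsv files described above.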
I have various aliases defined, like this:
alias tg="csvgrep -d .../file.csv -c 0,1,2 -i"
So then I can just run:
tg smith
This is a script I've used internally, with features being added as I wanted them. Let me know if you've ideas for additional features, or send me a pull request.
Tab-Separated Values
TSV files are pretty common; they use a tab character instead of a comma. If the filename ends with .tsv rather than .csv, we'll set the field separator to be a tab character:
% csvgrep -i norwegian ~/books.tsv
+----------------+-----------------+-------+------+
| Book           | Author          | Pages | Date |
+----------------+-----------------+-------+------+
| Norwegian Wood | Haruki Murakami | 400   | 1987 |
+----------------+-----------------+-------+------+
This also applies to the -d option.
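Choosing the separator from the file extension is simple enough to sketch in Python. The names field_separator and read_rows are hypothetical, and the force_tab parameter stands in for the -t switch; this shows the idea rather than how csvgrep does it internally.

```python
import csv
import os


def field_separator(filename, force_tab=False):
    """Pick the delimiter: tab for .tsv names (or when forced), comma otherwise."""
    extension = os.path.splitext(filename)[1]
    return "\t" if force_tab or extension == ".tsv" else ","


def read_rows(filename, lines):
    """Parse already-read lines using the separator implied by the file name."""
    return list(csv.reader(lines, delimiter=field_separator(filename)))
```

Because the decision depends only on the file name, the same check works for a file picked automatically from a directory.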
OPTIONS
- -c <column-spec>
  A comma-separated list of the columns you want displayed, given as column numbers (the first column being 0) or column titles.
- -d <directory-path>
  Search the most recently modified .csv or .tsv file in the specified directory.
- --back <N> | -<N>
  Go N back in the list of files, when using the -d option.
- -h
  Display a short help message.
- -i
  Case-insensitive grep.
- --match-column <column> | -mc <column>
  Only search the specified column, which can be given by the column's name or index (starting at 0).
- -t
  Use TAB as the field separator. This is selected automatically for files with a .tsv extension.
REPOSITORY
https://github.com/neilb/csvgrep
AUTHOR
Neil Bowers <neilb@cpan.org>
COPYRIGHT AND LICENSE
This software is copyright (c) 2017 by Neil Bowers <neilb@cpan.org>.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.