NAME

fsql - Perform SQL queries against files in CSV/TSV/LTSV/JSON/YAML formats

VERSION

This document describes version 0.230 of fsql (from Perl distribution App-fsql), released on 2019-01-22.

SYNOPSIS

fsql [OPTIONS] [ <QUERY> | --show-schema|-s ]

DESCRIPTION

fsql lets you perform SQL queries against one or several "flat" files of various formats. Each file will be regarded as a SQL table. By using SQL queries, you can do various calculations or manipulations that are otherwise hard/cumbersome to do with traditional text-manipulating Unix commands like cut, sort, head, tail, uniq, and so on. Particularly: data grouping, joining, or filtering with SQL expressions and functions.

As a bonus, you can also modify data (currently CSV only) via SQL INSERT or DELETE commands.

The query result will then be printed out, in one of several available formats.

The magic of all this is performed by DBD::CSV and SQL::Statement.

To use fsql, you must at least specify one file/table (with --add or one of the --add-TYPE options). If none of those options are specified, a table is assumed in STDIN with name stdin and the format will be detected. You must also specify a SQL query.

EXAMPLES

Filter CSV (table from stdin is aptly named so):

% prog-that-produces-csv | fsql 'SELECT id,name FROM stdin WHERE id <= 1000' > final.csv

Pick output format, produce array of hashes instead of the default array of arrays:

% fsql -a ~/book.pl 'SELECT title,name FROM book WHERE year >= 2010' --aoh -f json

You can perform joins, of course:

% fsql -a t.json -a 2.csv:t2 'SELECT * FROM t1 LEFT JOIN t2 ON t1.uid=t2.id'

Show schema:

% fsql -a table1.json -a 2.csv:table2 -s

Insert row to CSV:

% fsql -a file.csv 'INSERT INTO file VALUES (1, 2, 3)'

Delete rows from CSV:

% fsql -a file.csv 'DELETE FROM file WHERE c1 < 10'

Input is CSV or TSV without header row (columns will be named 'col0', 'col1', and so on):

% fsql -a file.csv:header=0:table=t 'SELECT col1,col3 FROM t WHERE col0 <= 1000' > final.csv

EXIT CODES

0 on success.

255 on I/O or SQL error.

99 on command-line options or input data error.

OPTIONS

  • --add=FILENAME[:TABLENAME|OPT1=VAL1:...], -a

    Add a table from a file. Type will be detected from filename extension (and some heuristics, if there is no file extension or extension is unrecognized). Die if type cannot be detected.

    Sometimes the detection will miss. Alternatively, you can use one of the --add-TYPE options to add a specific table type.

    If TABLENAME is not specified, it will be taken from FILENAME (e.g. with filename foo-bar.csv, table name will be foo_bar). Will croak if duplicate table name is detected. Table name must match regex /\A[A-Za-z_][A-Za-z_0-9]*\z/.

    FILENAME can be - to mean the standard input (the default table name will be stdin).

    Alternatively, you can specify table name as well as other option using the OPT1=VAL1:... syntax. Known options include:

    + table=TABLENAME

    Specify table name.

    + header=BOOL

    Default is 1. If set to 0, specifies that CSV/TSV file does not contain header row that contains column names. Column names will be col0, col1, col2, and so on. Has no effect on file types other than CSV/TSV.

  • --add-csv=FILENAME[:TABLENAME|OPT1=VAL1:...]

    Add a table from a CSV file.

  • --add-tsv=FILENAME[:TABLENAME|OPT1=VAL1:...]

    Like --add-csv, but will load file as TSV (tab-separated value).

  • --add-ltsv=FILENAME[:TABLENAME|OPT1=VAL1:...]

    Like --add-csv, but will load file as LTSV (labeled tab separated value, see Text::LTSV). Names of columns will be taken from the first row.

  • --add-json=FILENAME[:TABLENAME|OPT1=VAL1:...]

    Like --add-csv, but will load file as JSON.

    Data can be array, or array of arrays, or array of hashes, or an enveloped response (see Rinci::function), so it is suitable to accept piped output of Perinci::CmdLine-based programs.

  • --add-yaml=FILENAME[:TABLENAME|OPT1=VAL1:...]

    Like --add-json, but will load file as YAML.

  • --add-perl=FILENAME[:TABLENAME|OPT1=VAL1:...]

    Like --add-json, but will load file as Perl.

  • --load-function=NAME, -F

    Load a SQL function. This will load Perl module SQL::Statement::Function::ByName::NAME. See CPAN for list of available modules.

    If you use NAME:NEWNAME, you can load a SQL function as another name, for example:

    -F DAYOFYEAR:DOY
  • --define-function=NAME:CODE, -D

    Define a SQL function. You need to specify name as well as perl code that implements it. Perl code will be wrapped in a subroutine, it should expect the function argument in $_[2] (for more details see SQL::Statement::Functions). Example:

    --define-function 'MONTH:$_[2] =~ /^\d{4}-(\d{2})-/ or return undef; $1+0'
  • --aoa

    Return array of array (the default). Only relevant to outputs like perl, json, yaml, text.

  • --aoh

    Return array of hashes instead of the default array of array, where each row is represented as a hash (dictionary/associated array) instead of an array. Only relevant to output formats like perl, json, yaml, text.

    Returning a hash is convenient when you want column name information on each row, but you can't specify the same column twice and order of columns are not guaranteed.

  • --format=FORMAT (default: text), -f

    Set output format.

    The value csv or tsv or ltsv will cause query results to be output as a comma-separated or TAB-separated list or labeled-TAB separated list, respectively. As this isn't very useful for a schema listing, these values will be silently converted to text if --show-schema (-s) is also present.

    The other values perl, json, yaml, and text will be formatted using appropriate Data::Format::Pretty formatter.

    The default value is the most used table format. So if your tables are mostly CSV, fsql will also output CSV by default.

  • --show-schema, -s

    Instead of running a query, show schema instead. This is useful for debugging.

FAQ

What SQL dialect is supported? Why is SQL feature "X" not supported?

See SQL::Statement::Syntax for the range of supported SQL syntax. In short, you can do select with several kinds of joins, almost/around 100 builtin functions (with additional functions available from Perl modules, see next FAQ entry), etc.

Also, sometimes if there is SQL parsing error, the error message is not immediately obvious and can be a bit confusing. That is usually a parsing limitation/bug within SQL::Statement.

How do I define more SQL functions? Why aren't there more date/time/X functions?

SQL::Statement allows loading Perl functions as SQL functions. There are several CPAN modules (see the SQL::Statement::Function::ByName:: namespace) which nicely package them so you can load them from fsql simply by using the -F option, e.g. to load the YEAR() function:

% fsql -F YEAR --add-csv sometable.csv 'SELECT c1, YEAR(c2) FROM sometable ...'

ENVIRONMENT

DEBUG => bool

If set to true, print debugging messages.

DEBUG_KEEP_TEMPDIR => bool

If set to true, will not cleanup tempdir.

HOMEPAGE

Please visit the project's homepage at https://metacpan.org/release/App-fsql.

SOURCE

Source repository is at https://github.com/perlancar/perl-App-fsql.

BUGS

Please report any bugs or feature requests on the bugtracker website https://rt.cpan.org/Public/Dist/Display.html?Name=App-fsql

When submitting a bug or request, please include a test-file or a patch to an existing test-file that illustrates the bug or desired feature.

AUTHOR

perlancar <perlancar@cpan.org>

COPYRIGHT AND LICENSE

This software is copyright (c) 2019, 2016, 2015, 2014 by perlancar@cpan.org.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.