NAME

ETL::Pipeline::Input::DelimitedText - Input source for CSV, tab, or pipe delimited files

SYNOPSIS

use ETL::Pipeline;
ETL::Pipeline->new( {
  input   => ['DelimitedText', iname => qr/\.csv$/i],
  mapping => {First => 'Header1', Second => 'Header2'},
  output  => ['UnitTest']
} )->process;

DESCRIPTION

ETL::Pipeline::Input::DelimitedText defines an input source for reading CSV (comma separated values), tab delimited, or pipe delimited files. It uses Text::CSV for parsing.

ETL::Pipeline::Input::DelimitedText expects a standard CSV file. Many hand built exporters omit quote marks, use invalid characters, or don't escape the quotes. If you have trouble with a file, experiment with the options to Text::CSV.
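One way to experiment is to feed a problem line straight to Text::CSV, outside of the pipeline. The sketch below is only an illustration: the sample line is made up, and allow_loose_quotes is just one of the Text::CSV options that may help with a given file.

```perl
use strict;
use warnings;
use Text::CSV;

# Hypothetical problem line: an unquoted field that contains quote marks.
my $line = '1,foo "bar" baz,3';

# One parser with default quoting rules, one with a relaxed option.
my $strict = Text::CSV->new( { binary => 1 } );
my $loose  = Text::CSV->new( { binary => 1, allow_loose_quotes => 1 } );

# Try the same line with both parsers.
my $strict_ok = $strict->parse( $line );
my $loose_ok  = $loose->parse( $line );
my @fields    = $loose_ok ? $loose->fields : ();

# Report what each parser made of the line.
print 'default: ',
  ($strict_ok ? 'parsed' : 'failed, ' . $strict->error_diag), "\n";
print 'allow_loose_quotes: ',
  ($loose_ok ? join( ' / ', @fields ) : 'failed, ' . $loose->error_diag), "\n";
```

Once an option parses the file cleanly, pass the same option to the input source.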

METHODS & ATTRIBUTES

Arguments for "input" in ETL::Pipeline

ETL::Pipeline::Input::DelimitedText consumes the ETL::Pipeline::Input::File and ETL::Pipeline::Input::File::Table roles. It supports all of the attributes from these two.

In addition, ETL::Pipeline::Input::DelimitedText uses the options from Text::CSV. See Text::CSV for a list.

# Pipe delimited, allowing embedded newlines.
$etl->input( 'DelimitedText',
  iname => qr/\.dat$/i,
  sep_char => '|',
  binary => 1
);
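
A tab delimited file follows the same pattern. This is a sketch that assumes $etl is an existing ETL::Pipeline object, as in the example above; the .tsv extension is only an illustration.

```perl
# Tab delimited. The file name pattern is hypothetical.
$etl->input( 'DelimitedText',
  iname    => qr/\.tsv$/i,
  sep_char => "\t"
);
```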

skipping

Optional. If you use a code reference for skipping, this input source passes it each line as plain text. The text is not parsed into fields. I assume that you're skipping report headers, not formatted data.

If you pass an integer, the input source completely skips over that many lines. It reads and discards the lines without parsing.
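
For the integer form, a minimal sketch might look like this. It assumes $etl is an existing ETL::Pipeline object and that the file begins with a three line report title; both are made up for the example.

```perl
# Discard the first three lines before reading any data.
$etl->input( 'DelimitedText',
  iname    => qr/\.csv$/i,
  skipping => 3
);
```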

Methods

run

This is the main loop. It opens the file, reads records, and closes it when done. This is the place to look if there are problems.

ETL::Pipeline automatically calls this method.
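
When debugging, it can help to reproduce the open, read, close cycle with Text::CSV alone. The following is a rough sketch of that cycle, not the module's actual code. It reads made up pipe delimited data from an in-memory file handle so that it runs without any files.

```perl
use strict;
use warnings;
use Text::CSV;

# Sample data held in memory so the sketch is self-contained.
my $data = "Header1|Header2\nalpha|beta\ngamma|delta\n";

my $csv = Text::CSV->new( { binary => 1, sep_char => '|' } )
  or die 'Cannot create parser: ' . Text::CSV->error_diag;

# Open, read every record, then close.
open my $handle, '<', \$data or die "Cannot open data: $!";
my @records;
while (my $row = $csv->getline( $handle )) {
  push @records, $row;    # Each record is an array reference of fields.
}
close $handle;

printf "%d records read\n", scalar @records;
```

If this loop stops early on a real file, the same Text::CSV options are the first thing to check.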

SEE ALSO

ETL::Pipeline, ETL::Pipeline::Input, ETL::Pipeline::Input::File, ETL::Pipeline::Input::File::Table, Text::CSV

AUTHOR

Robert Wohlfarth <robert.j.wohlfarth@vumc.org>

LICENSE

Copyright 2021 (c) Vanderbilt University Medical Center

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.