NAME

Text::CSV::Unicode - comma-separated values manipulation routines with potentially wide character data

SYNOPSIS

use Text::CSV::Unicode;

$csv = Text::CSV::Unicode->new( { binary => 1 } );

# then use methods from Text::CSV::Base (= Text::CSV 0.01)

$version = Text::CSV::Unicode->version();   # get the module version

$csv = Text::CSV::Unicode->new();           # create a new object

$status = $csv->combine(@columns);          # combine columns into a string
$line = $csv->string();                     # get the combined string

$status = $csv->parse($line);               # parse a CSV string into fields
@columns = $csv->fields();                  # get the parsed fields

$status = $csv->status();                   # get the most recent status
$bad_argument = $csv->error_input();        # get the most recent bad argument

DESCRIPTION

Text::CSV::Unicode provides facilities for the composition and decomposition of comma-separated values, based on Text::CSV 0.01. Text::CSV::Unicode allows for input with wide character data.

An instance of the Text::CSV::Unicode class can combine fields into a CSV string and parse a CSV string into fields.

FUNCTIONS

version
$version = Text::CSV::Unicode->version();

This function may be called as a class or an object method. It returns the current module version.

new
$csv = Text::CSV::Unicode->new( [{ binary => 1 }] );

This function may be called as a class or an object method. It returns a reference to a newly created Text::CSV::Unicode object. With binary => 0, the object accepts the same ASCII input as Text::CSV along with other non-control characters; with binary => 1, it accepts all Unicode characters in the input (including \r and \n).
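As a minimal sketch of the binary option, the following round-trips a few fields containing wide characters through combine() and parse(). It assumes Text::CSV::Unicode is installed; the sample data and the variable names are illustrative only.

```perl
use strict;
use warnings;
use Text::CSV::Unicode;

# binary => 1 is documented to accept every Unicode character in a field.
my $csv = Text::CSV::Unicode->new( { binary => 1 } );

# Illustrative data: fields containing characters above U+00FF.
my @columns = ( "\x{263A} smile", "\x{3b1}\x{3b2}\x{3b3}", 'plain' );

$csv->combine(@columns) or die "combine() failed\n";
my $line = $csv->string;

$csv->parse($line) or die "parse() failed\n";
my @round_trip = $csv->fields;

# Encode on output so wide characters print without warnings.
binmode STDOUT, ':encoding(UTF-8)';
print "$_\n" for @round_trip;
```

The fields that come back from parse() should match the ones passed to combine(), whatever quoting the intermediate string uses.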

combine
$status = $csv->combine(@columns);

This object function constructs a CSV string from the arguments, returning success or failure. Failure can result from lack of arguments or an argument containing an invalid character. Upon success, string() can be called to retrieve the resultant CSV string. Upon failure, the value returned by string() is undefined and error_input() can be called to retrieve an invalid argument.

undef values in the input are silently accepted and treated as empty strings.
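The handling of undef can be checked with a short sketch like the one below. It assumes the module is installed; the column values are made up for illustration, and the exact quoting of the combined string is deliberately not relied upon.

```perl
use strict;
use warnings;
use Text::CSV::Unicode;

my $csv = Text::CSV::Unicode->new;

# The second column is undef; combine() is documented to treat it as ''.
my @columns = ( 'first', undef, 'last' );

$csv->combine(@columns) or die "combine() failed\n";
my $line = $csv->string;

# Parse the combined string back to inspect what the undef became.
$csv->parse($line) or die "parse() failed\n";
my @fields = $csv->fields;

printf "%d fields, second field is %s\n",
    scalar @fields,
    ( length $fields[1] ? 'non-empty' : 'empty' );
```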

string
$line = $csv->string();

This object function returns the input to parse() or the resultant CSV string of combine(), whichever was called more recently.

parse
$status = $csv->parse($line);

This object function decomposes a CSV string into fields, returning success or failure. Failure can result from a lack of argument or from an improperly formatted CSV string. Upon success, fields() can be called to retrieve the decomposed fields. Upon failure, the value returned by fields() is undefined and error_input() can be called to retrieve the invalid argument.

fields
@columns = $csv->fields();

This object function returns the input to combine() or the resultant decomposed fields of parse(), whichever was called more recently.
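Because string() and fields() each report either an input or a result, depending on which of parse() and combine() ran last, a small sketch may help. It assumes the module is installed and uses plain ASCII sample data chosen for illustration.

```perl
use strict;
use warnings;
use Text::CSV::Unicode;

my $csv = Text::CSV::Unicode->new;

# After parse(): string() is the input, fields() is the result.
my $line = 'alpha,beta,gamma';
$csv->parse($line) or die "parse() failed\n";
my $string_after_parse = $csv->string;
my @parsed             = $csv->fields;

# After combine(): fields() is the input, string() is the result.
my @columns = ( 'one', 'two' );
$csv->combine(@columns) or die "combine() failed\n";
my @fields_after_combine = $csv->fields;
my $combined             = $csv->string;

print "parse input:    $string_after_parse\n";
print "parsed fields:  @parsed\n";
print "combine input:  @fields_after_combine\n";
print "combined line:  $combined\n";
```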

status
$status = $csv->status();

This object function returns success (or failure) of combine() or parse(), whichever was called more recently.

error_input
$bad_argument = $csv->error_input();

This object function returns the erroneous argument (if it exists) of combine() or parse(), whichever was called more recently.
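A failure path can be sketched as follows. The sample line has a quoted field with no closing double-quote, which is assumed here to count as improperly formatted; the module is assumed to be installed.

```perl
use strict;
use warnings;
use Text::CSV::Unicode;

my $csv = Text::CSV::Unicode->new;

# Assumption: a missing closing double-quote makes the line malformed.
my $bad_line = 'abc,"unterminated';

my $ok = $csv->parse($bad_line);

if ($ok) {
    print "parsed: ", join( ' | ', $csv->fields ), "\n";
}
else {
    # status() repeats the result of the most recent parse() or combine().
    print "status: ", ( $csv->status ? 'success' : 'failure' ), "\n";
    # error_input() returns the argument that was rejected.
    print "rejected input: ", $csv->error_input, "\n";
}
```

Checking the return value of parse() directly is usually enough; status() is useful when the result has to be consulted later, away from the call site.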

SUBROUTINES/METHODS

None

DIAGNOSTICS

None

CONFIGURATION AND ENVIRONMENT

See HASH option to ->new.

DEPENDENCIES

perl 5.8.0

INCOMPATIBILITIES

None

BUGS AND LIMITATIONS

As slow as Text::CSV 0.01.

Cannot change separators and delimiters.

EXAMPLE

    require Text::CSV::Unicode;

    my $csv = Text::CSV::Unicode->new;

    # Parse a CSV string and print each field with its position.
    my $sample_input_string = '"I said, ""Hi!""",Yes,"",2.34,,"1.09"';
    if ($csv->parse($sample_input_string)) {
        my @field = $csv->fields;
        my $count = 0;
        for my $column (@field) {
            print ++$count, " => ", $column, "\n";
        }
        print "\n";
    }
    else {
        my $err = $csv->error_input;
        print "parse() failed on argument: ", $err, "\n";
    }

    # Combine a list of fields into a single CSV string.
    my @sample_input_fields = (
        'You said, "Hello!"',
        5.67,
        'Surely',
        '',
        '3.14159',
    );
    if ($csv->combine(@sample_input_fields)) {
        my $string = $csv->string;
        print $string, "\n";
    }
    else {
        my $err = $csv->error_input;
        print "combine() failed on argument: ", $err, "\n";
    }

CAVEATS

This module is based upon a working definition of CSV format which may not be the most general.

  1. With binary => 1, all Unicode characters are allowed within a CSV field; otherwise control characters are not allowed, with the exception of the tab character.

  2. A field within CSV may be surrounded by double-quotes.

  3. A field within CSV must be surrounded by double-quotes to contain a comma.

  4. A field within CSV must be surrounded by double-quotes to contain an embedded double-quote, represented by a pair of consecutive double-quotes.

  5. Line-ending characters are handled as part of the data.
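Rules 2 to 4 above can be illustrated with a small standalone sketch. The quote_field helper is hypothetical and is not part of Text::CSV::Unicode; it quotes a field only when the rules require it, whereas the module itself may quote fields more liberally.

```perl
use strict;
use warnings;

# Hypothetical helper, not the module's implementation: quote a field
# only when rules 3 and 4 make quoting mandatory.
sub quote_field {
    my ($field) = @_;

    # Rule 2: quoting is optional when no comma or double-quote is present.
    return $field unless $field =~ /[",]/;

    # Rule 4: an embedded double-quote becomes a pair of double-quotes.
    ( my $escaped = $field ) =~ s/"/""/g;

    # Rules 3 and 4: the field must be surrounded by double-quotes.
    return qq{"$escaped"};
}

my @fields = ( 'plain', 'a,b', 'say "hi"' );
my $line   = join ',', map { quote_field($_) } @fields;

print "$line\n";    # plain,"a,b","say ""hi"""
```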

VERSION

0.115

AUTHOR

Robin Barker <rmbarker@cpan.org>

SEE ALSO

Text::CSV 0.01

LICENSE AND COPYRIGHT

Copyright (c) 2007, 2008, 2010, 2011, 2012 Robin Barker. All rights reserved.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

The documentation of Text::CSV::Unicode methods that are inherited from Text::CSV::Base is taken from Text::CSV 0.01 (with some reformatting) and is Copyright (c) 1997 Alan Citterman.