NAME

Catmandu::Importer::Breaker - Package that imports the Breaker format

SYNOPSIS

# Using the default breaker
$ catmandu convert JSON to Breaker < data.json

# Using an OAI_DC breaker
$ catmandu convert OAI --url http://biblio.ugent.be/oai to Breaker --handler oai_dc

# Using a MARCXML breaker
$ catmandu convert MARC to Breaker --handler marc

# Using an XML breaker
$ catmandu convert XML --path book to Breaker --handler xml < t/book.xml > data.breaker

# Find the usage of fields in the XML file above
$ cat data.breaker | cut -f 2 | sort | uniq -c

# Convert the Breaker format by line into JSON
$ catmandu convert Breaker < data.breaker

# Convert the Breaker format by record into JSON
$ catmandu convert Breaker --record 1 < data.breaker

DESCRIPTION

The Breaker format is inspired by the article "Metadata Analysis at the Command-Line" by Mark Phillips (http://journal.code4lib.org/articles/7818). Catmandu::Exporter::Breaker breaks metadata records into this format, which can be analyzed further with command-line tools. This importer reads the Breaker format back in.

CONFIGURATION

file

Read input from a local file given by its path. Alternatively a scalar reference can be passed to read from a string.

fh

Read input from an IO::Handle. If not specified, Catmandu::Util::io is used to create the input stream from the file argument or by using STDIN.

encoding

Binmode of the input stream fh. Set to :utf8 by default.

fix

An ARRAY of one or more fixes or file scripts to be applied to imported items.

record

Set to a true value to join all values of a record into arrays. See BREAKER OUTPUT FORMAT.

BREAKER INPUT FORMAT

<record-identifier><tab><metadata-field><tab><metadata-value>
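
As a rough illustration of this layout, the sketch below builds three tab-separated lines in a shell variable and counts how often each metadata field occurs, mirroring the cut/sort/uniq pipeline from the SYNOPSIS. The record identifiers, field names and values are invented sample data, not output from a real breaker handler.

```shell
# Hypothetical sample data: identifier, field and value separated by tabs.
sample=$(printf '1\ttitle\tProgramming Perl\n1\tauthor\tLarry Wall\n2\ttitle\tLearning Perl\n')

# Take column 2 (the metadata field), count occurrences, print as name=count.
field_counts=$(printf '%s\n' "$sample" | cut -f 2 | sort | uniq -c | awk '{print $2 "=" $1}')

echo "$field_counts"
```

With real data, the same pipeline can be run on a file such as data.breaker instead of the sample variable.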

BREAKER OUTPUT FORMAT

The breaker format is parsed into a Hash containing 4 fields:

_id:   the identifier of the record
field: the full name (plus namespace) of a field
tag:   the name of a field
data:  the content of the field

When the record option is set to a true value, the field, namespace, tag and data fields each contain an array of all values found in the record.
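
The difference between reading by line and reading by record can be pictured with plain shell tools. The sketch below is only an illustration of grouping lines by record identifier; it is not how the importer is implemented, and the sample data and the output layout are assumptions made for this example.

```shell
# Hypothetical sample data: identifier, field and value separated by tabs.
sample=$(printf '1\ttitle\tProgramming Perl\n1\tauthor\tLarry Wall\n2\ttitle\tLearning Perl\n')

# Collect all field names per record identifier, keeping first-seen order.
records=$(printf '%s\n' "$sample" | awk -F '\t' '
  {
    if (!($1 in seen)) { order[++n] = $1; seen[$1] = 1 }
    fields[$1] = fields[$1] (fields[$1] == "" ? "" : ",") $2
  }
  END { for (i = 1; i <= n; i++) print order[i] ": " fields[order[i]] }')

echo "$records"
```

Each output line corresponds to one record and lists its fields together, which is the same idea as the arrays produced when the record option is set.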

SEE ALSO

Catmandu::Exporter::Breaker