NAME
Catmandu::Importer::Breaker - Package that imports the Breaker format
SYNOPSIS
# Using the default breaker
$ catmandu convert JSON to Breaker < data.json
# Using a OAI_DC breaker
$ catmandu convert OAI --url http://biblio.ugent.be/oai to Breaker --handler oai_dc
# Using a MARCXML breaker
$ catmandu convert MARC to Breaker --handler marc
# Using an XML breaker
$ catmandu convert XML --path book to Breaker --handler xml < t/book.xml > data.breaker
# Find the usage of fields in the XML file above
$ cat data.breaker | cut -f 2 | sort | uniq -c
# Convert the Breaker format by line into JSON
$ catmandu convert Breaker < data.breaker
# Convert the Breaker format by record into JSON
$ catmandu convert Breaker --record 1 < data.breaker
DESCRIPTION
Inspired by the article "Metadata Analysis at the Command-Line" by Mark Phillips (http://journal.code4lib.org/articles/7818), the Breaker exporter breaks metadata records into the Breaker format, which can be analyzed further with command line tools. This importer reads that format back in.
CONFIGURATION
- file
  Read input from a local file given by its path. Alternatively a scalar reference can be passed to read from a string.
- fh
  Read input from an IO::Handle. If not specified, Catmandu::Util::io is used to create the input stream from the file argument or by using STDIN.
- encoding
  Binmode of the input stream fh. Set to :utf8 by default.
- fix
  An ARRAY of one or more fixes or file scripts to be applied to imported items.
- record
  Set to a true value to join all record values in an array. See BREAKER OUTPUT FORMAT.
BREAKER INPUT FORMAT
<record-identifier><tab><metadata-field><tab><metadata-value>
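To make the three tab-separated columns concrete, here is a minimal sketch that builds a tiny Breaker sample and counts how often each field occurs, using only standard tools. The record identifiers, field names and values are made up for illustration and do not come from any real data set.

```shell
# Hypothetical sample: one line per value, columns are
# <record-identifier><tab><metadata-field><tab><metadata-value>.
sample=$(printf '%s\t%s\t%s\n' \
  rec1 title 'First book' \
  rec1 creator 'Doe, Jane' \
  rec2 title 'Second book')

# Count the occurrences of each field name (column 2).
field_counts=$(printf '%s\n' "$sample" | cut -f 2 | sort | uniq -c | awk '{print $2 "=" $1}')
printf '%s\n' "$field_counts"
```

This is the same cut, sort and uniq pipeline as in the SYNOPSIS, applied to an inline sample instead of data.breaker.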
BREAKER OUTPUT FORMAT
The breaker format is parsed into a Hash containing 4 fields:
_id: the identifier of the record
field: the full name (plus namespace) of a field
tag: the name of a field
data: the content of the field
When the record option is set to a true value, the field, namespace, tag and data will contain an array of all values in a record.
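The idea of joining all values that belong to one record can be illustrated with awk. This is only a sketch of grouping Breaker lines by record identifier; it does not reproduce the importer's actual output structure, and the sample data and the " | " separator are assumptions chosen for illustration.

```shell
# Hypothetical sample with a repeated field in the first record.
sample=$(printf '%s\t%s\t%s\n' \
  rec1 title 'First book' \
  rec1 subject 'History' \
  rec1 subject 'Maps' \
  rec2 title 'Second book')

# Collect every value (column 3) per record identifier (column 1),
# keeping records in the order they first appear.
grouped=$(printf '%s\n' "$sample" | awk -F '\t' '
  {
    if ($1 in values) values[$1] = values[$1] " | " $3
    else { order[++n] = $1; values[$1] = $3 }
  }
  END { for (i = 1; i <= n; i++) print order[i] ": " values[order[i]] }')
printf '%s\n' "$grouped"
```

Each output line then holds one record identifier followed by all of its values, which mirrors the difference between reading the file by line and reading it by record.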