NAME
Data::Seek::Concepts - Data::Seek Concepts
VERSION
version 0.06
OVERVIEW
This document contains a simple overview of the strategy and syntax used by Data::Seek to query complex data structures. The overall idea behind Data::Seek is to flatten/fold the data structure once, then reduce it by applying a series of patterns.
FLATTENING
The first phase in the Data::Seek introspection strategy is to flatten the data structure using Hash::Flatten, producing a non-hierarchical data structure whose keys represent endpoints within the structure.
Encoding
During the process of flattening, a data structure with nested data, such as the following, is converted into a collection of endpoint/value pairs.
{
    'id' => 12345,
    'patient' => {
        'name' => {
            'first' => 'Bob',
            'last' => 'Bee'
        }
    },
    'medications' => [{
        'aceInhibitors' => [{
            'name' => 'lisinopril',
            'strength' => '10 mg Tab',
            'dose' => '1 tab',
            'route' => 'PO',
            'sig' => 'daily',
            'pillCount' => '#90',
            'refills' => 'Refill 3'
        }],
        'antianginal' => [{
            'name' => 'nitroglycerin',
            'strength' => '0.4 mg Sublingual Tab',
            'dose' => '1 tab',
            'route' => 'SL',
            'sig' => 'q15min PRN',
            'pillCount' => '#30',
            'refills' => 'Refill 1'
        }],
    }]
}
Given the data structure above, the following is the resulting flattened structure, composed of endpoint/value pairs.
{
    'id' => 12345,
    'medications:0.aceInhibitors:0.dose' => '1 tab',
    'medications:0.aceInhibitors:0.name' => 'lisinopril',
    'medications:0.aceInhibitors:0.pillCount' => '#90',
    'medications:0.aceInhibitors:0.refills' => 'Refill 3',
    'medications:0.aceInhibitors:0.route' => 'PO',
    'medications:0.aceInhibitors:0.sig' => 'daily',
    'medications:0.aceInhibitors:0.strength' => '10 mg Tab',
    'medications:0.antianginal:0.dose' => '1 tab',
    'medications:0.antianginal:0.name' => 'nitroglycerin',
    'medications:0.antianginal:0.pillCount' => '#30',
    'medications:0.antianginal:0.refills' => 'Refill 1',
    'medications:0.antianginal:0.route' => 'SL',
    'medications:0.antianginal:0.sig' => 'q15min PRN',
    'medications:0.antianginal:0.strength' => '0.4 mg Sublingual Tab',
    'patient.name.first' => 'Bob',
    'patient.name.last' => 'Bee',
}
This structure provides the endpoint strings that the querying strategy matches against.
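To make the encoding concrete, here is a minimal pure-Perl sketch of the flattening step. It is not Hash::Flatten's implementation; flatten_sketch is a hypothetical helper written for illustration, and it assumes only what the example above shows: hash keys are joined with a period, array indices with a colon.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Data::Seek or Hash::Flatten): walk a nested
# structure and record one endpoint/value pair per leaf value.
sub flatten_sketch {
    my ($node, $prefix, $flat) = @_;
    $flat //= {};

    if (ref $node eq 'HASH') {
        for my $key (sort keys %$node) {
            # hash keys are joined to their parent with a period
            my $path = defined $prefix ? "$prefix.$key" : $key;
            flatten_sketch($node->{$key}, $path, $flat);
        }
    }
    elsif (ref $node eq 'ARRAY') {
        for my $i (0 .. $#$node) {
            # array indices are joined to their parent with a colon
            my $path = defined $prefix ? "$prefix:$i" : ":$i";
            flatten_sketch($node->[$i], $path, $flat);
        }
    }
    else {
        # a leaf: the accumulated path is the endpoint
        $flat->{$prefix} = $node;
    }

    return $flat;
}

# A trimmed-down version of the example structure.
my $flat = flatten_sketch({
    id          => 12345,
    patient     => { name => { first => 'Bob', last => 'Bee' } },
    medications => [{ antianginal => [{ name => 'nitroglycerin' }] }],
});

print "$_ => $flat->{$_}\n" for sort keys %$flat;
```

Note that this sketch silently drops empty hashes and arrays, since they contain no leaf values; a real flattener may treat them differently.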
QUERYING
The second phase in the Data::Seek introspection strategy is to convert a criterion into a series of regular expressions which are applied sequentially, filtering/reducing the endpoints (i.e. the keys of the flattened data structure) using Data::Seek::Search. The result is either a data set of matching nodes or an exception explaining the search failure.
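The filter-then-fail behaviour described above can be sketched in a few lines of plain Perl. This is an illustration only, not the code in Data::Seek::Search: reduce_endpoints is a hypothetical helper, and the patterns passed to it are hand-written rather than derived from a criterion.

```perl
use strict;
use warnings;

# A few endpoints taken from the flattened example.
my @endpoints = (
    'id',
    'medications:0.aceInhibitors:0.name',
    'medications:0.antianginal:0.name',
    'patient.name.first',
    'patient.name.last',
);

# Hypothetical helper: apply each pattern in turn, keeping only the endpoints
# that still match, and die as soon as nothing is left.
sub reduce_endpoints {
    my ($patterns, @keys) = @_;
    for my $re (@$patterns) {
        @keys = grep { $_ =~ $re } @keys;
        die "no endpoint matches $re\n" unless @keys;
    }
    return @keys;
}

# Two patterns applied one after the other narrow the set to a single endpoint.
my @found = reduce_endpoints([qr/^patient\./, qr/\.last\z/], @endpoints);
print "$_\n" for @found;

# A search that cannot be satisfied raises an exception instead of returning.
my @none = eval { reduce_endpoints([qr/^patient\./, qr/\.middle\z/], @endpoints) };
print "search failed: $@" if $@;
```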
Node Expression
id
patient
medications
The node expression is a criterion, or part of a criterion, which matches against a single node. It is a string which can contain letters, numbers, and/or underscores.
Step Expression
patient.name
patient.name.first
patient.name.last
The step expression is a criterion, or part of a criterion, made up of two or more node expressions separated using the period character, which matches against nested nodes. It is a string which can contain letters, numbers, and/or underscores, separated using periods.
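As a rough sketch of how node and step expressions could be turned into a pattern over the flattened endpoints: step_regex below is a hypothetical helper, and it assumes that a criterion selects the named endpoint plus everything nested beneath it. The endpoint 'patients.count' is invented here purely to show that matching stops at a node boundary.

```perl
use strict;
use warnings;

my @endpoints = ('id', 'patient.name.first', 'patient.name.last', 'patients.count');

# Hypothetical helper: validate a node or step expression, then build a regex
# that matches the endpoint itself or any endpoint nested beneath it.
sub step_regex {
    my ($criterion) = @_;
    die "not a node or step expression: $criterion\n"
        unless $criterion =~ /^\w+(?:\.\w+)*\z/;
    my $quoted = quotemeta $criterion;    # periods become literal "\."
    return qr/^$quoted(?=[.:]|\z)/;       # must end at a node boundary
}

for my $criterion ('id', 'patient', 'patient.name.last') {
    my $re   = step_regex($criterion);
    my @hits = grep { $_ =~ $re } @endpoints;
    print "$criterion => @hits\n";
}
```

The lookahead keeps 'patient' from matching 'patients.count', which a plain prefix match would wrongly accept.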
Index Expression
medications:0
medications:0.antianginal
medications:0.antianginal:0.name
The index expression is a criterion, or part of a criterion, having a node expression suffixed with a colon followed by a number, denoting that it should only match an array which has an index corresponding to the numeric portion of the suffix. It is a string which can contain letters, numbers, and/or underscores, suffixed with a colon followed by a number.
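To show what an index expression denotes, the sketch below resolves one by walking the original nested data directly. Data::Seek itself matches against the flattened endpoints instead, so this is only a way to read the syntax; resolve_index_expression is a hypothetical helper.

```perl
use strict;
use warnings;

my $data = {
    medications => [{
        antianginal => [{ name => 'nitroglycerin', route => 'SL' }],
    }],
};

# Hypothetical helper: split the criterion into steps, then follow each node
# name (a hash key) and optional ":N" suffix (an array index) into the data.
sub resolve_index_expression {
    my ($node, $criterion) = @_;
    for my $step (split /\./, $criterion) {
        my ($name, $index) = $step =~ /^(\w+)(?::(\d+))?\z/
            or die "bad step: $step\n";
        die "no such node: $name\n"
            unless ref $node eq 'HASH' && exists $node->{$name};
        $node = $node->{$name};
        next unless defined $index;
        die "$name has no element at index $index\n"
            unless ref $node eq 'ARRAY' && $index <= $#$node;
        $node = $node->[$index];
    }
    return $node;
}

print resolve_index_expression($data, 'medications:0.antianginal:0.name'), "\n";
```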
Iterator Expression
medications.@
medications.@.antianginal
medications.@.antianginal.@.name
The iterator expression is a criterion, or part of a criterion, having a node expression immediately followed in-step by an "at" character (@), which serves as a succeeding node expression denoting that it should match all elements of all matching arrays. It is a string which can contain letters, numbers, and/or underscores, followed in-step by a node expression whose string is a single "at" character.
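One plausible way to apply an iterator expression to the flattened endpoints is to rewrite each "@" step as "a colon followed by any index". The sketch below assumes that translation; iterator_regex is a hypothetical helper, not the module's own code, and the second aceInhibitors entry is invented so that the iteration has more than one element to match.

```perl
use strict;
use warnings;

my @endpoints = (
    'medications:0.aceInhibitors:0.name',
    'medications:0.aceInhibitors:1.name',    # invented second element
    'medications:0.antianginal:0.name',
    'patient.name.first',
);

# Hypothetical translation: node names are matched literally, while an "@"
# step attaches ":<any index>" to the node that precedes it.
sub iterator_regex {
    my ($criterion) = @_;
    my $pattern = '';
    for my $step (split /\./, $criterion) {
        if ($step eq '@') {
            $pattern .= ':\d+';
        }
        else {
            $pattern .= '\.' if length $pattern;
            $pattern .= quotemeta $step;
        }
    }
    return qr/^$pattern(?=[.:]|\z)/;
}

my $re   = iterator_regex('medications.@.aceInhibitors.@.name');
my @hits = grep { $_ =~ $re } @endpoints;
print "$_\n" for @hits;
```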
Wildcard Expression
*
*.*.first
patient.*.first
patient.*.last
The wildcard expression is a criterion, or part of a criterion, which matches against a single node, where a single "star" character matches and represents one or more non-period characters. It is a string which can contain letters, numbers, underscores, and/or a single star character.
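The "one or more non-period characters" rule maps naturally onto the character class [^.]+. The following sketch assumes that mapping and the same prefix-style matching used for nested nodes; wildcard_regex is a hypothetical helper for illustration.

```perl
use strict;
use warnings;

my @endpoints = ('id', 'patient.name.first', 'patient.name.last');

# Hypothetical translation: each step is quoted literally, then every star
# is replaced by "one or more non-period characters".
sub wildcard_regex {
    my ($criterion) = @_;
    my $pattern = join '\.', map {
        my $step = quotemeta $_;       # "*" becomes "\*"
        $step =~ s/\\\*/[^.]+/g;       # "\*" becomes "[^.]+"
        $step;
    } split /\./, $criterion;
    return qr/^$pattern(?=[.:]|\z)/;
}

for my $criterion ('*.*.first', 'patient.*.last', 'patient.*') {
    my $re   = wildcard_regex($criterion);
    my @hits = grep { $_ =~ $re } @endpoints;
    print "$criterion => @hits\n";
}
```

Because the class excludes the period, a single star can never span more than one step of the endpoint.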
Greedy-Wildcard Expression
**
patient.**
*.@.**
The greedy-wildcard expression is a criterion, or part of a criterion, which matches against any number of nodes, where a double "star" character matches and represents zero or more of any character. It is a string which can contain letters, numbers, underscores, and/or a double star character.
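Since the double star stands for zero or more of any character, it can be sketched as the regex fragment ".*", which, unlike the single star, is free to cross periods and colons. greedy_regex below is a hypothetical helper that handles only the double star; it is not taken from Data::Seek::Search.

```perl
use strict;
use warnings;

my @endpoints = (
    'id',
    'medications:0.antianginal:0.name',
    'patient.name.first',
    'patient.name.last',
);

# Hypothetical translation: quote the whole criterion, then replace each
# quoted double star with "zero or more of any character".
sub greedy_regex {
    my ($criterion) = @_;
    my $pattern = quotemeta $criterion;    # "patient.**" becomes "patient\.\*\*"
    $pattern =~ s/\\\*\\\*/.*/g;           # "\*\*" becomes ".*"
    return qr/^$pattern/;
}

my @under_patient = grep { $_ =~ greedy_regex('patient.**') } @endpoints;
my @everything    = grep { $_ =~ greedy_regex('**') } @endpoints;

print "patient.** => @under_patient\n";
print "** matches ", scalar(@everything), " of ", scalar(@endpoints), " endpoints\n";
```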
AUTHOR
Al Newkirk <anewkirk@ana.io>
COPYRIGHT AND LICENSE
This software is copyright (c) 2014 by Al Newkirk.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.