NAME

Data::Seek::Concepts - Data::Seek Concepts

VERSION

version 0.06

OVERVIEW

This document contains a simple overview of the strategy and syntax used by Data::Seek to query complex data strictures. The overall idea behind Data::Seek is to flatten/fold the data structure once, then reduce it by applying a series patterns.

FLATTENING

The first phase in the Data::Seek introspection strategy is to flatten the data structure using Hash::Flatten, producing a non-hierarchical data structure where it's keys represent endpoints within the structure.

Encoding

During the processing of flattening a data structure with nested data, the following data structure would be converted into a collection of endpoint/value pairs.

{
    'id' => 12345,
    'patient' => {
        'name' => {
            'first' => 'Bob',
            'last'  => 'Bee'
        }
    },
    'medications' => [{
        'aceInhibitors' => [{
            'name'      => 'lisinopril',
            'strength'  => '10 mg Tab',
            'dose'      => '1 tab',
            'route'     => 'PO',
            'sig'       => 'daily',
            'pillCount' => '#90',
            'refills'   => 'Refill 3'
        }],
        'antianginal' => [{
            'name'      => 'nitroglycerin',
            'strength'  => '0.4 mg Sublingual Tab',
            'dose'      => '1 tab',
            'route'     => 'SL',
            'sig'       => 'q15min PRN',
            'pillCount' => '#30',
            'refills'   => 'Refill 1'
        }],
    }]
}

Given the aforementioned data structure, the following would be the resulting flattened structure comprised of endpoint/value pairs.

{
    'id' => 12345,
    'medications:0.aceInhibitors:0.dose' => '1 tab',
    'medications:0.aceInhibitors:0.name' => 'lisinopril',
    'medications:0.aceInhibitors:0.pillCount' => '#90',
    'medications:0.aceInhibitors:0.refills' => 'Refill 3',
    'medications:0.aceInhibitors:0.route' => 'PO',
    'medications:0.aceInhibitors:0.sig' => 'daily',
    'medications:0.aceInhibitors:0.strength' => '10 mg Tab',
    'medications:0.antianginal:0.dose' => '1 tab',
    'medications:0.antianginal:0.name' => 'nitroglycerin',
    'medications:0.antianginal:0.pillCount' => '#30',
    'medications:0.antianginal:0.refills' => 'Refill 1',
    'medications:0.antianginal:0.route' => 'SL',
    'medications:0.antianginal:0.sig' => 'q15min PRN',
    'medications:0.antianginal:0.strength' => '0.4 mg Sublingual Tab',
    'patient.name.first' => 'Bob'
    'patient.name.last' => 'Bee',
}

This structure provides the endpoint strings which will be matched against using the querying strategy.

QUERYING

The second phase in the Data::Seek introspection strategy is to convert a criterion into a series of regular expressions to be sequentially applied, filtering/reducing the endpoints i.e. the keys of flatten data stricture using Data::Seek::Search, producing a data set of matching nodes or throwing an exception explaining the search failure.

Node Expression

id
patient
medications

The node expression is a criterion, or part of a criterion, which matches against a single node. It is a string which can contain letters, numbers, and/or underscores.

Step Expression

patient.name
patient.name.first
patient.name.last

The step expression is a criterion, or part of a criterion, made up of two or more node expressions separated using the period character, which matches against a nested nodes. It is a string which can contain letters, numbers, and/or underscores, separated using periods.

Index Expression

medications:0
medications:0.antianginal
medications:0.antianginal:0.name

The index expression is a criterion, or part of a criterion, having a node expressions suffixed with a semi-colon followed by a number denoting that it should only match an array which has an index corresponding to the numeric portion of the suffix. It is a string which can contain letters, numbers, and/or underscores, suffixed with a semi-colon followed by a number.

Iterator Expression

medications.@
medications.@.antianginal
medications.@.antianginal.@.name

The iteration expression is a criterion, or part of a criterion, having a node expressions immediately followed in-step with an "at" character (ampersand) serving as a succeeding node expression suffixed denoting that it should match all elements of all matching arrays. It is a string which can contain letters, numbers, and/or underscores, followed in-step with a node expression whose string is a single ampersand character.

Wildcard Expression

*
*.*.first
*.*.first
patient.*.first
patient.*.last

The wildcard expression is a criterion, or part of a criterion, which matches against a single node having a single "star" character match and represent one or more non-period characters. It is a string which can contain letters, numbers, underscores, and/or a single star character.

Greedy-Wildcard Expression

**
patient.**
*.@.**

The greedy-wildcard expression is a criterion, or part of a criterion, which matches against any multitude of nodes having a double "star" character match and represent zero or more of any character. It is a string which can contain letters, numbers, underscores, and/or a double star character.

AUTHOR

Al Newkirk <anewkirk@ana.io>

COPYRIGHT AND LICENSE

This software is copyright (c) 2014 by Al Newkirk.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.