NAME

DDC::Query::Parser - extendable full-text index using mysql: high-level query parser

SYNOPSIS

##========================================================================
## PRELIMINARIES

use DDC::Query::Parser;

##========================================================================
## Constructors etc.

$qp    = $CLASS_OR_OBJ->new(%args);
$undef = $qp->free();                ##-- explicit destruction REQUIRED!

##========================================================================
## API: High-level Parsing

$undef = $qp->reset();
$query_or_undef = $qp->parse(@query_strings);

##========================================================================
## API: Mid-level: Query Generation API

$q = $parser->newQuery(@args);

##========================================================================
## API: Low-level: Lexer/Parser Connection, Error Reporting, etc.

\&yylex_sub   = $qp->_yylex_sub();
\&yyerror_sub = $qp->_yyerror_sub();
$errorString  = $qp->setError($errorCode,\%userMacros);

DESCRIPTION

DDC::Query::Parser is a high-level parser for user queries expressed in the DDC query language. For low-level parsing, it uses a native Perl scanner (DDC::Query::yylexer) and a parser generated by Parse::Yapp (DDC::Query::yyparser).

Constructors etc.

new
$qp = $CLASS_OR_OBJ->new(%args);

Constructor. NOTE: you should probably call free() before destroying the returned object to be safe.

Object structure / known %args:

{
 ##-- Status flags
 errstr => $current_errstr, ##-- false indicates no error

 ##-- Underlying lexer/parser pair
 lexer   => $yylexer,   ##-- a DDC::Query::yylexer object
 parser  => $yyparser,  ##-- a DDC::Query::yyparser object
 yydebug => $mask,      ##-- yydebug value

 ##-- Closures
 yylex    => \&yylex,   ##-- yapp-friendly lexer sub
 yyerror  => \&yyerror, ##-- yapp-friendly error sub
}

free
$undef = $qp->free();

Performs required pre-destruction cleanup (trims circular references, etc.). In particular, it clears $qp itself as well as $qp->{parser}{USER}, which leaves $qp unusable but destroyable.
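
Why explicit cleanup matters can be seen from Perl's reference counting: an object that stores a closure which in turn captures the object forms a cycle that is not reclaimed automatically. The following is a minimal standalone sketch of that pattern. The class ToyParser and its fields are hypothetical and are not part of DDC::Query::Parser; the real free() may do more than this.

```perl
use strict;
use warnings;

package ToyParser;   ##-- hypothetical class, for illustration only

our $destroyed = 0;  ##-- counts DESTROY calls

sub new {
  my $class = shift;
  my $self  = bless {}, $class;
  ##-- the closure captures $self, and $self stores the closure: a reference cycle
  $self->{yylex} = sub { return $self->{pending} // '' };
  return $self;
}

sub free {
  my $self = shift;
  %$self = ();       ##-- drop all members, breaking the cycle
  return undef;
}

sub DESTROY { ++$destroyed }

package main;

{ my $leaky = ToyParser->new(); }            ##-- never freed: the cycle keeps it alive
my $after_leak = $ToyParser::destroyed;

{ my $ok = ToyParser->new(); $ok->free(); }  ##-- freed: reclaimed at scope exit
my $after_free = $ToyParser::destroyed;

print "destroyed without free(): $after_leak\n";
print "destroyed with free():    $after_free\n";
```

In this sketch the un-freed object survives until global destruction, while the freed one is reclaimed as soon as its last external reference goes away.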

useIndex
$qp = $qp->useIndex($index);

Sets up parser to use the DDC index $index.

API: High-level Parsing

The following methods comprise the top-level parsing API.

reset
$undef = $qp->reset();

Reset all parse-relevant data structures in preparation for parsing a new query.

parse
$query_or_undef = $qp->parse(@query_strings);
$query_or_undef = $qp->parse(\*query_fh);

Parse and return a user query as a DDC::Query::Base object (or subclass) from a (list of) string(s) [first form], or from an open filehandle [second form]. If an error is encountered, parse() returns undef.

API: Mid-level: Query Generation API

The following methods comprise the mid-level parsing API. Users should never need to call these methods directly, but they may be useful if you are deriving a new parser (sub)class, e.g. implementing an alternate query syntax.

newQuery
$q = $parser->newQuery(@args);

Wrapper for $parser->{index}->newQuery(@args)

finishQuery
$q = $parser->finishQuery($srcQuery);

Imposes default 'hit' restrictor on parsed query $srcQuery, and other finalizing touches (join insertion, variable merge, independent variable check, meta expansion).

sqlQuoteString
$quotedStr = $qp->sqlQuoteString($str);

Adds single quotes around $str and escapes any string-internal single quotes.
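
A minimal sketch of this kind of quoting, written as a standalone function. The name sql_quote_string is hypothetical, and escaping by doubling the quote character is an assumption (it is the standard SQL form); the module itself may escape differently, e.g. with a backslash.

```perl
use strict;
use warnings;

##-- hypothetical stand-in for sqlQuoteString(); not the module's code
sub sql_quote_string {
  my $str = shift;
  $str =~ s/'/''/g;     ##-- assumption: escape by doubling, as in standard SQL
  return "'" . $str . "'";
}

print sql_quote_string("O'Brien"), "\n";   ##-- prints: 'O''Brien'
```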

newVariable
$varLabel = $qp->newVariable(%args);

Wrapper for $qp->{qtmp}->newVariable(), with different default semantics. Known %args:

label => $varLabel,    ##-- variable label           (default=(generated))
table => $varTable,    ##-- variable table name      (default=$qp->{qtmp}{default_table})
tok   => $bool,        ##-- is this var independent? (default=true)

newReference
$varLabel = $qp->newReference(src=>$srcVarName, ref=>$refColName);

Wrapper for newVariable() which creates a new variable dependent on $srcVarName which will be joined to the table referenced by the 'ref' field $refColName, for transparent de-referencing in queries. Implementation handles variable aliasing by naming conventions, and performs some basic sanity checks.

parseReference
[$tokVar,$attr,...] = $CLASS_OR_OBJ->parseReference($varName);

Parses a reference returned by the low-level parser as a dot-separated string of the form "${tokenVarName}.${refAttr1}.(...).${refAttrN}.${attrName}".

reference2token
$tokenVarName = $CLASS_OR_OBJ->reference2token($varName);

Hack: get the name of the independent (token) variable associated with the dot-separated string $varName.
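
The dot-separated form handled by parseReference() and reference2token() can be illustrated with two small standalone helpers. Both function names and the example reference 'w1.type.lemma' are hypothetical; the sketch only assumes that the token variable is the first component, as the format string above suggests.

```perl
use strict;
use warnings;

##-- hypothetical helper: split "tok.ref1.(...).refN.attr" into its components
sub parse_reference {
  my $varName = shift;
  return [ split(/\./, $varName) ];
}

##-- hypothetical helper: the token variable is the first component
sub reference_to_token {
  my $varName = shift;
  return parse_reference($varName)->[0];
}

my $parts = parse_reference('w1.type.lemma');
print join(' | ', @$parts), "\n";                 ##-- prints: w1 | type | lemma
print reference_to_token('w1.type.lemma'), "\n";  ##-- prints: w1
```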

constantQuery
$q = $parser->constantQuery($sqlWhereFragment);

Simple constant query (boolean true or false). Can also be used to add literal SQL fragments to a query object.

literalQuery
$q = $parser->literalQuery($literal_text);

Handler for "literal" single-word or -string queries. Default is an attribute query on $q->{defaultAttribute} via $q->{defaultOp} with value $literal_text.

Sets $q->{tok} to the newly generated variable as a side-effect.

soundsLikeQuery
$q = $qp->soundsLikeQuery($attributeId,$soundsLikeText);

Handler for "sounds-like" queries over $attributeId with (orthographic) value $soundsLikeText. Default uses path 'type.pho' on $varName, 'pho' on $soundsLikeText (implicit table: 'type').

attributeQuery
$q = $parser->attributeQuery($attributeId, $sqlOpFragment, $sqlValueFragment);

Handler for generic attribute queries, where $attributeId = [$varName,$attrName].

attributeValue
$sqlStr = $qp->attributeValue($attributeIdOrValue);

Returns an SQL-string representing $attributeIdOrValue, where $attributeIdOrValue is one of the following:

  • a literal value (numeric or string, pre-parsed)

  • a pair [ $varName, $attrName ]

parseReferencePath
$varName  = $qp->parseReferencePath([$varNameOrUndef]);
$varNameN = $qp->parseReferencePath([$varNameOrUndef,@refNames]);

Calls newVariable() if $varNameOrUndef is undefined to allocate an independent base variable, and calls newReference() for each reference named in @refNames to perform nested variable de-referencing.

parseAttributePath
[$varName,$attrPath] = $qp->parseAttributePath([@refPath,$attrName]);

Wrapper which calls parseReferencePath() on the non-final components of [@refPath,$attrName] and returns an $attributeId ARRAY-ref [$varName,$attrPath] representing the (de-referenced) argument array.
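
The interplay of parseReferencePath() and parseAttributePath() might be sketched as follows. Everything here is a toy stand-in: the generated labels ('v1', ...), the dotted alias naming, and the assumption that the first element of the path is the (possibly undefined) base variable are illustrative choices, not the module's actual behavior.

```perl
use strict;
use warnings;

my $nvars = 0;   ##-- counter for generated labels (hypothetical naming scheme)
my @joins;       ##-- records each de-referencing step, for inspection

##-- stand-in for newVariable(): allocate an independent base variable
sub new_variable { return 'v' . (++$nvars); }

##-- stand-in for newReference(): derive a dependent variable from $src via $ref
sub new_reference {
  my ($src, $ref) = @_;
  my $label = "$src.$ref";     ##-- assumption: aliasing by dotted name
  push(@joins, [$src, $ref, $label]);
  return $label;
}

##-- stand-in for parseReferencePath([$varNameOrUndef,@refNames])
sub parse_reference_path {
  my ($var, @refs) = @{ $_[0] };
  $var = new_variable() if !defined($var);
  $var = new_reference($var, $_) foreach @refs;
  return $var;
}

##-- stand-in for parseAttributePath([@refPath,$attrName])
sub parse_attribute_path {
  my @path = @{ $_[0] };
  my $attr = pop(@path);                        ##-- final component: attribute name
  return [ parse_reference_path(\@path), $attr ];
}

my $id = parse_attribute_path([undef, 'type', 'lemma']);
print "$id->[0] / $id->[1]\n";                  ##-- prints: v1.type / lemma
```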

precedesQuery
$q = $parser->precedesQuery($q1,$q2);

Enforces restriction that all 'tok' variables in $q1 precede all those in $q2 (by primary key).

sqlGreatestId
$sqlFragment = $qp->sqlGreatestId($query,\@tokenVars);

Returns SQL fragment representing the value of the greatest primary key of any independent variable named in \@tokenVars.

sqlLeastId
$sqlFragment = $qp->sqlLeastId($query,\@tokenVars);

Returns SQL fragment representing the value of the smallest primary key of any independent variable named in \@tokenVars.

sqlMinMax
$sqlFragment = $qp->sqlMinMax($func,$query);
$sqlFragment = $qp->sqlMinMax($func,$query,\@tokenVars);

Guts for sqlGreatestId() and sqlLeastId(): returns $tokenVars[0] if only one token is specified in @tokenVars, otherwise applies SQL function $func to SQL forms of @tokenVars.
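
The single-token shortcut described above can be sketched with a standalone helper. The function names are hypothetical, the helper takes ready-made SQL expressions rather than a query object, and the use of the MySQL functions GREATEST and LEAST as well as the column name 'id' are assumptions made for illustration.

```perl
use strict;
use warnings;

##-- hypothetical stand-in for sqlMinMax(); takes SQL expressions directly
sub sql_min_max {
  my ($func, @exprs) = @_;
  return $exprs[0] if @exprs == 1;               ##-- single token: no function call needed
  return $func . '(' . join(',', @exprs) . ')';
}

sub sql_greatest_id { return sql_min_max('GREATEST', @_); }
sub sql_least_id    { return sql_min_max('LEAST',    @_); }

##-- "all of q1 precedes all of q2" expressed as a comparison of extremes
my $precedes = sql_greatest_id('a.id', 'b.id') . ' < ' . sql_least_id('c.id');
print "$precedes\n";    ##-- prints: GREATEST(a.id,b.id) < c.id
```

Comparing the greatest key on one side with the least key on the other is one way to express a precedence restriction such as the one described for precedesQuery(); whether the module builds it this way is not stated here.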

sequenceQuery
$q = $parser->sequenceQuery(\@queryList);

Handler for back-to-back ordered sequences of queries. Default implementation interprets these as serial order of independent variables' primary keys.

unearQuery
$q = $qp->unearQuery($maxDist, \@queryList);

Handles unordered 'near' queries over at most $maxDist intervening tokens.

nearQuery
$q = $qp->nearQuery($maxDist, \@queryList);

Handles ordered 'near' queries over at most $maxDist intervening tokens.

withinQuery
$q = $parser->withinQuery($srcQuery, $withinTabName);

Handles 'within' queries: imposes default 'hit' container by join clause manipulation.

metaQueryLocal
$q = $qp->metaQueryLocal($metaPath,$srcQuery,$sqlOpFragment,$sqlValueFragment);

Handler for metadata queries. Current version performs immediate expansion on all token vars in $srcQuery. This is the Right Way To Do It if metadata queries should be locally scoped.

metaQueryDelayed
$undef = $qp->metaQueryDelayed($metaPath, $sqlOpFragment, $sqlValueFragment);

Alternate handler for metadata queries (currently unused). This version performs no expansion when the meta-query is parsed, but rather enqueues all metadata queries for later expansion (e.g. on $qp->finishQuery()). This would be the Right Way To Do It if metadata queries should always be interpreted as globally scoped.

expandMeta
$q_expanded = $qp->expandMeta($q);

Expands delayed metadata conditions in $qp->{meta} (if any) into $q.

API: Low-level: Lexer/Parser Connection, Error Reporting, etc.

_yylex_sub
\&yylex_sub = $qp->_yylex_sub();

Returns a Parse::Yapp-friendly lexer subroutine.
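
For orientation, a Parse::Yapp lexer subroutine is called with the parser object as its first argument and returns a (token type, token value) pair, with ('', undef) signalling end of input. The toy closure below follows that convention; make_yylex and the token type 'WORD' are hypothetical, and neither Parse::Yapp nor DDC::Query::yylexer is actually loaded.

```perl
use strict;
use warnings;

##-- hypothetical toy lexer factory; DDC::Query::yylexer is not used here
sub make_yylex {
  my @words = split(' ', shift);
  return sub {
    my $yyparser = shift;                  ##-- Yapp passes the parser object; unused here
    return ('', undef) if !@words;         ##-- end of input
    return ('WORD', shift(@words));        ##-- (token type, token value)
  };
}

my $yylex = make_yylex('hello world');
while (1) {
  my ($type, $text) = $yylex->(undef);
  last if $type eq '';
  print "$type: $text\n";
}
```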

_yyerror_sub
\&yyerror_sub = $qp->_yyerror_sub();

Returns error subroutine for the underlying Yapp parser.

setError
$errorString = $qp->setError($errorCode,\%userMacros);

Should set $qp->{errstr} to expanded $errorString.

Default behavior just replaces the following macros in $errorCode:

__LINE__
___COL__
__LEXTOKNAME__
__LEXTOKTEXT__
__TOKNAME__
__TOKTEXT__
__EXPECTED__
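
Macro replacement of this kind can be sketched as a plain string substitution. The function expand_error, the example message, and the macro values are hypothetical; in this sketch all macro values come from the hash argument, whereas the module presumably fills in the built-in macros from the lexer and parser state.

```perl
use strict;
use warnings;

##-- hypothetical stand-in for the macro expansion step of setError()
sub expand_error {
  my ($errorCode, $macros) = @_;
  my $errstr = $errorCode;
  foreach my $name (sort keys %$macros) {
    ##-- \Q...\E: treat the macro name as a literal string, not a pattern
    $errstr =~ s/\Q$name\E/$macros->{$name}/g;
  }
  return $errstr;
}

my $msg = expand_error(
  'syntax error at line __LINE__ near "__TOKTEXT__"',
  { '__LINE__' => 3, '__TOKTEXT__' => 'foo' },
);
print "$msg\n";   ##-- prints: syntax error at line 3 near "foo"
```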

I/O: Hooks

preSaveHook
$tmpData = $obj->preSaveHook();

Sanitizes object for save; returns temporary data.

postSaveHook
$undef = $obj->postSaveHook($tmpData);

(undocumented)

postLoadHook
$undef = $obj->postLoadHook();

(undocumented)

ACKNOWLEDGEMENTS

Perl by Larry Wall.

AUTHOR

Bryan Jurish <moocow@cpan.org>

COPYRIGHT AND LICENSE

Copyright (C) 2011 by Bryan Jurish

This package is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.10.1 or, at your option, any later version of Perl 5 you may have available.

SEE ALSO

perl(1), DDC(3perl), DDC::Query::yylexer(3perl), DDC::Query::yyparser(3perl), DDC::Query.pm(3perl)
