NAME
CORBA::IDLtree - OMG IDL to symbol tree translator
VERSION
Version 2.05
SYNOPSIS
Subroutine Parse_File is the universal entry point (to be called by the main program.) It takes an IDL file name as the input parameter and parses that file, constructing one or more symbol trees for the outermost declarations encountered. It returns a reference to an array containing references to those trees. In case of errors during parsing, Parse_File returns 0.
Usage:
    use CORBA::IDLtree;
    my $ref_to_array_of_outermost_declarations = CORBA::IDLtree::Parse_File("myfile.idl");
    $ref_to_array_of_outermost_declarations or die "File had syntax errors\n";
    foreach my $node (@$ref_to_array_of_outermost_declarations) {
        # Query $node->[TYPE] to find out what each node is;
        # use $node->[SUBORDINATES] according to the $node->[TYPE].
        # For example:
        if ($node->[CORBA::IDLtree::TYPE] == CORBA::IDLtree::MODULE) {
            foreach my $subnode (@{$node->[CORBA::IDLtree::SUBORDINATES]}) {
                # Assuming your "sub process" codes your business logic:
                &process($subnode);
            }
        } elsif ($node->[CORBA::IDLtree::TYPE] == CORBA::IDLtree::...) {
            # And so on, decode and process all the types you need ...
            # For further details see the demo application in subdir demoapp.
        }
    }
STRUCTURE OF THE SYMBOL TREE
A "thing" in the symbol tree can be either a reference to a node, or a reference to an array of references to nodes.
Each node is a six element array with the elements
[0] => TYPE (MODULE|INTERFACE|STRUCT|UNION|ENUM|TYPEDEF|CHAR|...)
[1] => NAME
[2] => SUBORDINATES
[3] => ANNOTATIONS
[4] => COMMENT
[5] => SCOPEREF
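As a minimal sketch of this layout, the snippet below indexes two hand-built nodes. It does not call Parse_File: the index and type constants are redeclared locally with the values listed under CONSTANTS so that it runs without the module, and the node contents (`Shapes`, `Point`) and the helper `describe` are made up for illustration.

```perl
use strict;
use warnings;

# Index constants, redeclared locally with the values documented under CONSTANTS.
use constant { TYPE => 0, NAME => 1, SUBORDINATES => 2,
               ANNOTATIONS => 3, COMMENT => 4, SCOPEREF => 5 };
# Type IDs, likewise copied from the documented list.
use constant { STRUCT => 26, MODULE => 32 };

# Hand-built stand-ins for parser output (hypothetical content).
my $module = [ MODULE, "Shapes", [], 0, 0, 0 ];
my $struct = [ STRUCT, "Point",  [], 0, 0, $module ];
push @{ $module->[SUBORDINATES] }, $struct;

# Hypothetical helper: turn a node into a short human-readable label.
sub describe {
    my ($node) = @_;
    my $kind = $node->[TYPE] == MODULE ? "module"
             : $node->[TYPE] == STRUCT ? "struct"
             :                           "other";
    return $kind . " " . $node->[NAME];
}

print describe($_), "\n" for $module, @{ $module->[SUBORDINATES] };
```

In real client code the constants would come from the CORBA::IDLtree namespace, as in the SYNOPSIS, rather than being redeclared.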
The TYPE element, instead of holding a type ID number (see the following list under SUBORDINATES), can also be a reference to the node defining the type. When the TYPE element can contain either a type ID or a reference to the defining node, we will call it a type descriptor. Which of the two alternatives is in effect can be determined via the isnode function.
The NAME element, unless specified otherwise, simply holds the name string of the respective IDL syntactic item.
The SUBORDINATES element depends on the type ID:
- MODULE or INTERFACE
-
Reference to an array of nodes (symbols) which are defined within the module or interface. In the case of INTERFACE, element [0] in this array will contain a reference to a further array which in turn contains references to the parent interface(s) if inheritance is used, or the null value if the current interface is not derived by inheritance. Element [1] is the "local/abstract" flag which is ABSTRACT for abstract interfaces, or LOCAL for interfaces declared local.
- INTERFACE_FWD
-
Reference to the node of the full interface declaration.
- STRUCT or EXCEPTION
-
Reference to an array of node references representing the member components of the struct or exception. Each member representative node is a quintuplet consisting of (TYPE, NAME, <dimref>, ANNOTATIONS, COMMENT). The <dimref> is a reference to a list of dimension numbers, or is 0 if no dimensions were given. In case of STRUCT, the first element may be a reference to a further STRUCT node instead of the reference to a quintuplet. In this case, the first element indicates the IDL4 parent struct type of the current struct. The function isnode() can be used for detecting this case.
- UNION
-
Similar to STRUCT/EXCEPTION, reference to an array of nodes. For union members, the member node has the same structure as for STRUCT/EXCEPTION. However, the first node contains a type descriptor for the discriminant type. The switch node does not follow the usual quintuplet structure of members; it is a single item. The TYPE of a member node may also be CASE or DEFAULT. When the TYPE is CASE or DEFAULT, this means that the following member node will be the union branch controlled by the CASE or DEFAULT. For CASE, the NAME is unused, and the SUBORDINATES contains a reference to a list of the case values for the following member node. For DEFAULT, both the NAME and the SUBORDINATES are unused.
- ENUM
-
Reference to an array describing the enum value literals. Each element in the array is a reference to a triplet (three element array): The first element in the triplet is the enum literal value. The second element is a reference to an array of annotations as described in the ANNOTATIONS documentation (see below). The third element is a reference to the trailing comment list.
- TYPEDEF
-
Reference to a two-element array: element 0 contains a reference to the type descriptor of the original type; element 1 contains a reference to an array of dimension expressions, or the null value if no dimensions are given. When given, the dimension expressions are plain strings.
- SEQUENCE
-
As a special case, the NAME element of a SEQUENCE node does not contain a name (as sequences are anonymous types), but instead is used to hold the bound number. If the bound number is 0 then it is an unbounded sequence. The SUBORDINATES element contains the type descriptor of the base type of the sequence. This descriptor could itself be a reference to a SEQUENCE defining node (that is, a nested sequence definition).
- BOUNDED_STRING
-
Bounded strings are treated as a special case of sequence. They are represented as references to a node that has BOUNDED_STRING or BOUNDED_WSTRING as the type ID and the bound number in the NAME; the SUBORDINATES element is unused.
- CONST
-
Reference to a two-element array. Element 0 is a type descriptor of the const's type; element 1 is a reference to an array containing the RHS expression symbols.
- FIXED
-
Reference to a two-element array. Element 0 contains the digit number and element 1 contains the scale factor. The NAME component in a FIXED node is unused.
- VALUETYPE
-
Uses the following structure:
    [0] => $is_abstract (boolean)
    [1] => reference to a tuple (two-element list) containing
           inheritance related information:
           [0] => $is_truncatable (boolean)
           [1] => \@ancestors (reference to array containing
                  references to ancestor nodes)
    [2] => \@members: reference to array containing references
           to tuples (two-element lists) of the form:
           [0] => 0|PRIVATE|PUBLIC
                  A zero for this value means the element [1]
                  contains a reference to a declaration, such as a
                  METHOD or ATTRIBUTE.
                  In case of METHOD, the first element in the
                  method node subordinates (i.e., the return type)
                  may be FACTORY. However, unlike interface methods,
                  the last element is _not_ a reference to the
                  'raises' list. Support for 'raises' of valuetype
                  methods may be added in a future version.
           [1] => reference to the defining node.
                  In case of PRIVATE or PUBLIC state member, the
                  SUBORDINATES of the defining node contains a dimref
                  (reference to dimensions list, see STRUCT.)
- VALUETYPE_BOX
-
Reference to the defining type node.
- VALUETYPE_FWD
-
Reference to the node of the full valuetype declaration.
- NATIVE
-
Subordinates unused.
- ATTRIBUTE
-
Reference to a two-element array; element 0 is the read-only flag (0 for read/write attributes), element 1 is a type descriptor of the attribute's type.
- METHOD
-
Reference to a variable length array; element 0 is a type descriptor for the return type. Elements 1 and following are references to parameter descriptor nodes with the following structure:
    elem. 0 => parameter type descriptor
    elem. 1 => parameter name
    elem. 2 => parameter mode (IN, OUT, or INOUT)
The last element in the variable-length array is a reference to the "raises" list. This list contains references to the declaration nodes of exceptions raised, or is empty if there is no "raises" clause.
- INCFILE
-
Reference to an array of nodes (symbols) which are defined within the include file. The NAME element of this node contains the include file name.
- PRAGMA_PREFIX
-
Subordinates unused.
- PRAGMA_VERSION
-
Version string.
- PRAGMA_ID
-
ID string.
- PRAGMA
-
This is for the general case of pragmas that are none of the above, i.e. pragmas unknown to IDLtree. The NAME holds the pragma name, and SUBORDINATES holds all further text appearing after the pragma name.
- REMARK
-
The NAME of the node contains the starting line number of the comment text. The SUBORDINATES component contains a reference to a list of comment lines. The comment lines are not newline terminated. The source line number of each comment line can be computed by adding the starting line number and the array index of the comment line. By default, REMARK nodes will not be generated; generation of REMARK nodes can be enabled by setting the $enable_comments global variable to non zero.
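To illustrate one of the variable-length layouts above, here is a sketch that takes apart the SUBORDINATES of an interface METHOD node: return type first, parameter descriptors in the middle, "raises" list last. The node is hand-built, not parser output. The constants are redeclared locally with the documented values, and `split_method`, `lookup`, and `NotFound` are hypothetical names.

```perl
use strict;
use warnings;

# Documented index, mode, and type constants, redeclared locally.
use constant { TYPE => 0, NAME => 1, SUBORDINATES => 2, MODE => 2 };
use constant { IN => 1, OUT => 2, INOUT => 3 };
use constant { LONG => 6, STRING => 14, EXCEPTION => 30, METHOD => 42 };

my %mode_name = (IN() => "in", OUT() => "out", INOUT() => "inout");

# Hand-built stand-in for:
#   long lookup(in string key, out long hits) raises (NotFound);
my $not_found = [ EXCEPTION, "NotFound", [], 0, 0, 0 ];
my $method = [ METHOD, "lookup",
    [ LONG,                       # element 0: return type descriptor
      [ STRING, "key",  IN  ],    # parameter descriptor
      [ LONG,   "hits", OUT ],    # parameter descriptor
      [ $not_found ],             # last element: "raises" list
    ], 0, 0, 0 ];

# Hypothetical helper: split the subordinates into their three parts.
# Only valid for interface methods; valuetype methods have no raises list.
sub split_method {
    my ($node) = @_;
    my @sub    = @{ $node->[SUBORDINATES] };
    my $return = shift @sub;    # first element
    my $raises = pop @sub;      # last element
    return ($return, \@sub, $raises);
}

my ($ret, $params, $raises) = split_method($method);
for my $p (@$params) {
    print $mode_name{ $p->[MODE] }, " ", $p->[NAME], "\n";
}
print "raises: ", join(", ", map { $_->[NAME] } @$raises), "\n";
```

Copying the subordinates array before shifting and popping leaves the original node untouched, which matters if the tree is traversed more than once.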
The ANNOTATIONS element holds the reference to an array of annotation nodes if IDL4 style annotations are present (if no annotations are present then the ANNOTATIONS element holds 0). Each entry in this array is an array reference. The first element in the array referenced is a reference to an entry in @annoDefs (see comments at declaration of @annoDefs). The following elements contain the concrete values for the parameters, in the order as defined by the entry in @annoDefs. If the user omitted the value of the parameter then the default as specified by the entry in @annoDefs is filled in.
The COMMENT element holds the comment text that follows the IDL declaration on the same line. Usually this is just a single line. However, if a multi-line comment is started on the same line after a declaration, the multi-line comment may extend to further lines; therefore we use a list of lines. The lines in this list are not newline terminated. The COMMENT field is a reference to a tuple of starting line number and reference to the line list, or contains 0 if no trailing comment is present at the IDL item.
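A small sketch of unpacking the COMMENT tuple follows. It assumes that the comment lines occupy consecutive source lines starting at the stored line number (the text above states this rule explicitly only for REMARK nodes). The helper name `trailing_comment_lines` and the sample node are hypothetical.

```perl
use strict;
use warnings;

use constant COMMENT => 4;    # documented index of the COMMENT element

# Hypothetical helper: returns a list of [line_number, text] pairs for a
# node's trailing comment, or an empty list when the COMMENT element is 0.
sub trailing_comment_lines {
    my ($node) = @_;
    my $comment = $node->[COMMENT] or return;
    my ($start, $lines) = @$comment;
    # Assumption: line i of the comment sits on source line $start + i.
    return map { [ $start + $_, $lines->[$_] ] } 0 .. $#$lines;
}

# Hand-built node whose trailing comment starts on line 12 (made-up data).
my $member = [ 6, "hits", 0, 0, [ 12, [ "number of hits", "seen so far" ] ] ];

for my $pair (trailing_comment_lines($member)) {
    printf "%d: %s\n", @$pair;
}
```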
The SCOPEREF element is a reference back to the node of the module or interface enclosing the current node. If the current node is already at the global scope level then the SCOPEREF is 0. Special case: For a reopened module, the SCOPEREF points to the previous opening of the same module. In case of multiple reopenings, each reopening points to the previous opening. The SCOPEREF of the initial module finally points to the enclosing scope. All nodes have this element except for the parameter nodes of methods and the component nodes of structs/unions/exceptions.
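The SCOPEREF chain, including the reopened-module case, can be followed as in the sketch below. This is an illustration on hand-built nodes, not the module's own scoped_name subroutine. The helper `qualified_name` is hypothetical, and treating two directly linked MODULE nodes of the same name as one reopened module is an assumption of this sketch.

```perl
use strict;
use warnings;

# Documented index and type constants, redeclared locally.
use constant { TYPE => 0, NAME => 1, SCOPEREF => 5 };
use constant { STRUCT => 26, MODULE => 32 };

# Hypothetical helper: build "A::B::C" by walking SCOPEREF upwards.
sub qualified_name {
    my ($node) = @_;
    my @parts = ($node->[NAME]);
    my $prev  = $node;
    my $scope = $node->[SCOPEREF];
    while ($scope) {
        # A reopened module links to its previous opening; skip that hop
        # so the module name is not repeated (assumption of this sketch).
        my $same_module = $prev->[TYPE] == MODULE
                       && $scope->[TYPE] == MODULE
                       && $prev->[NAME] eq $scope->[NAME];
        unshift @parts, $scope->[NAME] unless $same_module;
        $prev  = $scope;
        $scope = $scope->[SCOPEREF];
    }
    return join "::", @parts;
}

# Hand-built stand-in for:
#   module Outer { module M { }; module M { struct S { ... }; }; };
my $outer = [ MODULE, "Outer", [], 0, 0, 0 ];
my $m1    = [ MODULE, "M",     [], 0, 0, $outer ];  # initial opening
my $m2    = [ MODULE, "M",     [], 0, 0, $m1 ];     # reopening
my $s     = [ STRUCT, "S",     [], 0, 0, $m2 ];

print qualified_name($s), "\n";
```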
CLASS VARIABLES
Variables that can be set by client code
- @CORBA::IDLtree::include_path
-
Paths where to look for included IDL files.
- %CORBA::IDLtree::defines
-
Symbol definitions for preprocessor.
- $CORBA::IDLtree::cache_trees
-
Values 0 or 1, default 0. By default, do not cache trees of #included files.
- $CORBA::IDLtree::enable_comments
-
Values 0 or 1, default 0. By default, do not generate REMARK nodes.
- $CORBA::IDLtree::struct2vt
-
Values 0 or 1, default 0. Change struct into equivalent valuetype
- $CORBA::IDLtree::vt2struct
-
Values 0 or 1, default 0. Change valuetype into equivalent struct
- $CORBA::IDLtree::cache_statistics
-
Values 0 or 1, default 0. Print cache statistics
- $CORBA::IDLtree::long_double_supported
-
Values 0 or 1, default 0. Switch on support for IDL long double.
- $CORBA::IDLtree::union_default_null_allowed
-
Values 0 or 1, default 1. By default, a union's default branch may be empty; set to 0 to switch off this permission.
- $CORBA::IDLtree::leading_underscore_allowed
-
Value 1 will remove the leading underscore. Value 2 will preserve the leading underscore.
- $CORBA::IDLtree::permissive
-
Values 0 or 1, default 0. By default, misuse of IDL keywords as identifiers is a hard error.
Variables written by CORBA::IDLtree
These are to be considered read-only from outside:
- $CORBA::IDLtree::n_errors
-
Cumulative number of errors for a Parse_File call.
- $CORBA::IDLtree::global_idlfile
-
Copy of the filename passed into the most recent call of sub Parse_File.
CONSTANTS
Constants for accessing the elements of a node
- Constants for indexing the elements of a node
-
As explained in STRUCTURE OF THE SYMBOL TREE, each node is represented as a six element array. These constants are intended for indexing the array:
    sub TYPE ()         { 0 }
    sub NAME ()         { 1 }
    sub SUBORDINATES () { 2 }
    sub MODE ()         { 2 }
    sub ANNOTATIONS ()  { 3 }
    sub COMMENT ()      { 4 }
    sub SCOPEREF ()     { 5 }
The constant MODE is an alias of SUBORDINATES for method parameter nodes.
- Method parameter modes
-
    sub IN ()    { 1 }
    sub OUT ()   { 2 }
    sub INOUT () { 3 }
- Meanings of the TYPE entry in the symbol node
-
    sub NONE ()            { 0 }   # error/illegality value
    sub BOOLEAN ()         { 1 }
    sub OCTET ()           { 2 }
    sub CHAR ()            { 3 }
    sub WCHAR ()           { 4 }
    sub SHORT ()           { 5 }
    sub LONG ()            { 6 }
    sub LONGLONG ()        { 7 }
    sub USHORT ()          { 8 }
    sub ULONG ()           { 9 }
    sub ULONGLONG ()       { 10 }
    sub FLOAT ()           { 11 }
    sub DOUBLE ()          { 12 }
    sub LONGDOUBLE ()      { 13 }
    sub STRING ()          { 14 }
    sub WSTRING ()         { 15 }
    sub OBJECT ()          { 16 }
    sub TYPECODE ()        { 17 }
    sub ANY ()             { 18 }
    sub FIXED ()           { 19 }  # node
    sub BOUNDED_STRING ()  { 20 }  # node
    sub BOUNDED_WSTRING () { 21 }  # node
    sub SEQUENCE ()        { 22 }  # node
    sub ENUM ()            { 23 }  # node
    sub TYPEDEF ()         { 24 }  # node
    sub NATIVE ()          { 25 }  # node
    sub STRUCT ()          { 26 }  # node
    sub UNION ()           { 27 }  # node
    sub CASE ()            { 28 }
    sub DEFAULT ()         { 29 }
    sub EXCEPTION ()       { 30 }  # node
    sub CONST ()           { 31 }  # node
    sub MODULE ()          { 32 }  # node
    sub INTERFACE ()       { 33 }  # node
    sub INTERFACE_FWD ()   { 34 }  # node
    sub VALUETYPE ()       { 35 }  # node
    sub VALUETYPE_FWD ()   { 36 }  # node
    sub VALUETYPE_BOX ()   { 37 }  # node
    sub ATTRIBUTE ()       { 38 }  # node
    sub ONEWAY ()          { 39 }  # implies "void" as the return type
    sub VOID ()            { 40 }
    sub FACTORY ()         { 41 }
    sub METHOD ()          { 42 }  # node
    sub INCFILE ()         { 43 }  # node
    sub PRAGMA_PREFIX ()   { 44 }  # node
    sub PRAGMA_VERSION ()  { 45 }  # node
    sub PRAGMA_ID ()       { 46 }  # node
    sub PRAGMA ()          { 47 }  # node
    sub REMARK ()          { 48 }  # node
    sub NUMBER_OF_TYPES () { 49 }
The constant FACTORY can only occur as the return type of a method in a valuetype.
- Interface/valuetype flag values
-
    sub ABSTRACT    { 1 }
    sub LOCAL       { 2 }
    sub TRUNCATABLE { 2 }
    sub CUSTOM      { 3 }
- Valuetype member flags
-
    sub PRIVATE { 1 }
    sub PUBLIC  { 2 }
SUBROUTINES
Parse_File
Parses the file name given as argument. Returns reference to array of nodes representing the top level (global) declarations in the file. Returns 0 if the file had syntax errors. Parse_File writes the error messages to STDERR.
Dump_Symbols
Symbol tree dumper (for debugging etc.). Reconstructs the IDL source notation from the parsed symbol tree. Parameters:
Reference to a symbol array (return value from a previous call to Parse_File).
Optional parameter controlling the output:
If given as string then it is the name of a file into which to dump the IDL source.
If given as array reference then the IDL source will be placed in the referenced array, one line per element, where each line is not newline terminated.
If the optional parameter is not given or is given as undef then the IDL source will be dumped to STDOUT.
is_elementary_type
Given a node reference, returns the type constant if the node represents an elementary type. Returns 0 if the type is not elementary.
predef_type
Given a type name (as string), returns the type constant if the type name is that of an elementary type. Returns 0 if the type is not elementary.
isnode
Given a "thing", returns 1 if it is a reference to a node, 0 otherwise.
is_scope
Given a "thing", returns 1 if it's a ref to a MODULE, INTERFACE, or INCFILE node.
find_node
Looks up a name in the symbol tree(s) constructed so far. Returns the node ref if found, else 0.
typeof
Given a type descriptor, returns the type as a string in IDL syntax.
set_verbose
Call this to make the parser tell us what it's doing.
is_a
Determine if typeid is of given type, recursing through TYPEDEFs.
root_type
Get the original type of a TYPEDEF, i.e. recurse through all non array TYPEDEFs until the original type is reached.
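The recursion described here can be sketched from the documented TYPEDEF layout alone (element 0 of SUBORDINATES is the original type, element 1 the dimensions). The helper below is a hypothetical stand-in, not the module's root_type: it uses Perl's `ref` where real client code would use isnode, and it assumes that an array TYPEDEF is returned as-is when encountered.

```perl
use strict;
use warnings;

# Documented index and type constants, redeclared locally.
use constant { TYPE => 0, NAME => 1, SUBORDINATES => 2 };
use constant { LONG => 6, TYPEDEF => 24 };

# Hypothetical helper: follow non-array TYPEDEFs down to the original type.
sub resolve_typedef {
    my ($desc) = @_;
    while (ref $desc && $desc->[TYPE] == TYPEDEF) {
        my ($orig, $dims) = @{ $desc->[SUBORDINATES] };
        last if $dims && @$dims;    # array typedef: stop here
        $desc = $orig;
    }
    return $desc;
}

# Hand-built stand-ins for:
#   typedef long  Count;
#   typedef Count Total;
#   typedef Total Grid[3][3];
my $count = [ TYPEDEF, "Count", [ LONG,   0 ],              0, 0, 0 ];
my $total = [ TYPEDEF, "Total", [ $count, 0 ],              0, 0, 0 ];
my $grid  = [ TYPEDEF, "Grid",  [ $total, [ "3", "3" ] ],   0, 0, 0 ];

print "Total resolves to type ID ", resolve_typedef($total), "\n";
print "Grid stays a typedef named ", resolve_typedef($grid)->[NAME], "\n";
```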
is_pragma
Return 1 if the given type constant or node is a pragma.
files_included
Returns an array with the names of files #included.
get_scalar_default
Get default value for type. Uses comment directives object if available.
idlsplit
Splits a given IDL expression into its individual tokens. Returns the tokens as a list. Example: The call
idlsplit("(m_a::myconst+1.0) / scale")
returns the list
"(", "m_a::myconst", "+", "1.0", ")", "/", "scale"
is_valid_identifier
Returns 1 if the argument is a valid IDL identifier.
scoped_name
Expects a symbol node as the input argument and returns its fully qualified name in IDL syntax.
collect_includes
Utility for collecting #included files. Parameters:
Reference to node list to analyze.
Reference to hash in which to add the includefile names encountered. The includefile names are added as key fields of the hash. The value fields are not used.
get_numeric
Computes numeric value of expression.
enum_literals
The SUBORDINATES of ENUM contains more than just the actual enum literal values (the additional data are: annotations, trailing comments). This is a convenience subroutine which returns the net literals of the given $enumnode[SUBORDINATES].
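What "net literals" means can be shown with the documented triplet layout: keep only the first element of each triplet. The helper `literal_names` below is a hypothetical illustration on a hand-built node, not the module's enum_literals. The annotation and comment slots are filled with 0 purely as placeholders.

```perl
use strict;
use warnings;

# Documented index and type constants, redeclared locally.
use constant { SUBORDINATES => 2, ENUM => 23 };

# Hypothetical helper: reduce the triplets to the bare literal values.
sub literal_names {
    my ($triplets) = @_;
    return map { $_->[0] } @$triplets;
}

# Hand-built stand-in for: enum Color { RED, GREEN, BLUE };
# Each triplet is (literal, annotations, trailing comment).
my $color = [ ENUM, "Color",
              [ [ "RED", 0, 0 ], [ "GREEN", 0, 0 ], [ "BLUE", 0, 0 ] ],
              0, 0, 0 ];

print join(", ", literal_names($color->[SUBORDINATES])), "\n";
```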
AUTHOR
Oliver M. Kellogg, <okellogg at users.sourceforge.net>
BUGS
Please report any bugs or feature requests to bug-corba-idltree at rt.cpan.org, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=CORBA-IDLtree. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.
SUPPORT
You can find documentation for this module with the perldoc command.
perldoc CORBA::IDLtree
You can also look for information at:
RT: CPAN's request tracker (report bugs here)
AnnoCPAN: Annotated CPAN documentation
CPAN Ratings
Search CPAN
ACKNOWLEDGEMENTS
Thanks to Heiko Schroeder for contributing.
LICENSE AND COPYRIGHT
Copyright (C) 1998-2020, Oliver M. Kellogg
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.