NAME
Embedix::ECD - represent Embedix Component Descriptions as a tree of perl objects
SYNOPSIS
instantiate from a file
my $ecd = Embedix::ECD->newFromFile('busybox.ecd');
my $other_ecd = Embedix::ECD->newFromFile('tinylogin.ecd');
access nodes
my $busybox = $ecd->System->Utilities->busybox;
build from scratch
my $server = Embedix::ECD::Group->new(name => 'Server');
my $www = Embedix::ECD::Group->new(name => 'WWW');
my $apache = Embedix::ECD::Component->new (
name => 'apache',
srpm => 'apache',
prompt => 'Include apache web server?',
help => 'The most popular http server on the internet',
);
$ecd->addChild($server);
$ecd->Server->addChild($www);
$ecd->Server->WWW->addChild($apache);
get/set attributes
my $srpm = $busybox->srpm();
$busybox->help('i am busybox of borg -- unix will be assimilated.');
$busybox->requires([
'libc.so.6',
'ld-linux.so.2',
'skellinux',
]);
combine Embedix::ECD objects together
$ecd->mergeWith($other_ecd);
print as text
print $ecd->toString;
print as XML
use Embedix::ECD::XMLv1 qw(xml_from_cons);
print $ecd->toXML(shiftwidth => 4, dtd => 'yes');
my $cons = Embedix::ECD->consFromFile('minicom.ecd');
print xml_from_cons($cons);
REQUIRES
- Parse::RecDescent
-
for the ECD parser
- Data::Dumper
-
for debugging
- Tie::IxHash
-
for preserving the insertion order of children while retaining O(1) named access (at the expense of memory).
- Pod::Usage
-
bin/ecd2xml uses this to generate its help message.
DESCRIPTION
Embedix::ECD allows one to represent ECD files as a tree of perl objects. One can construct objects by parsing an ECD file, or one can build an ECD object from scratch by combining instances of Embedix::ECD and its subclasses. These objects can then be turned back into ECD files via the toString() method.
ECD stands for Embedix Component Description, and its purpose is to contain meta-data regarding packages (aka components) in the Embedix distribution. ECD files contain much of the same data a .spec file does for an RPM. A major difference however is that ECD files do not contain building instructions whereas .spec files do. Another major difference between .spec files and ECD files is the structure. ECD files are hierarchically structured whereas .spec files are comparatively flat.
The ECD format reminds me of the syntax for Apache configuration files. Items are tag-delimited (like in XML) and attributes are found between these tags. Comments are written by starting a line with optional whitespace followed by "#" (/^\s*#/). Unlike Apache configurations, attribute names and values are separated by an "=" sign, whereas in Apache the first token is the attribute name and everything after that (sans leading whitespace) up to the end of the line is the attribute's value. Also unlike Apache configurations, attributes may be enclosed in tags, whereas in Apache tags are used only to describe nodes.
ECD files look like pseudo-XML with shell-styled comments.
METHODS
Constructors
There are two types of constructors provided by this class. The first kind of constructor begins with "new" and returns an Embedix::ECD object. There is another kind of constructor that begins with "cons" and returns the syntax tree as nested arrayrefs.
I realized that creating an object of the syntax tree takes a long time (especially for long ECD files). I also realized that sometimes, the simple nested arrayref is useful enough on its own. It also has the nice property of retaining comments whereas the object constructor disposes of comments. I thought if ECD files were ever to be translated into XML, it'd be nice to be able to keep the comments. These factors convinced me that it would be useful to have these 2 kinds of constructors.
- $ecd = Embedix::ECD->new(key => $value, ...)
-
This returns an Embedix::ECD object. It can be initialized with named parameters which represent the attributes the object should have. The set of valid attributes is:
name    # name is mandatory!
type
value
default_value
range
help
prompt
srpm
specpatch
static_size
min_dynamic_size
storage_size
startup_time
build_vars
provides
requires
keeplist
choicelist
trideps
requiresexpr
if
Their meanings are explained under the Attributes heading.
The following 5 constructors rely on a Parse::RecDescent parser. When they encounter a syntax error they will die, so be sure to wrap them in an eval block.
- $ecd = Embedix::ECD->newFromCons($cons)
-
This returns an Embedix::ECD object from a nested arrayref.
- $ecd = Embedix::ECD->newFromString($string)
-
This returns an Embedix::ECD object from a string in ECD format.
- $ecd = Embedix::ECD->newFromFile($filename)
-
This returns an Embedix::ECD object from an ECD file.
- $cons = Embedix::ECD->consFromString($string)
-
This returns a nested arrayref from a string in ECD format.
- $cons = Embedix::ECD->consFromFile($filename)
-
This returns a nested arrayref from an ECD file.
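As a minimal sketch of the eval advice above: parse_ecd_string() is a hypothetical helper written for this example, not part of Embedix::ECD. It loads the module inside the eval as well, so a missing installation is reported the same way as a syntax error. The sample input is deliberately imbalanced and is assumed to be rejected by the parser.

```perl
use strict;
use warnings;

# Hypothetical helper: returns ($ecd, undef) on success
# or (undef, $error_message) on failure.
sub parse_ecd_string {
    my ($text) = @_;
    my $ecd = eval {
        require Embedix::ECD;    # dies if the module is not installed
        Embedix::ECD->newFromString($text);
    };
    my $err = $@;                # copy $@ right away, before anything can reset it
    return ($ecd, undef) if defined $ecd;
    return (undef, $err || 'parser returned nothing');
}

# The opening and closing tags do not match.
my ($ecd, $err) = parse_ecd_string("<GROUP System>\n</OPTION>\n");
if (defined $ecd) {
    print $ecd->toString;
}
else {
    print "could not parse ECD text: $err\n";
}
```

Copying $@ into a lexical immediately after the eval keeps the message from being clobbered by later code.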
(This next constructor is an anomaly.)
- $ecd_parser = Embedix::ECD->parser()
-
This returns an instance of Parse::RecDescent configured to understand the ECD grammar. This instance is a singleton, so you will receive the same instance every time.
Nodes
Nodes are the fundamental building blocks of the tree structure in an ECD file. Nodes are containers of attributes and other nodes. No matter what, all nodes will have a "name" attribute. This is the key feature that makes nodes distinguishable from attributes.
This is a node
<AUTOVAR embedix_ui-VGAOPT>
TYPE=string
DEFAULT_VALUE=785
</AUTOVAR>
This is an attribute
<IF>
( ( EBXDUP_CONFIG_USB_BANDWIDTH == "y" )
LET ( $VALUE = "y" ) )
||
( ( ( CONFIG_USB != "n" )
&& ( CONFIG_EXPERIMENTAL != "y" ) )
LET ( $VALUE = "n" ) )
</IF>
The distinction is in the opening tag. The AUTOVAR tag has a second string in it, which is the node's name, whereas the IF tag has none, which means that it is an attribute of the node that contains it.
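The rule can be sketched with a regular expression. classify_opening_tag() is a hypothetical helper written for this example; it is a simplification and not the Parse::RecDescent grammar the module actually uses. It assumes node names contain no whitespace.

```perl
use strict;
use warnings;

# Hypothetical helper: classify one opening-tag line.
# A tag with a name after the tag word is a node;
# a tag without one is a tag-enclosed attribute.
sub classify_opening_tag {
    my ($line) = @_;
    # Closing tags and ordinary lines do not match and yield nothing.
    return unless $line =~ /^\s*<(\w+)(?:\s+([^>\s]+))?\s*>\s*$/;
    my ($tag, $name) = ($1, $2);
    return defined $name
        ? { kind => 'node',      tag => $tag, name => $name }
        : { kind => 'attribute', tag => $tag };
}

for my $line ('<AUTOVAR embedix_ui-VGAOPT>', '<IF>') {
    my $info = classify_opening_tag($line);
    print "$line is a(n) $info->{kind}\n";
}
```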
There are 5 (not 4) types of nodes.
- the root node | Embedix::ECD
-
This node is implicit but very real. When invoking any of the constructors that begin with "newFrom", one will get back an Embedix::ECD object within which the rest of the ECD data will be contained.
- Group | Embedix::ECD::Group
-
Their purpose is to establish a hierarchy of components under meaningful subheadings such as "Server/WWW" or "System/Utilities". Their main use is as containers of other nodes.
- Component | Embedix::ECD::Component
-
A component node represents a package in the Embedix distribution.
- Option | Embedix::ECD::Option
-
An option node is almost always contained under a component node. The purpose of an option is to provide a point of configurability for a package.
- Autovar | Embedix::ECD::Autovar
-
What exactly is this?
Accessing Child Nodes
The following are accessor methods for child nodes.
- $child_ecd = $ecd->getChild($name)
-
This returns a child node with the given $name or undef if no such child exists.
- $child_ecd = $ecd->n($name)
-
n() is an alias for getChild(). "n" stands for "node" and is a lot easier to type than "getChild".
$ecd->n('System')
->n('Utilities')
->n('busybox')
->n('long-ass-option-name-with-redundant-information');
- $ecd->addChild($obj)
-
This adds a child to the current node.
- @child_ecd = $ecd->getChildren()
-
This returns a list of all child nodes.
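As a rough sketch, getChildren() lends itself to a recursive walk of the tree. The Sketch::Tree package below is a hypothetical stand-in that only mimics name() and getChildren() so that the example runs without Embedix::ECD installed; outline() is likewise a name invented for this example. With the real module one would pass in a node obtained from one of the "newFrom" constructors.

```perl
use strict;
use warnings;

# Hypothetical stand-in that mimics name() and getChildren().
package Sketch::Tree;

sub new {
    my ($class, $name, @children) = @_;
    return bless { name => $name, children => [@children] }, $class;
}
sub name        { return $_[0]{name} }
sub getChildren { return @{ $_[0]{children} } }

package main;

# Return one indented line per node, visiting the tree depth first.
sub outline {
    my ($node, $depth) = @_;
    $depth = 0 unless defined $depth;
    my @lines = (('  ' x $depth) . $node->name);
    push @lines, outline($_, $depth + 1) for $node->getChildren;
    return @lines;
}

my $tree = Sketch::Tree->new('System',
    Sketch::Tree->new('Utilities',
        Sketch::Tree->new('busybox'),
        Sketch::Tree->new('tinylogin'),
    ),
);

print "$_\n" for outline($tree);
```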
Accessing Child Nodes via AUTOLOAD
The name of a node can be used as a method. This is what makes it possible to say something like:
my $busybox = $ecd->System->Utilities->busybox;
and get back the Embedix::ECD::Component object that contains the information for the busybox package. "System", "Utilities", and "busybox" are not predefined methods in Embedix::ECD or any of its subclasses, so they are delegated to the AUTOLOAD method. The AUTOLOAD method will try to find a child with the same name as the undefined method and it will return it if found.
I have not yet decided whether the AUTOLOAD should die when a child is not found. Currently undef is returned in this situation.
One annoyance is that many nodes have names with "-" in them. These cannot be AUTOLOADed, because method names may not have a "-" in perl. When accessing such nodes, use the getChild() method.
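The delegation described in this section can be sketched with a small stand-alone class. Sketch::Node is a hypothetical stand-in written for this example; it is not the code Embedix::ECD actually uses, and it keeps its children in a plain hash rather than a Tie::IxHash.

```perl
use strict;
use warnings;

package Sketch::Node;    # hypothetical stand-in, not Embedix::ECD itself

our $AUTOLOAD;

sub new {
    my ($class, %attr) = @_;
    return bless { name => $attr{name}, child => {} }, $class;
}

sub addChild {
    my ($self, $node) = @_;
    $self->{child}{ $node->{name} } = $node;
    return $self;
}

sub getChild {
    my ($self, $name) = @_;
    return $self->{child}{$name};    # undef when there is no such child
}

# Any method that is not defined above is treated as a child name.
sub AUTOLOAD {
    my ($self) = @_;
    (my $name = $AUTOLOAD) =~ s/.*:://;    # strip the package qualifier
    return if $name eq 'DESTROY';          # never treat DESTROY as a child
    return $self->getChild($name);
}

package main;

my $root    = Sketch::Node->new(name => 'root');
my $system  = Sketch::Node->new(name => 'System');
my $busybox = Sketch::Node->new(name => 'busybox');
my $option  = Sketch::Node->new(name => 'keep-bb-sed');
$root->addChild($system);
$system->addChild($busybox);
$busybox->addChild($option);

# Plain names resolve through AUTOLOAD; hyphenated names need getChild().
print $root->System->busybox->{name}, "\n";
print $root->System->busybox->getChild('keep-bb-sed')->{name}, "\n";
```

Returning early for DESTROY matters in any AUTOLOAD-based design, because perl calls DESTROY on every object that goes out of scope.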
Attributes
If nodes are objects, then attributes are a node's instance variables. All attributes may be single-valued or aggregate-valued. Single-valued attributes are non-reference scalar values, and aggregate attributes are non-reference scalar values enclosed within an arrayref.
A single valued attribute:
my $bbsed = $busybox->n('Misc-utilities')->n('keep-bb-sed');
$bbsed->provides('sed');
The same attribute as an aggregate:
$bbsed->provides([ 'sed' ]);
Semantically, these are equivalent. The main difference one will notice is cosmetic. When the toString() method is called, the single-valued one will look like:
PROVIDES=sed
and the aggregate valued provides will look like:
<PROVIDES>
sed
</PROVIDES>
Again, these two expressions mean the same thing. An aggregate of one is interpreted just as if it were a single value.
Aggregates become useful when an attribute needs to have a list of values.
$busybox->n('compile-time-features')->n('enable-bb-feature-use-inittab')->requires ([
'keep-bb-init',
'inittab',
'/bin/sh',
]);
This will be rendered by toString() as
<REQUIRES>
keep-bb-init
inittab
/bin/sh
</REQUIRES>
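The rendering rule illustrated above can be sketched as a small function. render_attribute() is a hypothetical helper written for this example and is not part of Embedix::ECD. The upper-casing of the attribute name follows the samples above; the indentation of values inside the tag is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: a plain scalar becomes NAME=value,
# an arrayref becomes a tag-enclosed block with one value per line.
sub render_attribute {
    my ($name, $value, $shiftwidth) = @_;
    $shiftwidth = 4 unless defined $shiftwidth;    # assumed indentation
    my $tag = uc $name;
    return "$tag=$value\n" unless ref($value) eq 'ARRAY';
    my $pad = ' ' x $shiftwidth;
    return join '', "<$tag>\n", (map { "$pad$_\n" } @$value), "</$tag>\n";
}

print render_attribute(provides => 'sed');
print render_attribute(requires => [ 'keep-bb-init', 'inittab', '/bin/sh' ]);
```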
There are accessors for attributes that work like your typical perl getters and setters. That is, when called without a parameter, the method behaves as a getter. When called with a parameter, the method behaves as a setter and the value of the parameter is assigned to the attribute.
Get
my $name = $busybox->name();
Set
$busybox->name('busybox');
Accessors For Single-Valued Attributes
These are accessors for attributes that are typically single-valued.
- $ecd->name()
-
This is the name of the node.
- $ecd->type()
-
This is the type of the node. This is usually (always?) seen in the context of an option and it can contain values such as "bool", "int", "int.hex", "string", and "tridep".
- $ecd->value()
-
This is the value of a node which must be something appropriate for its type.
- $ecd->default_value()
-
This is the value taken by the node if value is not defined.
- $ecd->range()
-
For the numerical types, it may be desirable to limit the range of values that may be assigned such that value() will always be meaningful. The use of this attribute has only been observed in linux.ecd.
- $ecd->help()
-
This often contains prose regarding the current node. I think it would be nice if it were possible to use an alternative form of mark-up language inside these sections. (HTML, for instance).
- $ecd->prompt()
-
The value in prompt is used in TargetWizard to pose a question to the user regarding whether he/she wants to enable an option or not.
- $ecd->srpm()
-
This contains the name of the source RPM sans version information and the file extension. This attribute almost always has the same value as name().
- $ecd->specpatch()
-
This attribute is only meaningful within the context of a component. Specpatches are applied to .spec files just prior to the building of a component. They are often used to configure the compilation of a component. The busybox package provides a good example of this in action.
- $ecd->static_size()
-
This is the sum of .text, .data, and .bss for an option and/or component.
- $ecd->min_dynamic_size()
-
The very least a program will malloc() during its execution.
- $ecd->storage_size()
-
This is the amount of space this component and/or option would consume on a filesystem.
- $ecd->startup_time()
-
The amount of time (in what metric?) from the time a program is executed up to the point in time when the program becomes useful.
- $ecd->requiresexpr()
-
This contains a C-like expression describing node dependencies.
- $ecd->if()
-
I didn't know if using a keyword as a method name would be legal, but apparently it is. I also wonder if more than one 'if' statement is allowed per node.
Accessing Aggregate Attributes
The following are attributes that frequently contain aggregate values. When setting attributes with aggregate values, enclose the values within an arrayref.
- $ecd->build_vars()
-
This specifies a list of transformations that can be applied to a .spec file prior to building.
- $ecd->provides()
-
This is a list of symbolic names that a node is said to be able to provide. For example, grep in busybox provides grep. GNU/grep also provides grep. According to TargetWizard, these two cannot coexist on the same instance of an Embedix distribution, because they both provide grep.
- $ecd->requires()
-
This is a list of libraries, files, provides, and other nodes required by the current node.
- $ecd->keeplist()
-
This is a list of files and directories provided by a component or option.
- $ecd->choicelist()
-
This is used for options in the kernel.
- $ecd->trideps()
-
This is used for options in the kernel.
Accessors That Take Named Attributes
The most general kind of accessor takes the name of an attribute as a parameter and gets or sets it.
- $val = $ecd->getAttribute($name)
-
This gets the attribute called $name.
- $ecd->setAttribute($name, $value)
-
This sets the attribute called $name to $value.
Utility Methods
- $string = $ecd->toString(indent => 0, shiftwidth => 4)
-
This will render an $ecd object as ASCII in ECD format. JavaScript programmers may find this familiar. An interesting deviation from the JavaScript version of toString() is that this one will accept optional parameters that allow one to control the rendering options.
- indent
-
This is the number of spaces the first level nodes should be indented. The default value is 0.
- shiftwidth
-
This is the number of spaces a nested item should be indented. The default value is 4.
- $ecd->mergeWith($the_other_ecd)
-
This combines the information contained in $the_other_ecd with $ecd. In the event that there is conflicting information, the information in $the_other_ecd takes precedence over what already existed in $ecd.
- $depth = $ecd->getDepth()
-
This method returns how many levels deep one is in the object tree. The root level is considered 0.
- $name = $ecd->getNodeClass()
-
This returns the node class (i.e. Group, Component, Option, or Autovar) of an Embedix::ECD object. It differs from the ref() operator in that the string "Embedix::ECD::" is omitted from the returned value.
- $opt_hash_ref = $ecd->getFormatOptions(@opt);
-
This is used internally by implementations of toString() to compute and return spacing information based on the formatting parameters passed to it.
- $string = $ecd->attributeToString($opt_hash_ref);
-
This is used internally by implementations of toString() to render a node's attributes.
CLASS VARIABLES
You shouldn't be touching these. This is just here for your information.
- Embedix::ECD::__grammar
-
This scalar contains the grammar for ECD files.
- Embedix::ECD::__parser
-
This contains an instance of Parse::RecDescent.
DIAGNOSTICS
- $line: was expecting $TAGNAME, but found $CRAP instead.
-
This error occurs whenever an imbalanced tag is found.
- $line: $ATTRIBUTE not allowed in $NODE_TYPE
-
not implemented
BUGS
This parser becomes exponentially slower as the size of ECD data increases. busybox.ecd takes 30 seconds to parse. Don't even try to parse linux.ecd -- it will sit there for hours just sucking CPU before it ultimately fails and gives you back nothing.
I don't know if there's anything I can do about it.
COPYRIGHT
Copyright (c) 2000 John BEPPU. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
AUTHOR
John BEPPU <beppu@lineo.com>
SEE ALSO
-
ecdlib.py(3), config2ecd(1), tw(1)
-
Embedix::ECD::XMLv1(3pm)
- CML2
-
The Configuration Menu Language is a constraint-based language developed by Eric Raymond in an attempt to simplify the process of configuring the Linux kernel.
http://www.tuxedo.org/~esr/kbuild/
- CDL
-
The Component Description Language was developed by Cygnus to support configurable compilation for the eCos operating system.
http://sourceware.cygnus.com/ecos/
- the latest version
-
http://opensource.lineo.com/cgi-bin/cvsweb/pm/Embedix/ECD/