SYNOPSIS
Create a table for your tree data with the 7 special columns used by Tree::Mobius. By default, these columns are mobius_a mobius_b mobius_b and mobius_d (bigint), lft and rgt (double float) and is_inner (boolean). See the add_mobius_tree_columns method to change the default names.
CREATE TABLE employees (
name TEXT NOT NULL
mobius_a BIGINT unsigned,
mobius_b BIGINT unsigned,
mobius_c BIGINT unsigned,
mobius_d BIGINT unsigned,
lft DOUBLE unsigned NOT NULL DEFAULT '1',
rgt DOUBLE unsigned,
is_inner boolean NOT NULL DEFAULT '0',
);
In your Schema or DB class add Tree::Mobius in the component list.
__PACKAGE__->load_components(qw( Tree::Mobius ... ));
Call add_mobius_tree_columns.
package My::Employee;
__PACKAGE__->add_mobius_tree_columns();
That's it, now you can create and manipulate trees for your table.
#!/usr/bin/perl
use My::Employee;
my $big_boss = My::Employee->create({ name => 'Larry W.' });
my $boss = My::Employee->create({ name => 'John Doe' });
my $employee = My::Employee->create({ name => 'No One' });
$big_boss->attach_child( $boss );
$boss->attach_child( $employee );
DESCRIPTION
This module provides methods for working with trees of data using a Möbius encoding, a variant of 'Nested Intervals' tree encoding using continued fraction. This a model to represent hierarchical information in a SQL database that takes a complementary approach of both the 'Nested Sets' model and the 'Materialized Path' model.
The implementation has been heavily inspired by a Vadim Tropashko's paper available online at http://arxiv.org/pdf/cs.DB/0402051 about the Möbius encoding.
In general, a 'Nested Intervals' model has the same advantages that 'Nested Sets' over the 'Adjacency List', that is to say obtaining all descendants requires only one SQL query rather than several recursive queries.
Additionally, a 'Nested Intervals' model has two more advantages over 'Nested Sets' :
- Encoding is not volatile (no other node has to be relabeled whenever a new node is inserted in the database).
- There are no difficulties associated with querying ancestors.
The Möbius encoding is a particular encoding scheme of the 'Nested Intervals' model that uses integer numbers economically to allow better tree scaling and directly encode the material path of a node using continued fraction (thus this model also relates somewhat with the 'Materialized Path' model).
This implementation allows you to have several root trees and corresponding trees in your database.
To allow better performance and scaling, Tree::Mobius uses the same Möbius encoding for all leaves (non inner children) of a given node. A unique Möbius encoding is later calculated only if a node becomes 'inner' (having at least one descendant).
The encoding is not volatile, but the depth is constrained by the precision of SQL float type in the right and left column. The maximum depth reachable is 8 levels with a simple SQL FLOAT, and 21 with a SQL DOUBLE. The number of inner nodes of your trees are also constrained by the maximum integer allowed for mobius_a, mobius_b, mobius_c and mobius_d column (see CAVEATS AND LIMITATIONS).
Finally, a tradeoff of DBIx::Class::Tree::Mobius over other models is the non-economical use of 7 SQL columns to encode each node.
METHODS
add_mobius_tree_columns
Declare the name of the columns for tree encoding and add them to the schema.
None of these columns should be modified outside of this module.
Multiple trees are allowed in the same table, each tree will have a unique value in the mobius_a_column.
attach_child
Attach a new child to a node.
If the child already has descendants, the entire sub-tree is moved recursively.
insert
This method is an override of the DBIx::Class' method.
The method is not meant to not be used directly but it allows one to add a parent virtual column when calling the DBIx::Class method create.
This virtual column should be set with the primary key value of the parent.
My::Employee->create({ name => 'Another Intern', parent => $boss->id });
parent
Returns a DBIx::Class Row of the parent of a node.
children
Returns a DBIx::Class resultset of all children (direct descendants) of a node.
leaf_children
Returns a DBIx::Class resultset of all children (direct descendants) of a node that do not possess any child themselves.
inner_children
Returns a DBIx::Class resultset of all children (direct descendants) of a node that possess one or more child.
descendants
Returns a DBIx::Class resultset of all descendants of a node (direct or not).
leaves
Returns a DBIx::Class resultset of all descendants of a node that do not possess any child themselves.
inner_descendants
Returns a DBIx::Class resultset of all descendants of a node that possess one or more child.
ancestors
Returns a DBIx::Class resultset of all ancestors of a node.
ascendants
An alias method for ancestors.
root
Returns a DBIx::Class resultset containing the root ancestor of a given node.
siblings
Returns a DBIx::Class resultset containing all the nodes with the same parent of a given node.
is_root
Returns 1 if the node has no parent, and 0 otherwise.
is_inner
Returns 1 if the node has at least one child, and 0 otherwise.
is_branch
Returns 1 if the node has at least one child and is not a root node, 0 otherwise.
is_leaf
Returns 1 if the node has no child, and 0 otherwise.
available_mobius_index
Returns the smallest mobius index available in the subtree of a given node.
child_encoding
Given a mobius index, return the mobius a,b,c,d column values.
depth
Return the depth of a node in a tree (depth of a root node is 1).
make_root
Force a node to become a new tree root (if this node possess a subtree of descendants, it becomes a new tree).
CAVEATS AND LIMITATIONS
'left-right' maximum depth
All functions should work as expected until a tree reaches the 'left-right' maximum depth. That is to say 8 levels if you declared the two special columns 'lft' and 'rgt' as a SQL FLOAT, and 21 levels if you declared them as a SQL DOUBLE. In the default 'strict mode', the library will enforce this 'left-right' maximum and will die if you try to add a child deeper.
You may desactivate this check and allow Tree::Mobius to create nodes deeper than this maximum level.
__PACKAGE__->strict_mode( 0 );
In this relaxed mode, only the function 'children' and 'parent' will work correctly and you should not trust the results returned by 'descendants', 'leaves', 'inner_descendants', 'ancestors' for any node deeper than the maximum level.
Please also note that there is a bug in SQLite http://www.sqlite.org/src/tktview?name=1248e6cda8 that prevent use of more than 15 decimal precision FLOAT. A workaround consists of manually restricting the accuracy of float in Tree::Mobius after the add_mobius_tree_columns call :
Math::BigFloat->accuracy(15);
'mobius' maximum index
The Möbius representation (using 4 integers a,b,c,d) is limited by the maximum value of the type 'integer' of the corresponding columns in your SQL database. Specifically, this encoding only limits the number of inner nodes (nodes with at least one child) representable on the right side of the tree. The upper limit can be calculated using the least favorable Tree::Mobius inner node materialized path (with the highest index at each level, not counting leaves), either recursively or using matrix multiplication.
For example, a 4 levels depth tree with 5 inner nodes at level 1, 5 children inner nodes at level 2 attached to the rightmost level 1 node, and again 5 children inner nodes at level 3 attached to the rightmost level 2 node, the least favorable inner node materialized path is '5.5.5' The corresponding Tree::Mobius path is derived adding 2 to each index, thus '7.7.7'. The least favorable Möbius representation can now be calculated using the following matrix multiplication:
( 7 1 ) . ( 7 1 ) . ( 7 1 ) = ( 2549 357 )
( 1 0 ) ( 1 0 ) ( 1 0 ) ( 357 50 )
In our example, a=2549, b=357, c=357 and d=50 and all numbers are within the limits of the database INT or BIGINT type. Thus this tree can be encoded by Tree::Mobius.
Using this method, we can calculate the worst case inner node materialized path for the following inner node depth and the maximum value of a MySQL UNSIGNED INT, that is to say 4294967295.
- 2 levels : 1623.1623 (the 1623th level 1 inner node has maximum 1623th inner descendants)
- 3 levels : 253.253.253
- 4 levels : 82.82.82.82
- 5 levels : 38.38.38.38.38
- 6 levels : 21.21.21.21.21.21
- 7 levels : 13.13.13.13.13.13.13
The limits with MySQL UNSIGNED BIGINT (18446744073709551615) are :
- 3 levels : 2642243.2642243.2642243
- 4 levels : 65533.65533.65533.65533
- 5 levels : 7129.7129.7129.7129.7129
- 6 levels : 1623.1623.1623.1623.1623.1623
- 7 levels : 563.563.563.563.563.563
- 8 levels : 253.253.253.253.253.253.253.253
- 9 levels : 136.136.136.136.136.136.136.136.136
- 10 levels : 82.82.82.82.82.82.82.82.82
- 11 levels : 54.54.54.54.54.54.54.54.54.54.54
- 12 levels : 38.38.38.38.38.38.38.38.38.38.38.38
- 13 levels : 28.28.28.28.28.28.28.28.28.28.28.28.28
- 14 levels : 21.21.21.21.21.21.21.21.21.21.21.21.21.21
- 15 levels : 17.17.17.17.17.17.17.17.17.17.17.17.17.17.17
- 16 levels : 13.13.13.13.13.13.13.13.13.13.13.13.13.13.13.13
- 17 levels : 11.11.11.11.11.11.11.11.11.11.11.11.11.11.11.11.11
- 18 levels : 9.9.9.9.9.9.9.9.9.9.9.9.9.9.9.9.9.9
- 19 levels : 8.8.8.8.8.8.8.8.8.8.8.8.8.8.8.8.8.8.8
- 20 levels : 7.7.7.7.7.7.7.7.7.7.7.7.7.7.7.7.7.7.7.7
- 21 levels : 6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6.6
For all these scenarios, there are no constraint with the number of leaves at any level.
backward compatibility with experimental version
Finally, early testers should note that the encoding used since version 0.2000 is not compatible with the old encoding tested in experimental developper versions 0.00002_01 and 0.00001_04.
INTERNAL
The Möbius encoding (ax+b)/(cx+d) can represent a tree giving the following relationship between each parent node and it's nth child :
Parent encoding = [ Pa, Pb, Pc, Pd ]
Child encoding = [ Pa * n + Pc, Pb * n + Pd, Pa, Pb ]
Tree::Mobius can encode several trees using the convention that root nodes of these trees are in fact children of an abstract mathematic super root node (there will be no row in your database for it).
The Möbius represention of this super root node is (a, b, c, d) = ( 1, 0, 0, 1 )