NAME
Judy - Library for creating and accessing dynamic arrays
SEE ALSO
- http://judy.sourceforge.net - the C library home page
- Judy::1 - maps an integer to a bit
- Judy::L - maps an integer to an integer/pointer
- Judy::SL - maps a null terminated string to an integer/pointer
- Judy::HS - maps a string to an integer/pointer
DESCRIPTION
The Judy family of functions supports fully dynamic arrays. These arrays may be indexed by a 32- or 64-bit word (depending on processor word size), a null terminated string or an ordinary perl string.
A Word_t is a typedef unsigned long int in Judy.h and must be the same size as sizeof(void *) I.E. a pointer. This is implemented as the perl type IV.
For those rare-ducks where sizeof(IV)!=sizeov(Word_t), you can still use Judy. You'll be warned if you ever accidentally attempt to use something that requires data loss during truncation.
- Judy::1 functions:
-
This can be thought of as a huge bitmap. For a comparison between Judy::1 and "vec" in perlfunc, take a look at http://perlmonks.org/?node_id=732843.
- Judy::L functions:
-
Maps an integer/pointer to another integer/pointer. This is sort of like a very compact perl hash where the only allowed keys and values are integers. JudySL and JudyHL are based on JudyL.
- Judy::SL functions:
-
Maps null terminated strings to integers/pointers. The map is searchable and may be enumerated.
- Judy::HS functions:
-
Maps strings to integers/pointers. The map has no native support for searching and enumeration. See http://perlmonks.org/?node_id=733140 for an example program which enumerates the contents of JudyHS.
new addition (May 2004) to Judy is a hybrid using the best features of hashing and Judy methods. The C library author believes JudyHS is a good replacement for a hashing method when resizing the hash table is done during population growth. A correctly tuned hash method with a static hash table size and population is unbeatable for speed. However, L<Judy::HS> will perform better than a hashing method with smaller and larger populations than the optimum hash table size. JudyHS does not have a degenerate performance case where knowledge of the hash algorithm can be exploited. (I.E. JudyHS does not use a linked list to handle hash collisions, it uses a tree of JudyL arrays and a virtual hash table size of 4 billion).
Judy arrays are both speed- and memory-efficient, with no tuning or configuration required, across a wide range of key set types (sequential, periodic, clustered, random). Judy's speed and memory usage are typically better than other data storage models such as skiplists, linked lists, binary, ternary, b-trees, or even hashing, and improves with very large data sets.
A Judy array is created merely by defining a null pointer and then storing (inserting) the first element into the array under that pointer. The memory used by a Judy array is nearly proportional to the population (number of elements).
Since an initial (empty) Judy array is represented by a null pointer, it is possible to construct an array of Judy arrays. In other words, a Judy array's Values (except Judy::1) can be pointers to other Judy arrays. This makes it very simple to construct an array with an arbitrary number of dimensions or Index sizes. (JudySL and JudyHS are implemented using JudyL this way).
- A 10 MINUTE TECHNICAL DESCRIPTION
-
may be found at http://judy.sourceforge.net/downloads/10minutes.htm
- A 3 HOUR TECHNICAL DESCRIPTION (out of date and a bit corny)
-
may be found at http://judy.sourceforge.net/application/shop_interm.pdf
CONSTANTS
JLAP_INVALID
PJERR
Multi-dimensional Judy::L/Judy::SL/Judy::HS Arrays
Storing a pointer to another Judy::L array in a Judy::L array's Value is a simple way to support dynamic multi-dimensional Judy::L arrays. These arrays (or trees) built using Judy::L arrays are very fast and memory efficient. (In fact, that is how JudySL and JudyHS are implemented). An arbitrary number of dimensions can be realized this way. To terminate the number of dimensions (or tree), the Value pointer is marked to NOT point to another Judy array. A Judy::JLAP_INVALID
flag is used in the least significant bit(s) of the pointer. After the flag Judy::JLAP_INVALID
is removed, it is used as a pointer to the users data.
Note: The current version of Judy.h changed this flag from 0x4 to 0x1 to allow for a malloc() that does not deliver memory on an 8 byte aligned boundry (such as old v algrind).
The following example code segment can be used to dive into a multi-dimensional Judy::L using an API similar to Data::Diver. This makes a Judy::HS object and looks past the public API as an example of a multi-dimensional Judy::* structure.
# For kicks, allocate a Judy::HS object and look inside it a
# little bit.
use Judy::HS;
Judy::HS::Set( my ($judy), 'abcd', 42 );
Dive( $judy, 4 );
use Judy qw( JLAP_INVALID );
use Judy::L qw( Get );
sub Dive {
my ( $judy, @walk ) = @_;
my ( $pvalue, $value );
for my $key ( @walk ) {
return if ! $judy;
# Advance to next dimension.
( $pvalue, $value ) = Get( $judy, $key );
# Check if pointer to user buffer
last if $value & JLAP_INVALID;
$judy = $value;
}
if ( $value & JLAP_INVALID ) {
# Remove our flag.
$value &= ~ JLAP_INVALID;
# Return the value.
printf "User object pointer is 0x%x at 0x%x\n", $value, $pvalue;
}
else {
warn sprintf "Judy::* object pointer is 0x%x at 0x%x\n", $value, $pvalue;
}
return ( $pvalue, $value );
}
Note: This works because malloc() guarantees to return a pointer with the least bit(s) == 0x0. You must remove JLAP_INVALID before using the pointer.
FILES
Locations of interest include: http://sourceforge.net/projects/judy -- project downloads file:/usr/share/doc/Judy/ -- for HTML version of man pages. /usr/share/doc/Judy/demo/ -- demonstration program source files.
The author attempted to write interesting application notes using advanced features of Judy. They may be found at "http://judy.sourceforge.net/application/ (Some may be out of date).
ERRORS & WARNINGS
File '%s', line %d: %s(), JU_ERRNO_* == %d, ID == %d
See the header file Judy.h from the Judy C source library. You already have a local copy of this to have been able to build this perl library.
(UV)%d truncated to (Word_t)%d
If your perl is compiled for 64 bit numbers but is running on a 32 bit machine, numbers which cannot be represented within 32 bits will be truncated to 32 bits and this warning will be raised.
BUGS
Please report any bugs or feature requests to bug-Judy-HS at rt.cpan.org
, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Judy-HS. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.
SUPPORT
You can find documentation for this module with the perldoc command.
perldoc Judy
You can also look for information at:
RT: CPAN's request tracker
AnnoCPAN: Annotated CPAN documentation
CPAN Ratings
Search CPAN
ACKNOWLEDGEMENTS
Doug Baskins, totally.
Michael Schwern for writing Alien::SVN which made this possible.
Tye McQueen for inspiring the minimal API.
Yitzchak Scott-Thoennes for reminding me that perl's magic requires extra care and attention.
COPYRIGHT & LICENSE
Copyright 2008 Joshua ben Jore, all rights reserved.
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
SOURCE AVAILABILITY
This source is in Github: git://github.com/jbenjore/judy-hs.git
AUTHOR
Judy was invented by Doug Baskins (dougbaskins .AT, yahoo.com) and implemented by Hewlett-Packard. (Note: Judy is named for the inventor's sister, after discarding many proposed names.)
The perl wrapper was written by Joshua ben Jore