NAME
MaxMind::DB::Writer::Tree - Tree representing a MaxMind DB database in memory - then write it to a file
VERSION
version 0.200003
SYNOPSIS
use MaxMind::DB::Writer::Tree;
my %types = (
    color => 'utf8_string',
    dogs  => [ 'array', 'utf8_string' ],
    size  => 'uint16',
);
my $tree = MaxMind::DB::Writer::Tree->new(
    ip_version            => 6,
    record_size           => 24,
    database_type         => 'My-IP-Data',
    languages             => ['en'],
    description           => { en => 'My database of IP data' },
    map_key_type_callback => sub { $types{ $_[0] } },
);
$tree->insert_network(
    '2001:db8::/48',
    {
        color => 'blue',
        dogs  => [ 'Fido', 'Ms. Pretty Paws' ],
        size  => 42,
    },
);
# Fail loudly if the output file cannot be opened.
open my $fh, '>:raw', '/path/to/my-ip-data.mmdb'
    or die "Cannot open output file: $!";
$tree->write_tree($fh);
DESCRIPTION
This is the main class you'll use to write MaxMind DB database files. This class represents the database in memory. Once you've created the full tree, you can write it to a file.
API
This class provides the following methods:
MaxMind::DB::Writer::Tree->new()
This creates a new tree object. The constructor accepts the following parameters:
ip_version
The IP version for the database. It must be 4 or 6.
This parameter is required.
record_size
This is the record size in bits. This should be one of 24, 28, 32 (in theory any number divisible by 4 up to 128 will work but the available readers all expect 24-32).
This parameter is required.
database_type
This is a string containing the database type. This can be anything, really. MaxMind uses strings like "GeoIP2-City", "GeoIP2-Country", etc.
This parameter is required.
languages
This should be an array reference of languages used in the database, like "en", "zh-TW", etc. This is useful as metadata for database readers and end users.
This parameter is optional.
description
This is a hashref where the keys are language names and the values are descriptions of the database in that language. For example, you might have something like:
{ en => 'My IP data', fr => q{Mon Data d'IP}, }
This parameter is required.
map_key_type_callback
This is a subroutine reference that is called in order to determine how to store each value in a map (hash) data structure. See "DATA TYPES" below for more details.
This parameter is required.
merge_record_collisions
By default, when an insert collides with a previous insert, the new data simply overwrites the old data where the two networks overlap.
If this is set to true, then on a collision, the writer will merge the old data with the new data. The merge strategy employed is controlled by the merge_strategy attribute, described below.
This parameter is optional. It defaults to false unless merge_strategy is set to something other than none.
This parameter is deprecated. New code should just set merge_strategy directly.
merge_strategy
Controls what merge strategy is employed.
none
No merging will be done. merge_record_collisions must either be not set or set to false.
toplevel
If both data structures are hashrefs then the data from the top level keys in the new data structure are copied over to the existing data structure, potentially replacing any existing values for existing keys completely.
recurse
Recursively merges the new data structure with the old data structure. Hash values and array elements are either - in the case of simple values - replaced with the new values, or - in the case of complex structures - have their values recursively merged.
For example if this data is originally inserted for an IP range:
{
    families => [
        {
            husband => 'Fred',
            wife    => 'Wilma',
        },
    ],
    year => 1960,
}
And then this subsequent data is inserted for a range covered by the previous IP range:
{
    families => [
        {
            wife  => 'Wilma',
            child => 'Pebbles',
        },
        {
            husband => 'Barney',
            wife    => 'Betty',
            child   => 'Bamm-Bamm',
        },
    ],
    company => 'Hanna-Barbera Productions',
}
Then querying within the range will produce the results:
{
    families => [
        {
            husband => 'Fred',
            wife    => 'Wilma',    # note replaced value
            child   => 'Pebbles',
        },
        {
            husband => 'Barney',
            wife    => 'Betty',
            child   => 'Bamm-Bamm',
        },
    ],
    year    => 1960,
    company => 'Hanna-Barbera Productions',
}
In all merge strategies, attempting to merge two differing data structures causes an exception.
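To make the difference between toplevel and recurse concrete, the following plain-Perl sketch applies the toplevel rule described above to the same example data. It is only an illustration of the documented behavior, not the writer's own merge code, and the variable names are made up for this example.

```perl
use strict;
use warnings;

# The data originally inserted for the range (from the example above).
my $old = {
    families => [ { husband => 'Fred', wife => 'Wilma' } ],
    year     => 1960,
};

# The data inserted afterwards for a range covered by the first one.
my $new = {
    families => [
        { wife    => 'Wilma',  child => 'Pebbles' },
        { husband => 'Barney', wife  => 'Betty', child => 'Bamm-Bamm' },
    ],
    company => 'Hanna-Barbera Productions',
};

# Illustration of a top-level merge: each top-level key in $new replaces
# the same key in $old wholesale. Nothing below the top level is combined.
my %merged = ( %{$old}, %{$new} );

# "year" survives from $old and "company" is added from $new, but
# "families" is simply the new array, so Fred is no longer present.
print join( ', ', sort keys %merged ), "\n";
print exists $merged{families}[0]{husband} ? "husband kept\n" : "husband dropped\n";
```

Under recurse, by contrast, the first element of families would keep husband => 'Fred', as shown in the result above.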
This parameter is optional. If merge_record_collisions is true, this defaults to toplevel; otherwise, it defaults to none.
alias_ipv6_to_ipv4
If this is true then the final database will map some IPv6 ranges to the IPv4 range. These ranges are:
::ffff:0:0/96
This is the IPv4-mapped IPv6 range
2001::/32
This is the Teredo range. Note that lookups for Teredo ranges will find the Teredo server's IPv4 address, not the client's IPv4.
2002::/16
This is the 6to4 range
When aliasing is enabled, insertions into the aliased locations will throw an exception. To insert an IPv4 address, insert it using IPv4 notation or insert directly into ::/96.
Aliased nodes are not followed when merging nodes. Only merges into the original IPv4 location, ::/96, will be followed.
This parameter is optional. It defaults to false.
remove_reserved_networks
If this is true, reserved networks will be removed from the database by write_tree() before the tree is written to the file handle. Reserved networks that are globally routable to an individual device, such as Teredo, are not removed. The default is true.
$tree->insert_network( $network, $data, $additional_args )
This method takes two required parameters and an optional third. The first is a network in CIDR notation. The second can be any Perl data structure (except a coderef, glob, or filehandle).
The $data payload is encoded according to the MaxMind DB database format spec. The short overview is that anything that can be encoded in JSON can be stored in an MMDB file. It can also handle unsigned 64-bit and 128-bit integers if they are passed as Math::UInt128 objects.
$additional_args is an optional hash reference containing additional arguments that change the behavior of the insert. Currently, the only supported argument is force_overwrite. This causes the object-wide merge_record_collisions setting to be ignored for the insert, causing $data to overwrite any existing data for the network.
Insert Order, Merging, and Overwriting
Depending on whether or not you set merge_record_collisions to true in the constructor, the order in which you insert networks will affect the final tree output.
When merge_record_collisions is false, the last insert "wins". This means that if you insert 1.2.3.255/32 and then 1.2.3.0/24, the data for 1.2.3.0/24 will overwrite the data you previously inserted for 1.2.3.255/32. On the other hand, if you insert 1.2.3.255/32 last, then the tree will be split so that the 1.2.3.0 - 1.2.3.254 range has different data than 1.2.3.255.
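In this scenario, if you want to make sure that no data is overwritten, then you need to sort your input by network prefix length, so that larger networks (shorter prefixes) are inserted before the more specific networks they contain.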
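One way to do this sorting is sketched below. The record layout (an array reference holding a CIDR string and a data hashref) and the prefix_length helper are assumptions made for this example; they are not part of this module.

```perl
use strict;
use warnings;

# Hypothetical input: [ network in CIDR notation, data hashref ] pairs,
# deliberately listed with the most specific network first.
my @records = (
    [ '1.2.3.255/32', { color => 'red' } ],
    [ '1.2.3.0/24',   { color => 'blue' } ],
    [ '1.2.0.0/16',   { color => 'green' } ],
);

# Return the prefix length (the number after the slash) of a CIDR string.
sub prefix_length {
    my ($network) = @_;
    my ($length) = $network =~ m{/(\d+)\z}
        or die "Not in CIDR notation: $network";
    return $length;
}

# Shortest prefix (largest network) first, so that more specific networks
# are inserted last and are not overwritten by a covering network.
my @sorted
    = sort { prefix_length( $a->[0] ) <=> prefix_length( $b->[0] ) } @records;

print "$_->[0]\n" for @sorted;
```

Each sorted record can then be passed to insert_network() in turn, for example with $tree->insert_network( @{$record} ).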
When merge_record_collisions is true, then regardless of insert order, the 1.2.3.255/32 network will end up with its data plus the data provided for the 1.2.3.0/24 network, while 1.2.3.0 - 1.2.3.254 will have the expected data. This can be disabled on a per-insert basis by using the force_overwrite argument when inserting a network as discussed above.
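The following sketch shows how merging and force_overwrite might be combined. It uses only the constructor parameters and insert_network() arguments documented above; the sub name, networks, and values are made up for illustration, and the module is loaded at run time inside the sub so that the file compiles even where the module is not installed.

```perl
use strict;
use warnings;

my %types = ( color => 'utf8_string', size => 'uint16' );

# Hypothetical helper that builds a small IPv4 tree with merging enabled.
sub build_example_tree {
    # Loaded at run time; this is a choice made for the sketch only.
    require MaxMind::DB::Writer::Tree;

    my $tree = MaxMind::DB::Writer::Tree->new(
        ip_version            => 4,
        record_size           => 24,
        database_type         => 'My-IP-Data',
        languages             => ['en'],
        description           => { en => 'My database of IP data' },
        map_key_type_callback => sub { $types{ $_[0] } },
        merge_strategy        => 'toplevel',
    );

    # These two inserts collide at 1.2.3.255 and are merged there.
    $tree->insert_network( '1.2.3.0/24',   { color => 'blue' } );
    $tree->insert_network( '1.2.3.255/32', { size  => 42 } );

    # This insert skips merging and replaces whatever was stored
    # for 1.2.3.255/32.
    $tree->insert_network(
        '1.2.3.255/32',
        { color => 'red' },
        { force_overwrite => 1 },
    );

    return $tree;
}
```

Going by the description above, 1.2.3.255 should hold both color and size after the second insert, and only color => 'red' after the third.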
$tree->insert_range( $first_ip, $last_ip, $data, $additional_args )
This method is similar to insert_network(), except that it takes an IP range rather than a network. The first parameter is the first IP address in the range. The second is the last IP address in the range. The third is a Perl data structure containing the data to be inserted. The final parameter is a hash reference of additional arguments, as outlined for insert_network().
$tree->remove_network( $network )
This method removes the network from the database. It takes one parameter, the network in CIDR notation.
$tree->write_tree($fh)
Given a filehandle, this method writes the contents of the tree as a MaxMind DB database to that filehandle.
$tree->iterate($object)
This method iterates over the tree by calling methods on the passed object. The object must have at least one of the following three methods: process_empty_record, process_node_record, process_data_record.
The iteration is done in depth-first order, which means that it visits each network in order.
Each method on the object is called with the following positional parameters:
The node number as a 64-bit number.
A boolean indicating whether or not this is the right or left record for the node. True for right, false for left.
The first IP number in the node's network as a 128-bit number.
The prefix length for the node's network.
The first IP number in the record's network as a 128-bit number.
The prefix length for the record's network.
If the record is a data record, the final argument will be the Perl data structure associated with the record.
The record's network is what matches with a given data structure for data records.
For node (and alias) records, the final argument will be the number of the node that this record points to.
For empty records, there are no additional arguments.
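As an illustration of the calling convention described above, here is a minimal visitor object that counts data and empty records. The class name My::RecordCounter and its counting logic are hypothetical; only the method names and the argument order come from the description above.

```perl
use strict;
use warnings;

# Hypothetical visitor class for use with iterate().
package My::RecordCounter;

sub new {
    my ($class) = @_;
    return bless { data => 0, empty => 0, last_data => undef }, $class;
}

# Called for data records: the six positional parameters listed above,
# followed by the Perl data structure associated with the record.
sub process_data_record {
    my (
        $self,                 $node_number,
        $is_right,             $node_ip,
        $node_prefix_length,   $record_ip,
        $record_prefix_length, $data,
    ) = @_;
    $self->{data}++;
    $self->{last_data} = $data;
    return;
}

# Called for empty records: only the six positional parameters.
sub process_empty_record {
    my ($self) = @_;
    $self->{empty}++;
    return;
}

package main;

my $counter = My::RecordCounter->new;

# With a populated tree, the visitor would be used like this:
#   $tree->iterate($counter);
#   print "data records: $counter->{data}\n";
```

Because process_node_record is not defined here, this sketch relies on the statement above that the object only needs at least one of the three methods.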
$tree->freeze_tree($filename)
Given a file name, this method freezes the tree to that file. Unlike the write_tree() method, this method does not write out a MaxMind DB file. Instead, it writes out something that can be quickly thawed via the MaxMind::DB::Writer::Tree->new_from_frozen_tree constructor. This is useful if you want to pass the in-memory representation of the tree between processes.
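A freeze/thaw round trip might look like the sketch below. The sub name, the type table, and the way the file name is passed in are assumptions for this example; the two method calls use only the parameters documented in this file. The module is loaded at run time inside the sub so that the file compiles where the module is not installed.

```perl
use strict;
use warnings;

my %types = ( color => 'utf8_string' );

# The callback must be supplied again when thawing, because subroutine
# references are not stored in the frozen file.
my $callback = sub { $types{ $_[0] } };

# Hypothetical helper: $tree is an existing MaxMind::DB::Writer::Tree
# object and $filename is a path chosen by the caller.
sub freeze_then_thaw {
    my ( $tree, $filename ) = @_;

    # Loaded at run time; this is a choice made for the sketch only.
    require MaxMind::DB::Writer::Tree;

    $tree->freeze_tree($filename);

    return MaxMind::DB::Writer::Tree->new_from_frozen_tree(
        filename              => $filename,
        map_key_type_callback => $callback,
    );
}
```

In practice the freeze and the thaw would usually happen in different processes; they are combined here only to keep the sketch short.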
$tree->ip_version()
Returns the tree's IP version, as passed to the constructor.
$tree->record_size()
Returns the tree's record size, as passed to the constructor.
$tree->merge_record_collisions()
Returns a boolean indicating whether the tree will merge colliding records, as determined by the merge strategy.
$tree->merge_strategy()
Returns the merge strategy used when two records collide.
$tree->map_key_type_callback()
Returns the callback used to determine the type of a map's values, as passed to the constructor.
$tree->database_type()
Returns the tree's database type, as passed to the constructor.
$tree->languages()
Returns the tree's languages, as passed to the constructor.
$tree->description()
Returns the tree's description hashref, as passed to the constructor.
$tree->alias_ipv6_to_ipv4()
Returns a boolean indicating whether the tree will alias some IPv6 ranges to their corresponding IPv4 ranges when the tree is written to disk.
MaxMind::DB::Writer::Tree->new_from_frozen_tree()
This method constructs a tree from a file containing a frozen tree.
This method accepts the following parameters:
filename
The filename containing the frozen tree.
This parameter is required.
map_key_type_callback
This is a subroutine reference that is called in order to determine how to store each value in a map (hash) data structure. See "DATA TYPES" below for more details.
This needs to be passed because subroutine references cannot be reliably serialized and restored between processes.
This parameter is required.
database_type
Override the database_type of the frozen tree. This accepts a string of the same form as the new() constructor.
This parameter is optional.
description
Override the description of the frozen tree. This accepts a hashref of the same form as the new() constructor.
This parameter is optional.
merge_strategy
Override the merge_strategy setting for the frozen tree.
This parameter is optional.
Caveat for Freeze/Thaw
The frozen tree is more or less the raw C data structures written to disk. As such, it is very much not portable, and your ability to thaw a tree on a machine not identical to the one on which it was written is not guaranteed.
In addition, there is no guarantee that the freeze/thaw format will be stable across different versions of this module.
DATA TYPES
The MaxMind DB file format is strongly typed. Because Perl is not strongly typed, you will need to explicitly specify the types for each piece of data. Currently, this class assumes that your top-level data structure for an IP address will always be a map (hash). You can then provide a map_key_type_callback subroutine that will be called as the data is serialized. This callback is given a key name and is expected to return that key's data type.
Let's use the following structure as an example:
{
    names => {
        en => 'United States',
        es => 'Estados Unidos',
    },
    population    => 319_000_000,
    fizzle_factor => 65.7294,
    states        => [ 'Alabama', 'Alaska', ... ],
}
Given this data structure, our map_key_type_callback might look something like this:
my %types = (
    names         => 'map',
    en            => 'utf8_string',
    es            => 'utf8_string',
    population    => 'uint32',
    fizzle_factor => 'double',
    states        => [ 'array', 'utf8_string' ],
);
sub {
    my $key = shift;
    # Look the key up in the %types table declared above.
    return $types{$key};
}
If the callback returns undef, the serialization code will throw an error. Note that for an array we return a 2 element arrayref where the first element is 'array' and the second element is the type of content in the array.
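Since an undef return value leads to an error during serialization, it can help to fail earlier with a message that names the offending key. The variant below is one possible sketch of that idea; the error message and the decision to croak inside the callback are choices made for this example, not behavior of this module.

```perl
use strict;
use warnings;
use Carp qw( croak );

# Same type table as in the example above.
my %types = (
    names         => 'map',
    en            => 'utf8_string',
    es            => 'utf8_string',
    population    => 'uint32',
    fizzle_factor => 'double',
    states        => [ 'array', 'utf8_string' ],
);

# Return the declared type for a key, or die with a message naming the
# key if no type has been declared for it.
my $callback = sub {
    my ($key) = @_;
    return $types{$key} // croak "No type defined for map key '$key'";
};

print $callback->('population'), "\n";
```

The resulting $callback can be passed as map_key_type_callback to the constructor in the usual way.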
The valid types are:
utf8_string
uint16
uint32
uint64
uint128
int32
double
64 bits of precision.
float
32 bits of precision.
boolean
map
array
AUTHORS
Olaf Alders <oalders@maxmind.com>
Greg Oschwald <goschwald@maxmind.com>
Dave Rolsky <drolsky@maxmind.com>
Mark Fowler <mfowler@maxmind.com>
COPYRIGHT AND LICENSE
This software is copyright (c) 2016 by MaxMind, Inc.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.