NAME

MaxMind::DB::Writer::Tree - Tree representing a MaxMind DB database in memory - then write it to a file

VERSION

version 0.100001

SYNOPSIS

use MaxMind::DB::Writer::Tree;
use Net::Works::Network;

my $tree = MaxMind::DB::Writer::Tree->new(
    ip_version    => 6,
    record_size   => 24,
    database_type => 'My-IP-Data',
    languages     => ['en'],
    description   => { en => 'My database of IP data' },
);

my $network
    = Net::Works::Network->new_from_string( string => '8.23.0.0/16' );

$tree->insert_network(
    $network,
    {
        color => 'blue',
        dogs  => [ 'Fido', 'Ms. Pretty Paws' ],
        size  => 42,
    },
);

open my $fh, '>:raw', '/path/to/my-ip-data.mmdb'
    or die "Cannot open /path/to/my-ip-data.mmdb: $!";
$tree->write_tree($fh);

DESCRIPTION

This is the main class you'll use to write MaxMind DB database files. This class represents the database in memory. Once you've created the full tree, you can write it to a file.

API

This class provides the following methods:

MaxMind::DB::Writer::Tree->new()

This creates a new tree object. The constructor accepts the following parameters:

  • ip_version

    The IP version for the database. It must be 4 or 6.

    This parameter is required.

  • record_size

    This is the record size in bits. This should be one of 24, 28, or 32. In theory, any number divisible by 4 up to 128 will work, but the available readers all expect 24-32.

    This parameter is required.

  • database_type

    This is a string containing the database type. This can be anything, really. MaxMind uses strings like "GeoIP2-City", "GeoIP2-Country", etc.

    This parameter is required.

  • languages

    This should be an array reference of languages used in the database, like "en", "zh-TW", etc. This is useful as metadata for database readers and end users.

    This parameter is optional.

  • description

    This is a hashref where the keys are language names and the values are descriptions of the database in that language. For example, you might have something like:

    {
        en => 'My IP data',
        fr => 'Mon Data de IP',
    }

    This parameter is required.

  • map_key_type_callback

    This is a subroutine reference that is called in order to determine how to store each value in a map (hash) data structure. See "DATA TYPES" below for more details.

    This parameter is optional.

  • merge_record_collisions

    By default, when an insert collides with a previous insert, the new data simply overwrites the old data where the two networks overlap.

    If this is set to true, then on a collision, the writer will merge the old data with the new data. This only works if both inserts provide a hashref for the data payload.

    This parameter is optional. It defaults to false.

  • alias_ipv6_to_ipv4

    If this is true then the final database will map some IPv6 ranges to the IPv4 range. These ranges are:

    • ::ffff:0:0/96

      This is the IPv4-mapped IPv6 range

    • 2001::/32

      This is the Teredo range. Note that lookups for Teredo ranges will find the Teredo server's IPv4 address, not the client's IPv4.

    • 2002::/16

      This is the 6to4 range

    This parameter is optional. It defaults to false.
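
As a minimal sketch of the constraints listed above, the constructor arguments could be sanity-checked before building a tree. The check_tree_args helper and its error messages are hypothetical and not part of this module; it also restricts record_size to the sizes the available readers expect, which is stricter than the constructor may be.

```perl
use strict;
use warnings;

# Hypothetical helper: sanity-check constructor arguments before building a tree.
sub check_tree_args {
    my (%args) = @_;

    # Parameters documented above as required.
    for my $required (qw( ip_version record_size database_type description )) {
        die "Missing required parameter: $required\n"
            unless defined $args{$required};
    }

    die "ip_version must be 4 or 6\n"
        unless $args{ip_version} =~ /\A[46]\z/;

    # Restrict to the record sizes that the available readers expect.
    die "record_size should be 24, 28, or 32\n"
        unless grep { $args{record_size} == $_ } ( 24, 28, 32 );

    die "description must be a hashref\n"
        unless ref( $args{description} ) eq 'HASH';

    return %args;
}

my %args = check_tree_args(
    ip_version              => 6,
    record_size             => 28,
    database_type           => 'My-IP-Data',
    languages               => ['en'],
    description             => { en => 'My database of IP data' },
    merge_record_collisions => 1,
    alias_ipv6_to_ipv4      => 1,
);

# The checked arguments could then be passed on:
#   my $tree = MaxMind::DB::Writer::Tree->new(%args);
```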

$tree->insert_network( $network, $data )

This method expects two parameters. The first is a Net::Works::Network object. The second can be any Perl data structure (except a coderef, glob, or filehandle).

The $data payload is encoded according to the MaxMind DB database format spec. The short overview is that anything that can be encoded in JSON can be stored in an MMDB file. It can also handle unsigned 64-bit and 128-bit integers if they are passed as Math::UInt128 objects.

Insert Order, Merging, and Overwriting

Depending on whether or not you set merge_record_collisions to true in the constructor, the order in which you insert networks will affect the final tree output.

When merge_record_collisions is false, the last insert "wins". This means that if you insert 1.2.3.255/32 and then 1.2.3.0/24, the data for 1.2.3.0/24 will overwrite the data you previously inserted for 1.2.3.255/32. On the other hand, if you insert 1.2.3.255/32 last, then the tree will be split so that the 1.2.3.0 - 1.2.3.254 range has different data than 1.2.3.255.

In this scenario, if you want to make sure that no data is overwritten, then you need to sort your input by network prefix length, inserting the shortest prefixes (the largest networks) first and the longest prefixes last.

When merge_record_collisions is true, then regardless of insert order, the 1.2.3.255/32 network will end up with its data plus the data provided for the 1.2.3.0/24 network, while 1.2.3.0 - 1.2.3.254 will have the expected data.
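
The sorting advice for the non-merging case could be sketched roughly as follows, assuming the input is available as CIDR strings paired with data. The @input list and the prefix_length helper are hypothetical and exist only for illustration.

```perl
use strict;
use warnings;

# Hypothetical input: CIDR strings paired with the data to insert.
my @input = (
    [ '1.2.3.255/32', { color => 'red' } ],
    [ '1.2.3.0/24',   { color => 'blue' } ],
    [ '1.2.0.0/16',   { color => 'green' } ],
);

# Hypothetical helper: extract the prefix length from a CIDR string.
sub prefix_length {
    my ($cidr) = @_;
    my ($length) = $cidr =~ m{/(\d+)\z}
        or die "Not a CIDR string: $cidr\n";
    return $length;
}

# Shortest prefix (largest network) first, so that a more specific
# network inserted later splits the larger one instead of being overwritten.
my @sorted
    = sort { prefix_length( $a->[0] ) <=> prefix_length( $b->[0] ) } @input;

# Each entry could then be inserted in this order, for example:
#   $tree->insert_network(
#       Net::Works::Network->new_from_string( string => $_->[0] ),
#       $_->[1],
#   ) for @sorted;
print "$_->[0]\n" for @sorted;
```

With this ordering, the /32 is inserted last and keeps its own data, while the rest of the /24 and /16 keep theirs.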

$tree->write_tree($fh)

Given a filehandle, this method writes the contents of the tree as a MaxMind DB database to that filehandle.

$tree->iterate($object)

This method iterates over the tree by calling methods on the passed object. The object must have at least one of the following three methods: process_empty_record, process_node_record, process_data_record.

The iteration is done in depth-first order, which means that it visits each network in order.

Each method on the object is called with the following positional parameters:

  • The node number as a 64-bit number.

  • A boolean indicating whether or not this is the right or left record for the node. True for right, false for left.

  • The first IP number in the node's network as a 128-bit number.

  • The prefix length for the node's network.

  • The first IP number in the record's network as a 128-bit number.

  • The prefix length for the record's network.

If the record is a data record, the final argument will be the Perl data structure associated with the record.

The record's network is what matches with a given data structure for data records.

For node (and alias) records, the final argument will be the number of the node that this record points to.

For empty records, there are no additional arguments.
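
The calling convention described above can be sketched with a small collector object. The My::RecordCollector class and its internal fields are hypothetical; only the method names and the argument order come from the description above, and the return values are assumed to be ignored.

```perl
use strict;
use warnings;

# Hypothetical collector class; only the method names come from the text above.
package My::RecordCollector;

sub new {
    my ($class) = @_;
    return bless { data_records => [], empty_count => 0 }, $class;
}

# Called for data records: the six positional parameters plus the data.
sub process_data_record {
    my (
        $self,          $node_number, $is_right,
        $node_ip_num,   $node_prefix, $record_ip_num,
        $record_prefix, $data
    ) = @_;

    # The record's network is what matches the data structure.
    push @{ $self->{data_records} },
        {
        ip_number     => $record_ip_num,
        prefix_length => $record_prefix,
        data          => $data,
        };
    return;
}

# Called for empty records: the six positional parameters only.
sub process_empty_record {
    my ($self) = @_;
    $self->{empty_count}++;
    return;
}

package main;

my $collector = My::RecordCollector->new;

# With a populated tree, the object would be passed like this:
#   $tree->iterate($collector);
```

Because process_node_record is not defined here, this object relies on the rule that only one of the three methods is required.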

$tree->freeze_tree($filename)

Given a file name, this method freezes the tree to that file. Unlike the write_tree() method, this method does not write out a MaxMind DB file. Instead, it writes out something that can be quickly thawed via the MaxMind::DB::Writer::Tree->new_from_frozen_tree constructor. This is useful if you want to pass the in-memory representation of the tree between processes.

$tree->ip_version()

Returns the tree's IP version, as passed to the constructor.

$tree->record_size()

Returns the tree's record size, as passed to the constructor.

$tree->merge_record_collisions()

Returns a boolean indicating whether the tree will merge colliding records, as determined by the constructor parameter.

$tree->map_key_type_callback()

Returns the callback used to determine the type of a map's values, as passed to the constructor.

$tree->database_type()

Returns the tree's database type, as passed to the constructor.

$tree->languages()

Returns the tree's languages, as passed to the constructor.

$tree->description()

Returns the tree's description hashref, as passed to the constructor.

$tree->alias_ipv6_to_ipv4()

Returns a boolean indicating whether the tree will alias some IPv6 ranges to their corresponding IPv4 ranges when the tree is written to disk.

MaxMind::DB::Writer::Tree->new_from_frozen_tree()

This method constructs a tree from a file containing a frozen tree.

This method accepts the following parameters:

  • filename

    The filename containing the frozen tree.

    This parameter is required.

  • map_key_type_callback

    This is a subroutine reference that is called in order to determine how to store each value in a map (hash) data structure. See "DATA TYPES" below for more details.

    This needs to be passed because subroutine references cannot be reliably serialized and restored between processes.

    This parameter is required.

  • database_type

    Override the database_type of the frozen tree. This accepts a string of the same form as the new() constructor.

    This parameter is optional.

  • description

    Override the description of the frozen tree. This accepts a hashref of the same form as the new() constructor.

    This parameter is optional.
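
A freeze/thaw round trip might look roughly like the sketch below. The freeze_and_thaw helper is hypothetical; it round-trips within a single process purely for illustration, whereas in practice the thaw would usually happen in another process that supplies its own copy of the callback.

```perl
use strict;
use warnings;

# Hypothetical helper: freeze a tree to $file and thaw a copy of it.
# $tree is assumed to be a MaxMind::DB::Writer::Tree object.
sub freeze_and_thaw {
    my ( $tree, $file ) = @_;

    $tree->freeze_tree($file);

    # Thaw with the same class that the original tree belongs to.
    my $class = ref $tree;

    # Coderefs are not stored in the frozen file, so the callback
    # has to be passed to the constructor again.
    return $class->new_from_frozen_tree(
        filename              => $file,
        map_key_type_callback => $tree->map_key_type_callback,
    );
}

# Usage, given an existing tree:
#   my $copy = freeze_and_thaw( $tree, '/path/to/frozen-tree' );
```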

Caveat for Freeze/Thaw

The frozen tree is more or less the raw C data structures written to disk. As such, it is very much not portable, and your ability to thaw a tree on a machine not identical to the one on which it was written is not guaranteed.

In addition, there is no guarantee that the freeze/thaw format will be stable across different versions of this module.

DATA TYPES

The MaxMind DB file format is strongly typed. Because Perl is not strongly typed, you will need to explicitly specify the types for each piece of data. Currently, this class assumes that your top-level data structure for an IP address will always be a map (hash). You can then provide a map_key_type_callback subroutine that will be called as the data is serialized. This callback is given a key name and is expected to return that key's data type.

Let's use the following structure as an example:

{
    names => {
        en => 'United States',
        es => 'Estados Unidos',
    },
    population    => 319_000_000,
    fizzle_factor => 65.7294,
    states        => [ 'Alabama', 'Alaska', ... ],
}

Given this data structure, our map_key_type_callback might look something like this:

my %types = (
    names         => 'map',
    en            => 'utf8_string',
    es            => 'utf8_string',
    population    => 'uint32',
    fizzle_factor => 'double',
    states        => [ 'array', 'utf8_string' ],
);

sub {
    my $key = shift;
    return $types{$key};
}

If the callback returns undef, the serialization code will throw an error. Note that for an array we return a two-element arrayref where the first element is 'array' and the second element is the type of the array's contents.

The valid types are:

  • utf8_string

  • uint16

  • uint32

  • uint64

  • uint128

  • int32

  • double

    64 bits of precision.

  • float

    32 bits of precision.

  • boolean

  • map

  • array
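
As a rough sketch, a callback can be generated from a lookup table and checked against the list of valid types above before any data is serialized. The make_type_callback helper and its error message are hypothetical and not part of this module.

```perl
use strict;
use warnings;

# The type names listed above.
my %valid_type = map { $_ => 1 } qw(
    utf8_string uint16 uint32 uint64 uint128
    int32 double float boolean map array
);

# Hypothetical helper: build a map_key_type_callback from a lookup table,
# rejecting unknown type names up front.
sub make_type_callback {
    my (%types) = @_;

    for my $key ( sort keys %types ) {
        my $type = $types{$key};

        # Arrays are declared as [ 'array', $element_type ].
        my @names = ref($type) eq 'ARRAY' ? @{$type} : ($type);
        for my $name (@names) {
            die "Unknown type '$name' for key '$key'\n"
                unless $valid_type{$name};
        }
    }

    # Returns undef for keys that are not in the table.
    return sub {
        my ($key) = @_;
        return $types{$key};
    };
}

my $callback = make_type_callback(
    names      => 'map',
    en         => 'utf8_string',
    population => 'uint32',
    states     => [ 'array', 'utf8_string' ],
);

# The result could then be passed to the constructor as
#   map_key_type_callback => $callback
```

Checking the table when the callback is built surfaces a misspelled type name immediately, rather than partway through writing the tree.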

AUTHORS

  • Olaf Alders <oalders@maxmind.com>

  • Greg Oschwald <goschwald@maxmind.com>

  • Dave Rolsky <drolsky@maxmind.com>

COPYRIGHT AND LICENSE

This software is copyright (c) 2015 by MaxMind, Inc.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.