

Data::Unique - Module to check for duplicate items, with time expiration and disk persistence.


Version 0.02


Create a data structure that avoids duplicate entries (keys), stores arbitrary data with each key, and adds an expiration time to clean out old entries. This module uses Storable::AMF0 for persistence. After benchmarking various serialisation modules, it proved the best compromise between read and write speed for huge quantities of data.


        use strict;
        use warnings;
        use Data::Dumper;
        use feature qw( say );
        use Time::HiRes qw(gettimeofday usleep );
        use Data::Unique;
        my $filename = '/tmp/dedup.test';
        my @dup;
        my $dedup = Data::Unique->new( { expiration => 10, file => $filename, gc => 5 } );
        for my $idx ( 1 .. 6 ) {
            my ( $seconds, $microseconds ) = gettimeofday;
            my $time = ( $seconds * 1000000 ) + $microseconds;
            say "$idx -> $time";
            $dedup->item( $time, { T => $idx } ) or say "no insertion ($time already present)";
            push @dup, $time if ( ( $idx % 2 ) == 0 );
            usleep 10;
        }
        say Data::Dumper::Dumper $dedup;
        say "Number of item=".$dedup->scalar;
        say "Expiration time ".$dedup->expiration;
        say "Number of item=".$dedup->scalar;
        say $dedup->expiration(6);
        sleep 15;
        #say "deleted item number=".$dedup->gc();
        say "Number of item=".$dedup->scalar;
        foreach my $ins (@dup) {
            say $dedup->item($ins, { T => time }) ? "inserting $ins" : "no insertion ($ins already present)";
        }
        say "Expiration time ".$dedup->expiration;
        say "Number of item=".$dedup->scalar. '  =>  '.scalar( keys( %{ $dedup->{data} }));



Create a new Data::Unique object. The default values can be overridden by passing parameters:

    my $dedup = Data::Unique->new(
                                  expiration => 60,  # the retention time in seconds; an item is removed once it reaches this age
                                  file => $filename, # the file used for persistence
                                  gc => 5            # the number of operations between garbage collector runs (expiration checks)
                                 );


Add an item. Returns 1 on success, or 0 if the item is already present. The first parameter is the key tested for uniqueness; the second parameter is the data.

    $dedup->item( $time, $data );

If no data is provided, only test whether the item is present.

    $dedup->item( $time );
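The return values of the two-argument form can be illustrated with a minimal in-memory sketch. It is not the module's implementation: the `%seen` hash and the `item_sketch` helper are hypothetical names, and the sketch does not model expiration, persistence, or the one-argument form.

```perl
use strict;
use warnings;

# Hypothetical stand-in, NOT Data::Unique itself:
# each key maps to its insertion time and its payload.
my %seen;

# Assumed helper mirroring the documented return values of item():
# 1 when the key is newly inserted, 0 when it is already present.
sub item_sketch {
    my ( $key, $data ) = @_;
    return 0 if exists $seen{$key};
    $seen{$key} = { time => time, data => $data };
    return 1;
}

my $first  = item_sketch( 'k1', { T => 1 } );    # new key: inserted
my $second = item_sketch( 'k1', { T => 2 } );    # same key: rejected

print "first=$first second=$second\n";
print "stored T=$seen{k1}{data}{T}\n";           # payload of the first call is kept
```

In this sketch the payload of the rejected call is discarded, so the data stored under a key is always the data from its first insertion.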


Check the expiration time or, if a parameter is provided, modify it. If the expiration is modified, the garbage collector runs.

    $dedup->expiration(6);     # set the new expiration to 6 seconds
    $exp = $dedup->expiration; # return the current expiration time


Return the number of items.

    $nbr = $dedup->scalar;

A convenient way to do:

    scalar keys %{ $self->{data} };


Run the garbage collector to remove expired items, or modify the gc value if a parameter is provided. Whenever the garbage collector runs, a sync to disk is executed. The garbage collector runs automatically each time the number of item() calls reaches the value of the gc parameter. If the value is 0, the garbage collector never runs automatically. If the value is < 0, it is used as an expiration time when the garbage collector is run manually.

    $dedup->gc();   # force the garbage collector to run;
    $dedup->gc(10); # change the gc value;
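The counter-driven behaviour described above can be sketched as follows. This is an illustration under stated assumptions, not the module's code: `%store`, `gc_sketch` and `item_sketch` are hypothetical names, timestamps are passed in explicitly so the run is deterministic, an entry is assumed to expire once its age reaches the expiration value, and the negative gc case and the disk sync are not modelled.

```perl
use strict;
use warnings;

# Hypothetical state: data, expiration in seconds, gc interval, operation counter.
my %store = ( data => {}, expiration => 10, gc => 2, ops => 0 );

# Remove every entry whose age has reached the expiration time.
# Returns the number of entries removed.
sub gc_sketch {
    my ($now) = @_;
    my $removed = 0;
    for my $key ( keys %{ $store{data} } ) {
        next if $now - $store{data}{$key}{time} < $store{expiration};
        delete $store{data}{$key};
        $removed++;
    }
    return $removed;
}

# Insert a key, then run the collector every $store{gc} calls (0 disables it).
sub item_sketch {
    my ( $key, $data, $now ) = @_;
    my $inserted = 0;
    unless ( exists $store{data}{$key} ) {
        $store{data}{$key} = { time => $now, data => $data };
        $inserted = 1;
    }
    if ( $store{gc} > 0 && ++$store{ops} >= $store{gc} ) {
        $store{ops} = 0;
        gc_sketch($now);
    }
    return $inserted;
}

item_sketch( 'a', 1, 0 );     # 1st call: no gc yet
item_sketch( 'b', 2, 5 );     # 2nd call: gc at t=5, nothing has expired
item_sketch( 'c', 3, 12 );    # 1st call after reset: no gc
item_sketch( 'd', 4, 16 );    # 2nd call: gc at t=16 removes 'a' and 'b'

my $remaining = join ',', sort keys %{ $store{data} };
print "$remaining\n";         # prints "c,d"
```

One consequence of this design, at least in the sketch, is that an expired entry still counts as present until the next collector run, so a smaller gc value trades more frequent scans for more accurate expiry.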


Write the data to disk. A sync is always done when gc() runs. It can also be run manually (for example, if the gc value is so high that automatic syncs are rare).
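A minimal sketch of what writing the structure to disk and reading it back can look like. The core Storable module is swapped in here so the example runs without extra dependencies; the module itself is documented as using Storable::AMF0. The temporary file and the `%data` layout are assumptions made for illustration.

```perl
use strict;
use warnings;
use Storable qw( store retrieve );
use File::Temp qw( tempfile );

# Temporary file standing in for the 'file' parameter; removed at exit.
my ( $fh, $file ) = tempfile( UNLINK => 1 );
close $fh;

# Hypothetical in-memory layout: key => insertion time and payload.
my %data = ( k1 => { time => 1, data => { T => 1 } } );

store( \%data, $file );            # "sync": write the whole structure to disk
my $restored = retrieve($file);    # what a later process would load

print exists $restored->{k1} ? "k1 restored\n" : "k1 missing\n";
```

Because the whole structure is rewritten on each sync, the cost of a sync grows with the number of stored items, which is one reason to tie it to the garbage collector rather than to every insertion.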



DULAUNOY Fabrice, <fabrice at>


Please report any bugs or feature requests to bug-data-unique at, or through the web interface at I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.


Add more tests, add a delete method, maybe TIE support.


You can find documentation for this module with the perldoc command.

    perldoc Data::Unique

You can also look for information at:



This software is Copyright (c) 2019 by DULAUNOY Fabrice.

This is free software, licensed under:

  The Artistic License 2.0 (GPL Compatible)