NAME
Array::OverlapFinder - Find/remove overlapping items among ordered sequences
VERSION
This document describes version 0.005 of Array::OverlapFinder (from Perl distribution Array-OverlapFinder), released on 2020-01-02.
SYNOPSIS
use Array::OverlapFinder qw(
find_overlap
combine_overlap
);
# sequence is array of strings (compared with 'eq' operator; if you have array
# of records/structures, you can encode each record as JSON or using Data::Dmp,
# for example)
my @seq1 = qw(1 2 3 4 5 6);
my @seq2 = qw(4 5 6 7 8 9);
my @seq3 = qw(8 9 10 11);
my @overlap_items = find_overlap(\@seq1, \@seq2); # => (4,5,6)
my @all_overlap_items = find_overlap(\@seq1, \@seq2, \@seq3); # => ([4,5,6], [8,9])
my ($overlap_items_12, $index2_at_seq1, $overlap_items_13, $index3_at_seq1b) =
find_overlap({detail=>1}, \@seq1, \@seq2, \@seq3); # => ([4,5,6], 3, [8,9], 7)
my @combined_seq = combine_overlap(\@seq1, \@seq2, \@seq3); # => (1,2,3,4,5,6,7,8,9,10,11)
my ($combined_seq, $overlap_items_12, $index2_at_seq1, $overlap_items_13, $index3_at_seq1b) =
combine_overlap({detail=>1}, \@seq1, \@seq2, \@seq3);
# => ([1,2,3,4,5,6,7,8,9,10,11], [4,5,6], 3, [8,9], 7)
DESCRIPTION
Assuming you have two ordered sequences of items that might or might not overlap, where the first sequence contains "earlier" items and the second contains possibly "later" items, the functions in this module can find the overlapping items for you or remove them combining the two sequence into one:
# condition A, no overlaps
sequence1: 1 2 3 4 5 6
sequence2: 8 9 10
overlap :
combined : 1 2 3 4 5 6 8 9 10
# condition B, overlaps
sequence1: 1 2 3 4 5 6
sequence2: 4 5 6 7 8 9
overlap : 4 5 6
combined : 1 2 3 4 5 6 7 8 9
# condition C, overlaps
sequence1: 1 2 3 4 5 6
sequence2: 4 5
overlap : 4 5
combined : 1 2 3 4 5 6
# condition D, overlaps
sequence1: 1 2 3 4 5 6
sequence2: 4 5 6
overlap : 4 5 6
combined : 1 2 3 4 5 6
# condition E, overlaps (identical)
sequence1: 1 2 3 4 5 6
sequence2: 1 2 3 4 5 6
overlap : 1 2 3 4 5 6
combined : 1 2 3 4 5 6
# condition F, overlaps
sequence1: 1 2 3 4 5 6
sequence2: 1 2 3 4 5 6 7 8
overlap : 1 2 3 4 5 6
combined : 1 2 3 4 5 6 7 8
# condition G1, overlaps in the middle of second sequence will be assumed as non-overlapping
sequence1: 1 2 3 4 5 6
sequence2: 2 3 4 x x 5 6
overlap :
combined : 1 2 3 4 5 6 2 3 4 x x 5 6
# condition G2, multiple overlaps will be assumed as non-overlapping
sequence1: 1 2 3 4 5 6
sequence2: 2 3 4 x x 5 6 y y
overlap :
combined : 1 2 3 4 5 6 2 3 4 x x 5 6 y y
The functions can accept more than two sequences to find/remove overlapping items in.
Use-cases: forming a non-overlapping sequence of items from repeated downloads of RSS feed or "recent" page.
FUNCTIONS
All functions are not exported by default, but exportable.
find_overlap
Usage:
find_overlap([ \%opts , ] \@seq1, \@seq2, ...)
combine_overlap
Usage:
combine_overlap([ \%opts , ] \@seq1, \@seq2, ...)
HOMEPAGE
Please visit the project's homepage at https://metacpan.org/release/Array-OverlapFinder.
SOURCE
Source repository is at https://github.com/perlancar/perl-Array-OverlapFinder.
BUGS
Please report any bugs or feature requests on the bugtracker website https://github.com/perlancar/perl-Array-OverlapFinder/issues
When submitting a bug or request, please include a test-file or a patch to an existing test-file that illustrates the bug or desired feature.
SEE ALSO
nauniq from App::nauniq can also sometimes be used, if you know the items in the sequence are unique.
Text::OverlapFinder has a similar name, but the two modules are not that related.
AUTHOR
perlancar <perlancar@cpan.org>
COPYRIGHT AND LICENSE
This software is copyright (c) 2021, 2020 by perlancar@cpan.org.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.