NAME
Data::Cmp - Compare two data structures, return -1/0/1 like cmp
VERSION
This document describes version 0.001 of Data::Cmp (from Perl distribution Data-Cmp), released on 2018-08-10.
SYNOPSIS
use Data::Cmp qw(cmp_data);
cmp_data(["one", "two", "three"],
["one", "two", "three"]); # => 0
cmp_data(["one", "two" , "three"],
["one", "two2", "three"]); # => -1
cmp_data(["one", "two", "three"],
["one", "TWO", "three"]); # => 1
# case insensitive string comparison
cmp_data(["one", "two", "three"],
["one", "TWO", "three"], {ci=>1}); # => 0
# approximate number comparison
cmp_data([1, 1.5 , 1.6],
[1, 1.49999, 1.6], {tolerance=>1e-4}); # => 0
cmp_data(["one", "two", {}],
["one", "TWO", "three"]); # => 1
# hash/array is not "comparable" with scalar
cmp_data(["one", "two", {}],
["one", "two", "three"]); # => 2
# so is hash and array
cmp_data([],
{}); # => 2
# custom comparison function: always return the same
cmp_data(["one" , "two", "three"],
["satu", "dua", 3], {elem_cmp=>sub {0}}); # => 0
# custom comparison function: compare length ("satu" is longer than "one")
cmp_data(["one" , "two", "three"],
["satu", "dua", "tiga" ], {elem_cmp=>sub { length $_[0] <=> length $_[1] }}); # => -1
DESCRIPTION
This module offers the cmp_data function that can compare two data structures in a flexible manner. The function can return a ternary value -1/0/1 like Perl's cmp or <=> operator (or another value 2, if the two data structures differ but there is no sensible notion of which one is larger than the other).
This module can handle circular structures.
This module offers an alternative to Test::Deep (specifically, Test::Deep::NoTest's eq_deeply()). Test::Deep allows customizing comparison at specific points in a data structure, while Data::Cmp's cmp_data() is more geared towards customizing comparison behavior across all points in a data structure. Depending on your needs, one might be more convenient than the other.
For basic customization, you can turn on case-insensitive matching or numeric tolerance. For more advanced customization, you can provide coderefs to perform comparison of data items yourself.
FUNCTIONS
cmp_data
Usage:
cmp_data($d1, $d2 [ , \%opts ]) => -1|0|1|2
Compare two data structures $d1 and $d2 recursively. Like the cmp operator, it returns 0 if the two structures are equivalent, -1 if $d1 is "less than" $d2, or 1 if $d1 is "greater than" $d2. Unlike the cmp operator, it can also return 2 if $d1 and $d2 differ but there is no sensible notion of which one is "greater than" the other.
Can detect recursive references.
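A common way for a recursive walker to cope with circular references is to remember the address of every container it has already entered. The following is a minimal sketch of that general technique, not the module's actual code; the helper name count_containers is hypothetical and only core modules are used.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr reftype);

# Hypothetical helper: count the distinct array/hash containers reachable
# from $data. Each container is entered at most once, so cycles terminate.
sub count_containers {
    my ($data, $seen) = @_;
    $seen //= {};

    my $type = reftype($data) // '';
    return 0 unless $type eq 'ARRAY' || $type eq 'HASH';

    # Already visited this reference: stop instead of recursing forever.
    return 0 if $seen->{ refaddr($data) }++;

    my @children = $type eq 'ARRAY' ? @$data : values %$data;
    my $n = 1;
    $n += count_containers($_, $seen) for @children;
    return $n;
}

my $ring = [1, 2];
push @$ring, $ring;    # the array now contains itself
my $tree = { list => $ring, meta => { name => "x" } };

print count_containers($tree), "\n";    # 3: $tree, $ring and the meta hash
```

A comparator would apply the same guard to both structures before descending into them.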
Default behavior when comparing different types of data:
Two undef values are the same (0)
Defined value is greater than undefined value
cmp_data(undef, 0); # -1
Two numbers will be compared using Perl's <=> operator
Whether data is a number will be determined using Scalar::Util's looks_like_number.
cmp_data("10", 9); # 1
Strings or number vs string will be compared using Perl's cmp operator
cmp_data("a", "2b"); # 1
Two arrays will be compared element by element
If all elements are the same until the last element of the shorter array, the longer array is greater than the shorter one.
cmp_data([1,2,3], [1,3,2]); # -1
cmp_data([1,2,3], [1,2]); # 1
cmp_data([1,2,3], [1,2,3,0]); # -1
Two hashes will be compared key by key (sorted asciibetically)
If after all common keys are compared all values are the same, the hash with more extra keys is greater than the other one; if they have the same number of extra keys, they are different; if they both have no extra keys, they are the same.
cmp_data({a=>1, b=>2}, {a=>1, b=>2}); # 0
cmp_data({a=>1, b=>2}, {a=>1, b=>3}); # -1
cmp_data({a=>1, b=>2}, {a=>1}); # 1
cmp_data({a=>1, b=>2}, {a=>1, c=>1}); # 2
cmp_data({a=>1, b=>2}, {a=>1, c=>1, d=>1}); # -1
All other combinations will result in either 0 (same) or 2 (different)
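The default rules listed above can be sketched in plain Perl roughly as follows. This is an illustrative re-implementation under the rules as documented here, not the module's source; the function name sketch_cmp is hypothetical, and the sketch does not guard against circular references or honor any options.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number reftype refaddr);

# Hypothetical re-implementation of the documented default rules.
sub sketch_cmp {
    my ($d1, $d2) = @_;

    # Two undefs are the same; a defined value is greater than undef.
    return 0  if !defined $d1 && !defined $d2;
    return -1 if !defined $d1;
    return 1  if !defined $d2;

    my $t1 = reftype($d1) // '';
    my $t2 = reftype($d2) // '';

    # Plain scalars: <=> for two numbers, cmp for everything else.
    if ($t1 eq '' && $t2 eq '') {
        return $d1 <=> $d2
            if looks_like_number($d1) && looks_like_number($d2);
        return $d1 cmp $d2;
    }

    # Scalar vs container, or array vs hash: not comparable.
    return 2 if $t1 ne $t2;

    if ($t1 eq 'ARRAY') {
        my $common = @$d1 < @$d2 ? @$d1 : @$d2;
        for my $i (0 .. $common - 1) {
            my $res = sketch_cmp($d1->[$i], $d2->[$i]);
            return $res if $res;
        }
        return @$d1 <=> @$d2;    # the longer array is greater
    }

    if ($t1 eq 'HASH') {
        # Compare values of common keys in sorted key order.
        for my $key (sort keys %$d1) {
            next unless exists $d2->{$key};
            my $res = sketch_cmp($d1->{$key}, $d2->{$key});
            return $res if $res;
        }
        my $extra1 = grep { !exists $d2->{$_} } keys %$d1;
        my $extra2 = grep { !exists $d1->{$_} } keys %$d2;
        return $extra1 <=> $extra2 if $extra1 != $extra2;
        return $extra1 ? 2 : 0;  # equal count of extra keys: differ unless none
    }

    # Any other reference type: the same reference or different.
    return refaddr($d1) == refaddr($d2) ? 0 : 2;
}

print sketch_cmp([1,2,3], [1,3,2]), "\n";                          # -1
print sketch_cmp({a=>1, b=>2}, {a=>1, c=>1}), "\n";                # 2
print sketch_cmp(["one","two",{}], ["one","two","three"]), "\n";   # 2
```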
Known options:
ci
Boolean. Can be set to true to turn on case-insensitive string comparison.
tolerance
Float. Can be set to perform numeric comparison with some tolerance.
cmp
Coderef. Can be set to provide custom comparison routine.
The coderef will be called for every data item (containers included, e.g. hashes and arrays, before diving down into their items) and given these arguments:
($item1, $item2, \%context)
Context contains these keys:
depth (int, starting from 0 at the topmost level).
Must return 0, -1, 1, or 2. You can also return undef if you want to decline doing comparison. In that case, cmp_data() will use its default comparison logic.
When using this option, the ci and tolerance options do not take effect.
elem_cmp
Coderef. Just like the cmp option, except this routine will only be consulted for array elements or hash pair values.
num_cmp
Coderef. Just like the cmp option, except this routine will only be consulted to compare two defined numbers.
str_cmp
Coderef. Just like the cmp option, except this routine will only be consulted to compare two defined strings.
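As an illustration of the callback contract described for the cmp option, here is a sketch of a coderef that handles only plain numbers and declines everything else by returning undef. The variable name $approx_num and the 0.01 threshold are assumptions made for this example. The coderef is called directly with a hand-built context so the snippet runs without the module; according to the description above it would be passed as {cmp => $approx_num}.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Hypothetical callback: numbers within 0.01 of each other count as equal.
# Anything that is not a plain number is declined with undef, which per the
# documentation lets cmp_data() fall back to its default logic.
my $approx_num = sub {
    my ($item1, $item2, $context) = @_;

    for my $item ($item1, $item2) {
        return undef
            unless defined $item && !ref $item && looks_like_number($item);
    }

    return 0 if abs($item1 - $item2) <= 0.01;
    return $item1 <=> $item2;
};

print $approx_num->(1.5, 1.495, { depth => 1 }), "\n";   # 0
print $approx_num->(1, 2, { depth => 1 }), "\n";         # -1

my $res     = $approx_num->("a", "b", { depth => 1 });
my $verdict = defined($res) ? "handled" : "declined";
print "$verdict\n";                                      # declined
```

Because the callback is also invoked for containers, declining on references is what allows the default element-by-element comparison to proceed.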
HOMEPAGE
Please visit the project's homepage at https://metacpan.org/release/Data-Cmp.
SOURCE
Source repository is at https://github.com/perlancar/perl-Data-Cmp.
BUGS
Please report any bugs or feature requests on the bugtracker website https://rt.cpan.org/Public/Dist/Display.html?Name=Data-Cmp
When submitting a bug or request, please include a test-file or a patch to an existing test-file that illustrates the bug or desired feature.
SEE ALSO
Modules that just return a boolean result ("same or different"): Data::Compare, Test::Deep::NoTest (offers the flexibility of approximate or custom comparison).
Modules that return some kind of "diff" data: Data::Comparator, Data::Diff.
Of course, to check whether two structures are the same you can also serialize each one then compare serialized strings/bytes. There are many modules for serialization: JSON, YAML, Sereal, Data::Dumper, Storable, Data::Dmp, just to name a few.
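A minimal sketch of the serialize-then-compare approach, using the core Data::Dumper module with sorted hash keys so that key order does not affect the result. The helper name same_by_dump is hypothetical.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical helper: true if both structures serialize to the same string.
sub same_by_dump {
    my ($d1, $d2) = @_;
    local $Data::Dumper::Sortkeys = 1;   # deterministic hash key order
    local $Data::Dumper::Indent   = 0;   # compact output; layout is irrelevant
    return Dumper($d1) eq Dumper($d2);
}

my $first  = same_by_dump({a=>1, b=>[1,2]}, {b=>[1,2], a=>1}) ? "same" : "different";
my $second = same_by_dump([1,2,3], [1,2]) ? "same" : "different";
print "$first\n";    # same
print "$second\n";   # different
```

This answers only "same or different"; it gives no ordering between the two structures.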
AUTHOR
perlancar <perlancar@cpan.org>
COPYRIGHT AND LICENSE
This software is copyright (c) 2018 by perlancar@cpan.org.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.