NAME

Hash::Util::Join - SQL-inspired operations for hash manipulation

SYNOPSIS

   %result = hash_inner_join %x, %y;            # Keys in both hashes
   %result = hash_left_join %x, %y;             # All keys from left hash
   %result = hash_right_join %x, %y;            # All keys from right hash
   %result = hash_outer_join %x, %y;            # All keys from both hashes
   
   %result = hash_left_anti_join %x, %y;        # Keys only in left hash
   %result = hash_right_anti_join %x, %y;       # Keys only in right hash
   %result = hash_full_anti_join %x, %y;        # Keys in either but not both

   @result = hash_partition %h, sub { ... };    # Partitions a hash into two
   %result = hash_partition_by %h, sub { ... }; # Partitions hash entries into buckets

DESCRIPTION

Hash::Util::Join provides SQL-inspired operations for hash manipulation, including joins (combining two hashes) and partitioning operations (organizing a single hash).

FUNCTIONS

Join Operations

hash_inner_join

%result = hash_inner_join %x, %y;
%result = hash_inner_join %x, %y, sub { ... };
$result = hash_inner_join %x, %y, sub { ... };

Returns an entry for each key present in both hashes. The optional merge function receives ($key, $x_value, $y_value) for each common key, and its return value becomes the value stored under that key.

Default: sub { $_[2] } (right value)

Returns key-value pairs in list context, hash reference in scalar context.
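
The behaviour described above can be approximated in core Perl without the module. The following is a minimal sketch of the semantics only, not the module's implementation: the name sketch_inner_join is hypothetical, and it takes hash references instead of relying on prototypes.

```perl
use strict;
use warnings;

# Hypothetical stand-in illustrating inner-join semantics; not the module's code.
sub sketch_inner_join {
    my ($x, $y, $merge) = @_;
    $merge //= sub { $_[2] };                 # default: keep the right value
    my %out;
    for my $key (grep { exists $y->{$_} } keys %$x) {
        $out{$key} = $merge->($key, $x->{$key}, $y->{$key});
    }
    return wantarray ? %out : \%out;          # pairs in list context, hashref in scalar
}

my %x = (a => 1, b => 2);
my %y = (b => 20, c => 30);

my %default = sketch_inner_join(\%x, \%y);
my $summed  = sketch_inner_join(\%x, \%y, sub { $_[1] + $_[2] });
# %default holds (b => 20); $summed is { b => 22 }
```

Only the key b appears in both inputs, so it is the only key in either result.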

hash_left_join

%result = hash_left_join %x, %y;
%result = hash_left_join %x, %y, sub { ... };
$result = hash_left_join %x, %y, sub { ... };

Returns all keys from the left hash. The merge function receives ($key, $x_value, $y_value) where $y_value is undef for keys not in the right hash.

Default: sub { $_[2] // $_[1] } (right // left)

Returns key-value pairs in list context, hash reference in scalar context.

hash_right_join

%result = hash_right_join %x, %y;
%result = hash_right_join %x, %y, sub { ... };
$result = hash_right_join %x, %y, sub { ... };

Returns all keys from the right hash. The merge function receives ($key, $x_value, $y_value) where $x_value is undef for keys not in the left hash.

Default: sub { $_[2] // $_[1] } (right // left)

Returns key-value pairs in list context, hash reference in scalar context.

hash_outer_join

%result = hash_outer_join %x, %y;
%result = hash_outer_join %x, %y, sub { ... };
$result = hash_outer_join %x, %y, sub { ... };

Returns all keys from both hashes. The merge function receives ($key, $x_value, $y_value) where either value may be undef for keys present in only one hash.

Default: sub { $_[2] // $_[1] } (right // left)

Returns key-value pairs in list context, hash reference in scalar context.
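
One consequence of the defined-or default is that an undef value on the right falls back to the left value, exactly as a missing key would. The sketch below shows this in core Perl; sketch_outer_join is a hypothetical stand-in written for illustration and is not taken from the module.

```perl
use strict;
use warnings;

# Hypothetical stand-in illustrating outer-join semantics; not the module's code.
sub sketch_outer_join {
    my ($x, $y, $merge) = @_;
    $merge //= sub { $_[2] // $_[1] };        # default: right value, else left value
    my (%seen, %out);
    for my $key (grep { !$seen{$_}++ } keys %$x, keys %$y) {   # union of keys
        $out{$key} = $merge->($key, $x->{$key}, $y->{$key});
    }
    return wantarray ? %out : \%out;
}

my %x = (a => 1, b => 2);
my %y = (b => undef, c => 3);

my %merged = sketch_outer_join(\%x, \%y);
# %merged holds (a => 1, b => 2, c => 3): the undef in %y falls back to the left value
```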

hash_left_anti_join

%result = hash_left_anti_join %x, %y;
$result = hash_left_anti_join %x, %y;

Returns keys present in the left hash but not in the right hash. No merge function - values come directly from the left hash.

Returns key-value pairs in list context, hash reference in scalar context.

hash_right_anti_join

%result = hash_right_anti_join %x, %y;
$result = hash_right_anti_join %x, %y;

Returns keys present in the right hash but not in the left hash. No merge function - values come directly from the right hash.

Returns key-value pairs in list context, hash reference in scalar context.

hash_full_anti_join

%result = hash_full_anti_join %x, %y;
$result = hash_full_anti_join %x, %y;

Returns keys present in either hash but not in both (symmetric difference). No merge function - values come from whichever hash contains the key.

Returns key-value pairs in list context, hash reference in scalar context.
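
The three anti joins are closely related: the full anti join holds the entries of the left anti join together with those of the right anti join. A core-Perl sketch of that relationship, using only built-ins and illustrative variable names:

```perl
use strict;
use warnings;

my %x = (a => 1, b => 2);
my %y = (b => 20, c => 30);

# Entries whose key is missing from the other hash.
my %only_x = map { ($_ => $x{$_}) } grep { !exists $y{$_} } keys %x;
my %only_y = map { ($_ => $y{$_}) } grep { !exists $x{$_} } keys %y;

# Symmetric difference: the two key sets are disjoint, so nothing is overwritten.
my %either = (%only_x, %only_y);
# %either holds (a => 1, c => 30)
```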

Partition Operations

hash_partition

($true_hash, $false_hash) = hash_partition %hash, sub { ... };

Partitions a hash into two hashes based on a predicate function. The predicate receives ($key, $value) for each entry and returns a boolean.

Returns a list of two hash references: entries where the predicate returned true, and entries where it returned false.

%users = (
  1 => { name => 'Alice', active => 1 },
  2 => { name => 'Bob',   active => 0 },
  3 => { name => 'Carol', active => 1 },
);

($active, $inactive) = hash_partition %users, sub {
  my ($id, $user) = @_;
  return $user->{active};
};

# $active:   { 1 => {...}, 3 => {...} }
# $inactive: { 2 => {...} }

hash_partition_by

%result = hash_partition_by %hash, sub { ... };
$result = hash_partition_by %hash, sub { ... };

Partitions hash entries into buckets based on a classification function. The classifier receives ($key, $value) and returns a bucket name. Entries for which the classifier returns undef are skipped. Each bucket is a hash reference holding the original key-value pairs assigned to it.

Returns key-value pairs in list context, hash reference in scalar context.

%employees = (
  1 => { name => 'Alice', dept => 'Eng',   salary => 100000 },
  2 => { name => 'Bob',   dept => 'Sales', salary =>  90000 },
  3 => { name => 'Carol', dept => 'Eng',   salary => 110000 },
);

# Partition by department
%by_dept = hash_partition_by %employees, sub {
  my ($id, $emp) = @_;
  return $emp->{dept};
};

# Result:
# {
#   Eng => {
#     1 => { name => 'Alice', dept => 'Eng',   salary => 100000 },
#     3 => { name => 'Carol', dept => 'Eng',   salary => 110000 },
#   },
#   Sales => {
#     2 => { name => 'Bob',   dept => 'Sales', salary =>  90000 },
#   }
# }
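
The rule that undefined bucket names are skipped can be illustrated in core Perl. This is a sketch of the described behaviour under the assumption that it works as documented; sketch_partition_by and the sample data are hypothetical and not part of the module.

```perl
use strict;
use warnings;

# Hypothetical stand-in illustrating partition-by semantics; not the module's code.
sub sketch_partition_by {
    my ($hash, $classify) = @_;
    my %buckets;
    for my $key (keys %$hash) {
        my $bucket = $classify->($key, $hash->{$key});
        next unless defined $bucket;          # undefined bucket name: skip the entry
        $buckets{$bucket}{$key} = $hash->{$key};
    }
    return wantarray ? %buckets : \%buckets;
}

my %ages = (alice => 34, bob => undef, carol => 17);

my $by_group = sketch_partition_by(\%ages, sub {
    my ($name, $age) = @_;
    return undef unless defined $age;         # unknown age: leave out of every bucket
    return $age >= 18 ? 'adult' : 'minor';
});
# $by_group is { adult => { alice => 34 }, minor => { carol => 17 } }
```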

EXAMPLES

Basic Usage

%users  = (1 => 'Alice', 2 => 'Bob', 3 => 'Charlie'        );
%scores = (              2 => 95,    3 => 87,       4 => 92);

# Inner join - users with scores
%result = hash_inner_join %users, %scores, sub {
  my ($id, $name, $score) = @_;
  return { name => $name, score => $score };
};
# Result:  
#   (2 => {name => 'Bob',     score => 95}, 
#    3 => {name => 'Charlie', score => 87})

# Left join - all users, with scores if available
%result = hash_left_join %users, %scores, sub {
  my ($key, $name, $score) = @_;
  return "$name: $score"    if defined $score;
  return "$name: no score";
};
# Result: 
#   (1 => 'Alice: no score', 
#    2 => 'Bob: 95', 
#    3 => 'Charlie: 87')

# Left anti join - users without scores
%result = hash_left_anti_join %users, %scores;
# Result: 
#   (1 => 'Alice')

Merging Nested Structures

%users = (
  1 => { name => 'Alice', age => 30 },
  2 => { name => 'Bob',   age => 25 },
);

%scores = (
  1 => { math => 95, english => 88 },
  2 => { math => 87, english => 92 },
);

%result = hash_inner_join %users, %scores, sub {
  my ($id, $user, $score) = @_;
  return { %$user, %$score };
};

# Result:
#   1 => { name => 'Alice', age => 30, math => 95, english => 88 }
#   2 => { name => 'Bob',   age => 25, math => 87, english => 92 }

Deep Merging

sub deep_merge {
  my ($key, $left, $right) = @_;
  
  if (ref $left eq 'HASH' && ref $right eq 'HASH') {
      return { hash_outer_join %$left, %$right, \&deep_merge };
  }
  return $right // $left;
}

%system_defaults = (
  database => {
    host => 'localhost',
    port => 5432,
    pool => { min => 5, max => 20 },
  },
  logging => {
    level  => 'info',
    output => { file => '/var/log/app.log', console => 1 },
  },
);

%user_config = (
  database => {
    host => 'prod.example.com',
    pool => { min => 10 },
  },
  logging => {
    output => { console => 0 },
  },
);

%config = hash_outer_join %system_defaults, %user_config, \&deep_merge;

# Result:
#   database => {
#     host => 'prod.example.com',      # overridden
#     port => 5432,                    # from defaults
#     pool => {
#       min => 10,                     # deeply merged and overridden
#       max => 20,                     # deeply merged from defaults
#     }
#   }
#   logging => {
#     level => 'info',                  # from defaults
#     output => { 
#       file    => '/var/log/app.log',  # deeply merged from defaults
#       console => 0,                   # deeply merged and overridden
#     }
#   }

Chaining Joins

%products = (1 => 'Widget', 2 => 'Gadget', 3 => 'Gizmo'             );
%prices   = (               2 => 19.99,    3 => 29.99,    4 => 39.99);
%stock    = (1 => 10,       2 => 50                                 );

# Find products with price
%catalog = hash_inner_join %products, %prices, sub {
  return { name => $_[1], price => $_[2] }
};
# Result: 
#   (2 => { name => 'Gadget', price => 19.99 },
#    3 => { name => 'Gizmo',  price => 29.99 })

# Find priced products that are also in stock
%available = hash_inner_join %catalog, %stock, sub {
  my ($id, $info, $qty) = @_;
  return { %$info, stock => $qty };
};
# Result: 
#   (2 => { name => 'Gadget', price => 19.99, stock => 50 })

# Or find what's missing at each step
%no_price = hash_left_anti_join %products, %prices;  # (1 => 'Widget')
%no_stock = hash_left_anti_join %catalog, %stock;    # (3 => {...})

Numeric Operations

%q1_sales = (alice => 15000, bob => 22000, carol => 18000                );
%q2_sales = (                bob => 25000, carol => 19000, dave  => 21000);

# Total sales for employees in both quarters
%both_quarters = hash_inner_join %q1_sales, %q2_sales, sub {
  return $_[1] + $_[2]
};
# Result: (bob => 47000, carol => 37000)

# All employee totals
%all_totals = hash_outer_join %q1_sales, %q2_sales, sub {
  return ($_[1] // 0) + ($_[2] // 0)
};
# Result: (alice => 15000, bob => 47000, carol => 37000, dave => 21000)

# Best quarter for each employee present in both quarters
%best_quarter = hash_inner_join %q1_sales, %q2_sales, sub {
  my ($name, $q1, $q2) = @_;
  return $q1 > $q2 ? { quarter => 'Q1', sales => $q1 }
                   : { quarter => 'Q2', sales => $q2 };
};
# Result: (bob => {quarter => 'Q2', sales => 25000}, ...)

TIPS

Always Use return or + in Merge Functions

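When an opening brace starts a statement, Perl has to guess whether it begins a block or an anonymous hash constructor. As the last statement of a subroutine, { %$left, %$right } is parsed as a block, so the subroutine does not return a hash reference. Always use an explicit return or prefix the braces with + to remove the ambiguity.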

Incorrect:

%result = hash_inner_join %x, %y, sub {
  my ($k, $left, $right) = @_;
  { %$left, %$right }          # This is a BLOCK, not a hashref!
};

Correct - using return:

%result = hash_inner_join %x, %y, sub {
  my ($k, $left, $right) = @_;
  return { %$left, %$right };   # Explicit return
};

Correct - using unary plus:

%result = hash_inner_join %x, %y, sub {
  my ($k, $left, $right) = @_;
  +{ %$left, %$right }          # Unary + makes Perl parse { } as a hashref
};
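
The difference can be observed without the module by calling the three forms directly. This is a small core-Perl sketch; the variable names are illustrative only.

```perl
use strict;
use warnings;

my %left  = (name => 'Alice');
my %right = (math => 95);

# Bare braces as the last statement: parsed as a block, so in list
# context the caller gets a flat key/value list, not one hash reference.
my $with_block = sub { my ($l, $r) = @_; { %$l, %$r } };

# An explicit return or a leading + makes the braces a hashref constructor.
my $with_return = sub { my ($l, $r) = @_; return { %$l, %$r } };
my $with_plus   = sub { my ($l, $r) = @_; +{ %$l, %$r } };

my @flat       = $with_block->(\%left, \%right);
my $via_return = $with_return->(\%left, \%right);
my $via_plus   = $with_plus->(\%left, \%right);
# ref($via_return) and ref($via_plus) are both 'HASH'
```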

Distinguishing undef Values from Missing Keys

A merge function receives undef both when a key is missing from a hash and when the key is present with an undef value. To tell the two cases apart, the merge function can close over the original hashes and check key existence with exists.

  %x = (a => 1, b => undef);
  %y = (a => 2, c => 3);

  %result = hash_left_join %x, %y, sub {
    my ($k, $left, $right) = @_;

    if (exists $y{$k}) {
      return "from y: " . ($right // 'undef');
    } else {
      return "from x: " . ($left // 'undef');
    }
  };
  # Result: (a => 'from y: 2', b => 'from x: undef')
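
The underlying distinction is the one between Perl's exists and defined built-ins. A short core-Perl sketch, independent of the module, with illustrative names:

```perl
use strict;
use warnings;

my %x = (a => 1, b => undef);

# exists: is the key present at all?  defined: does it hold a non-undef value?
my %status;
for my $key (qw(a b c)) {
    $status{$key} = !exists $x{$key}  ? 'missing'
                  : !defined $x{$key} ? 'present but undef'
                  :                     'present';
}
# %status holds (a => 'present', b => 'present but undef', c => 'missing')
```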

Modifying Input Hashes During Merge

All join functions collect the keys they operate on before invoking the merge function, so it is safe to modify either input hash from within the merge function without affecting the join itself.

%x = (a => 1,  b => 2, c => 3);
%y = (a => 10, b => 20);

%result = hash_inner_join %x, %y, sub {
  my ($k, $left, $right) = @_;
  
  # Safe: add or delete keys in the input hashes
  $x{new_key} = 99;
  delete $y{b};
  
  # Safe: Modify existing values
  $x{c} = 999;
  
  return $left + $right;
};

# Result: (a => 11, b => 22)

The key list is determined before iteration begins, so modifications to %x or %y during the merge function don't affect which keys are processed.

This guarantee holds for both the Pure Perl (PP) and XS implementations.
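
One common way to obtain this kind of guarantee is to snapshot the key list before looping. The sketch below shows the idea in core Perl; it is an assumption about the general technique, not a description of how the PP or XS code is written.

```perl
use strict;
use warnings;

my %x = (a => 1,  b => 2);
my %y = (a => 10, b => 20);

# Snapshot the common keys first, then iterate over the snapshot.
my @keys = sort grep { exists $y{$_} } keys %x;

my @visited;
for my $key (@keys) {
    $x{"extra_$key"} = 0;      # adding keys mid-loop does not change @keys
    $y{"extra_$key"} = 0;
    push @visited, $key;
}
# @visited is ('a', 'b'); the extra_* keys were never visited
```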

EXPORTS

Nothing is exported by default. Functions can be imported individually or by category:

use Hash::Util::Join qw(hash_inner_join hash_left_join); # individual functions
use Hash::Util::Join qw(:all);                           # all functions
use Hash::Util::Join qw(:joins);                         # join operations
use Hash::Util::Join qw(:partition);                     # partitioning operations

Export Tags

:all

All functions including join and partition operations.

:joins

Join operations: hash_inner_join, hash_left_join, hash_right_join, hash_outer_join, hash_left_anti_join, hash_right_anti_join, hash_full_anti_join.

:partition

Partition operations: hash_partition, hash_partition_by.

SEE ALSO

Hash::Util::Set - Set operations on hash keys

AUTHOR

Christian Hansen <chansen@cpan.org>

COPYRIGHT AND LICENSE

Copyright (C) 2026 Christian Hansen

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.