NAME

HTTP::State - Fast RFC6265bis-draft + CHIPS Cookie Jar

SYNOPSIS

use HTTP::State ":flags";
 
# Create a jar. Default flags for compatibility API.
#
my $jar=HTTP::State->new(default_flags=>FLAG_TYPE_HTTP|FLAG_TOP_LEVEL|...);

# Requested URL
#
my $request_url="http://test.example.com";

# First party site of the request ( for CHIPS )
#
my $partition_key= "https://first.party.com";

# User agent indicates the context of the request using flags
#
my $flags=FLAGS_TYPE_HTTP|FLAGS_TOP_LEVEL|...;



# Retrieve encoded cookies name/values applicable for the request.
#
my $cookie_header = $jar->retrieve_cookies($request_url, $partition_key, $flags);

#       ++ OR ++

# use the  HTTP::CookieJar compatible API, jar's default flags
#
my $cookie_header = $jar->cookie_header($request_url);




# Do a request...
#
my $response=user_agent->get(cookie_header=>$cookie_header);




# Store the  Set-Cookies in the jar for the request url
#
$jar->store_cookies($request_url, $partition_key, $flags, $response->header->{Set_Cookie});

#     ++ OR ++

# use the HTTP::CookieJar compatible API, using the default flags
$jar->add($request_url, $response->header->{Set_Cookie});

DESCRIPTION

Web browsers are constantly improving privacy and limiting the tracking ability of cookies. This module implements a cookie jar providing the same up to date thinking to Perl user-agents.

The algorithm implemented is derived from RFC6265bis-draft-12 and Cookies Having Independent Partitioned State (CHIPS). These are not finalised standards, so it is to be expected this module may change behaviour to keep up to date.

To aid adoption, this module will work as a drop in alternative for HTTP::CookieJar. A compatible API has been implemented to allow existing user agents to benefit from the performance gains in particular.

However long term, user agents would need to be modified, or at least wrapped in order to take full advantage of this module. Not only are there differences in the general API compared to popular cookie jars, but conceptually the notion of 'browsing context', and 'first-party/third-party' need to be implemented.

Finally, for explicit encoding and decoding of cookie strings (not via a cookie jar), please see the companion HTTP::State::Cookie module. This is used internally for encoding and decoding cookies.

Summary of RFC6265bis and CHIPS

At the time of writing RFC6265bis-draft is a version '12'. As the RFC progresses this module will be updated accordingly. The highlight take away feature implemented/utilized in this module are:

Public suffix checking

Cookies are tested for public suffixes (ie .com, .org, .com.au) and rejected if the cookie domain is not below such a level. This prevents a trackers and information leakage at a basic level.

Cookies names prefixed with __Host- and __Secure- are subject to addition requirements before being stored.

Reduced pper limit of expiry dates

Cookies expire 400 days (or user defined) in the future, instead of potentially years.

Browsing Context

User agents should be performing additional book keeping and testing of what is a 'same site' request, top level navigation, reloading, document etc to fully utilise the additional functionality available in this module.

While these tests/conditions are specified in RFC6265bis-draft, it is not in the scope of this module and needs to be implemented in a user-agent. The results of such tests are represented as bit fields, which are used to guide the store/retrieve algorithm accordingly within this module:

samesite/crosssite
API/HTTP
safe/unsafe method
top/nontop level navigation
Partitioning (CHIPS)

Cookies Having Independent Partitioned State (CHIPS), effectively gives each first party site it's own cookie jar. Third party cookies set from resources requested by the first party site are stored in this 'partitioned cookie jar'.

These cannot be used by another first party site requesting resources from the same third party site. Hence limiting tracking.

COMPATIBILITY

A compatibility interface matching that of HTTP::CookieJar is available to aid in adoption.

In theory, this should allow user-agents like HTTP::Tiny and Furl for example to benefit from performance and security improvements with limited changes to existing code.

To work around the lack of same site / browsing context / partitioning support in the HTTP::CookieJar API, the compatibility API utilises the current values of the 'default flags' for the cookie jar and disables partitioning by default

API

User Agent Context Flags

Flags are implemented as constants and can be exported using the ":flags" tag at import:

use HTTP::State ":flags";

They are a space efficient method of describing the intent and context of a request initiated by a user-agent. Multiple flags are ORed together into a bit field. This bit field is required for storing and retrieval of cookies throughout this API.

As mentioned previously, the user-agent or other external code is responsible for performing tests on the context. The flags are for conveying the results to the cookie jar.

FLAG_SAME_SITE

When this flag is set, a request is considered "same-site". When unset, request is considered "cross-site".

A same site request is one which loads resources for a document from the same host and scheme as the current 'document' for example.

FLAG_TYPE_HTTP

When this flag is set, request is considered "HTTP". When unset, request is considered "non-HTTP".

In a browser this represents if a request for a resource was initiated by from a html document (HTTP) or from javascript (non-HTTP).

FLAG_SAFE_METH

When this flag is set, request is considered "safe". When unset, request is considered "unsafe".

POST and PUT methods in HTTP are considered unsafe.

FLAG_TOP_LEVEL

When this flag is set, request is considered "top level". When unset, request is considered "not-top-level".

Top level requests include manually typing in a URL and navigating to it and

new

my $jar=HTTP::State->new(...);

Creates a new cookie jar object. Optional named arguments can be provided:

default_flags
my $jar=HTTP::State->new(default_flags=>flags);

Sets the default flags used for storing and retrieving cookies, when no defined value is provided via retrieve_cookies and store_cookies

It is also the value used in the compatibility API.

Default is all flags on

(FLAG_TYPE_HTTP|FLAG_TOP_LEVEL|FLAG_SAME_SITE|FLAG_SAFE_METH).

enable_partition
my $jar=HTTP::State->new(enable_partition=>...);

Enables partition support for partition cookies. When enabled, cookies are stored in relevant partitions.

The default value is false (off)

retrieve_sort
my $jar=HTTP::State->new(retrieve_sort=>1||0);

A boolean enabling sorting of retrieved cookies by path length and creation time.

Default is false (no sorting).

max_expiry
my $jar=HTTP::State->new(max_expiry=>seconds);

The upper limit in duration a cookie can be valid for. Value is in seconds.

Default is 400 days (400 * 24 * 3600)

lax_allowing_unsafe
my $jar=HTTP::State->new(lax_allowing_unsafe=>1||0)

A boolean enabling retrieval of cookies for unsafe methods and default same site, as long as cookie is no older than lax_allow_unsafe_timeout

The default value is false.

lax_allowing_unsafe_timeout
my $jar=HTTP::State->new(lax_allowing_unsafe_timeout=>seconds)

The timeout value (age) to use when testing the lax_allow_unsafe_timeout.

The default value is 120 seconds

public_suffix_sub
my $jar=HTTP::State->new(public_suffix_sub=>sub {...});

A code reference which performs public suffix lookup on a URI. The code ref must take a domain name as an argument and return a suffix or empty string string if no suffix exists.

If this option is not provided, a reference to the public_suffix routine from Mozilla::PublicSuffix is used.

Storing and Retrieving

For new code, these methods are preferred over the compatibility API, to make better use of 'browsing context'.

store_cookies

$jar->store_cookies($request_uri, $partition, $flags, $string_or_struct, ...);

Takes a $request_url , browsing context $flags and one or more Set-Cookie header string values or cookie structures. Stores them in the cookie jar as per the 'storage model' of RFC6265bis-draft.

The exact processing of the cookies is subject to the $flags bit field, which is a combination of the 'context flags'. If set to undef the current default bit flags for the cookie jar will be used.

The $partition argument, is the partition hey scheme://host) of the first party site. Cookies are stored to this partition if the jar was instantiated with the enable_partition option.

This method in intended to be called from a user-agent on receipt of a HTTP response.

retrieve_cookies

$jar->retrieve_cookies($request_url, $partition, $flags); 

Retrieves cookies from a jar, for the specified $request_url according to RFC6265bis-draft 'cookie retrieval'. The cookies are encodes them into a string suitable for use in a Cookie header in a HTTP request.

The exact processing of the cookies is subject to the $flags bit field, which is a combination of the 'context flags'. If set to undef the current default bit flags for the cookie jar will be used.

The $partition argument, is the name (scheme://host) of the first party site, which is used as a partition key. Cookies are only retrieved from this partition if the jar was instantiated with the enable_partition option.

This method in intended to be called from a user-agent in generation of a HTTP request.

Auxillary

get_cookies

$jar->get_cookies($request_url);
$jar->get_cookies($request_url, $partition, $flags); 

Takes the same arguments as retrieve_cookies and matches the same cookies. Returns a copied list of the matched cookie structures instead of a encoded string.

get_kv_cookies

$jar->get_kv_cookies($request_url);
$jar->get_kv_cookies($request_url, $same_site_status, $type, $safe);

Takes the same arguments as retrieve_cookies and matches the same cookies. Returns a list of key value pairs, of names and values.

HTTP::CookieJar Compatibility Interface

These methods of the same name an intent as those found in HTTP::CookieJar. There are most certainly minor differences, but should allow a HTTP::State cookie jar to be a drop in replacement in most circumstances a HTTP::CookieJar is used.

add

$jar->add($url, $set_cookie_string);

Adds a cookie (as a Set_Cookie value string $set_cookie_string) to the cookie jar for the request URL $url.

Note on SAME SITE: It is a compatibility wrapper around store_cookies utilising the default flags for the jar as no flags for same site support can be supplied directly.

Please refer to the HTTP::CookieJar for further information.

clear

$jar->clear;

Removes all cookies from the jar

cookies_for

$jar->cookies_for($url);
$jar->cookies_for($url);

Returns a list of hash refs representing a set cookie for a target $url. The elements of each hash are named as per HTTP::CookieJar. Additional elements could also exist (ie samesite)

Please refer to the HTTP::CookieJar for further information.

$jar->cookie_header($url)

Retrieves any applicable cookies for the target $url, and encodes into a Cookie header string value.

Note on SAME SITE: It is a compatibility wrapper around retrieve_cookies utilising the default flags for the jar, as no flags for same site support can be supplied directly.

Please refer to the HTTP::CookieJar for further information.

dump_cookies

$jar->dump_cookies;
$jar->dump_cookies({persistent => 1});

Returns a list of strings encoded as Set Cookie values, but with additional internal information. The {persistent => 1} parameter forces only persistent cookies to be processed, ignoring session cookies.

Adjusts the creation and last access times to be relative to epoch in the local time, instead of GMT for interoperability with HTTP::CookieJar.

Partitioned cookies are also included in the dump, with the partition key stored, for later loading.

load_cookies

$jar->load_cookies(@cookies)

Takes a list of Set-Cookie type strings written out previous with dump_cookies and injects them into the cookie jar.

Decodes the creation time and last access time expected in local timezone seconds.

Please refer to the HTTP::CookieJar for further information.

Partitioned cookies store the partition key in the Partitioned attribute. If this is present cookies are loaded into the specified partition.

PERFORMANCE

Relative performance of retrieving cookies for a request to a single host:

1 cookie:

                   Rate http_cookiejar  protocol_http     http_state
http_cookiejar 165217/s             --           -31%           -41%
protocol_http  238602/s            44%             --           -15%
http_state     279272/s            69%            17%             --

5 cookies:

                  Rate http_cookiejar     http_state  protocol_http
http_cookiejar  5582/s             --           -80%           -81%
http_state     27568/s           394%             --            -8%
protocol_http  29824/s           434%             8%             --

50 cookies:

                  Rate http_cookiejar     http_state  protocol_http
http_cookiejar 100.0/s             --           -97%           -97%
http_state      2973/s          2873%             --            -8%
protocol_http   3228/s          3128%             9%             --

100 cookies:

                 Rate http_cookiejar     http_state  protocol_http
http_cookiejar 23.1/s             --           -98%           -98%
http_state     1370/s          5835%             --            -5%
protocol_http  1436/s          6121%             5%             --

1000 cookies:

                  Rate http_cookiejar  protocol_http     http_state
http_cookiejar 0.233/s             --           -99%          -100%
protocol_http   43.8/s         18738%             --           -76%
http_state       179/s         76975%           309%             --

COMPARISON TO OTHER MODULES

Protocol::HTTP::CookieJar is a very fast cookie jar module, also implementing RFC6265bis-draft.

However, it's API is targeted towards its related HTTP client, doesn't support CHIPS/partitioning and requires a large number of XS modules

HTTP::CookieJar is the cookie jar expected by many user agents. While it has public suffix support, it doesn't provide samesite processing or the additional conditions of RFC6265bis-draft or CHIPS/paritioning. It is also quite slow in comparison to this module.

SEE ALSO

The main resources used in implementing this module:

https://github.com/privacycg/CHIPS#opt-in-partitioned-cookies
https://datatracker.ietf.org/doc/html/draft-ietf-httpbis-rfc6265bis
https://developer.mozilla.org/en-US/docs/Web/HTTP/Cookies

TODO

Create LWP adaptor
Mojo User Agnet adaptor
More tests

REPOSITORTY and BUGS

Please report any bugs via git hub: http://github.com/drclaw1394/perl-http-state

AUTHOR

Ruben Westerberg, <drclaw@mac.com>

COPYRIGHT AND LICENSE

Copyright (C) 2023 by Ruben Westerberg

Licensed under MIT

DISCLAIMER OF WARRANTIES

THIS PACKAGE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.