NAME
Crypt::ASN1 - DER ASN.1 parser and encoder based on libtomcrypt
SYNOPSIS
use Crypt::ASN1 qw(asn1_decode_der asn1_encode_der asn1_to_string);
# --- decode ---
my $tree = asn1_decode_der($der_bytes);
my $tree = asn1_decode_der($der_bytes, { int => 'hex', bin => 'hex' });
# --- inspect ---
print asn1_to_string($tree);
# --- encode a decoded tree ---
my $der2 = asn1_encode_der($tree);
# --- build from scratch ---
my $der = asn1_encode_der([{
    type  => 'SEQUENCE',
    value => [
        { type => 'INTEGER',      value => '42' },
        { type => 'BOOLEAN',      value => 1 },
        { type => 'OID',          value => '1.2.840.113549.1.1.11' },
        { type => 'UTF8_STRING',  value => 'hello' },
        { type => 'OCTET_STRING', value => "\x00\x01\x02" },
        { type => 'BIT_STRING',   value => "\x03\x02\x01", bits => 20 },
        { type => 'NULL' },
        { type => 'UTCTIME',      value => '2025-06-15T12:00:00Z' },
        { type => 'CUSTOM', class => 'CONTEXT_SPECIFIC',
          constructed => 1, tag => 0,
          value => [{ type => 'INTEGER', value => '2' }] },
    ],
}]);
DESCRIPTION
Since: CryptX-0.089
Parses DER-encoded ASN.1 data into a Perl data structure without requiring any schema, and encodes Perl data structures back to DER. Uses libtomcrypt's der_decode_sequence_flexi for decoding.
Both the decoder output and the encoder input use the same node hash structure described below. When given a tree produced by the decoder, the encoder does its best to produce the same ASN.1 that was originally parsed, regardless of what decode options were used.
EXPORT
Nothing is exported by default.
You can export selected functions:
use Crypt::ASN1 qw(asn1_decode_der asn1_encode_der);
Or all of them at once:
use Crypt::ASN1 ':all';
NODE HASH STRUCTURE
Both the decoder and encoder operate on the same data structure: an arrayref of node hashrefs. Each hashref represents one ASN.1 TLV (Tag-Length-Value) element.
Common keys
Every node has three keys:
type (string, required)
The ASN.1 type name. Built-in values include:
BOOLEAN INTEGER NULL OID OCTET_STRING BIT_STRING UTF8_STRING PRINTABLE_STRING IA5_STRING TELETEX_STRING UTCTIME GENERALIZEDTIME SEQUENCE SET CUSTOM
The list above is not exhaustive for decoded input. If the decoder encounters an ASN.1 tag that does not map to one of the built-in type names above, it is returned as CUSTOM with the appropriate class, constructed, and tag fields. This includes unsupported universal tags such as ENUMERATED, which decode as CUSTOM with class => "UNIVERSAL".
value (varies, required for most types)
The decoded value. Its Perl type depends on type and sometimes on the format key -- see "Per-type details" below.
format (string, decoder sets it, encoder reads it)
Tells the encoder how the value is represented so it can convert it back to DER. Set automatically by the decoder; when building nodes from scratch you may omit it -- the encoder then assumes the default representation for each type.
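Because every node is a plain hashref and container nodes keep their children under value, a tree can be traversed with a short recursive function. The following is a minimal sketch only: walk_nodes is a hypothetical helper, not part of Crypt::ASN1, and the tree is built by hand instead of being decoded.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Crypt::ASN1): calls $cb->($node, $depth)
# for every node in the tree, parents before children.
sub walk_nodes {
    my ($nodes, $cb, $depth) = @_;
    $depth //= 0;
    for my $node (@$nodes) {
        $cb->($node, $depth);
        # Container nodes (SEQUENCE, SET, constructed CUSTOM) hold an
        # arrayref of child nodes in "value".
        walk_nodes($node->{value}, $cb, $depth + 1)
            if ref $node->{value} eq 'ARRAY';
    }
    return;
}

# Hand-built tree in the same shape the decoder returns.
my $tree = [{
    type  => 'SEQUENCE',
    value => [
        { type => 'INTEGER',  value => '42' },
        { type => 'SEQUENCE', value => [ { type => 'NULL' } ] },
    ],
}];

my @seen;
walk_nodes($tree, sub {
    my ($node, $depth) = @_;
    push @seen, ('  ' x $depth) . $node->{type};
});
print "$_\n" for @seen;
```

The same walker should work on the result of any asn1_decode_* call, since decoded and hand-built trees share one structure.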
Per-type details
Each subsection below documents one type. For types where the value representation depends on the decode option used, a format table lists every format/value combination. The encoder accepts every combination shown -- it reads format and converts value back to DER automatically.
BOOLEAN
Keys: type, format, value.
value is 1 (true) or 0 (false). format is always "bool".
{ type => "BOOLEAN", format => "bool", value => 1 }
INTEGER
Keys: type, format, value.
value is an arbitrary-precision signed integer. format describes the representation:
format    value                       decode option    example
--------  --------------------------  ---------------  -------
decimal   decimal string              (default)        "255"
hex       lowercase hex string        int => 'hex'     "ff"
bytes     big-endian binary string    int => 'bytes'   "\xff"
All three forms are accepted by the encoder. When format is absent the encoder treats value as a decimal string (a Perl integer is fine too).
Negative integers: decimal and hex carry a leading - (e.g. "-5"). bytes stores the unsigned magnitude only and is intended for naturally unsigned values such as RSA moduli. When decoding with int => 'bytes', negative ASN.1 INTEGER values are rejected.
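To show how the three representations relate for a non-negative value, here is a small sketch using the core Math::BigInt module. The helper names dec_to_hex, hex_to_bytes, and bytes_to_dec are hypothetical; the encoder performs the equivalent conversion internally, and this code is not how the module itself is implemented.

```perl
use strict;
use warnings;
use Math::BigInt;

# Hypothetical helpers (not part of Crypt::ASN1) that relate the three
# INTEGER representations for a non-negative value.

sub dec_to_hex {
    my ($dec) = @_;
    my $hex = Math::BigInt->new($dec)->as_hex;   # as_hex adds a "0x" prefix
    $hex =~ s/\A0x//;
    return $hex;
}

sub hex_to_bytes {
    my ($hex) = @_;
    $hex = "0$hex" if length($hex) % 2;          # pack 'H*' works on byte pairs
    return pack 'H*', $hex;
}

sub bytes_to_dec {
    my ($bytes) = @_;
    return Math::BigInt->new('0x' . unpack('H*', $bytes))->bstr;
}

for my $dec ('255', '65537') {
    my $hex   = dec_to_hex($dec);
    my $bytes = hex_to_bytes($hex);
    printf "%s -> %s -> %d byte(s) -> %s\n",
        $dec, $hex, length($bytes), bytes_to_dec($bytes);
}
```

Whether the decoder pads hex output to an even number of digits is not stated here, so the sketch makes no assumption about it and only pads when packing to bytes.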
NULL
Keys: type, format, value.
value is always undef. format is always "null". The encoder ignores value.
{ type => "NULL", format => "null", value => undef }
OID
Keys: type, format, value, and optionally name.
value is a dotted-decimal OID string (at least two arcs). format is always "oid".
{ type => "OID", format => "oid", value => "1.2.840.113549.1.1.11" }
As a convenience, the encoder accepts textual arcs with leading zeros and lets DER encoding canonicalize them. For example, "2.000.1" encodes and decodes back as "2.0.1".
Optional key: name -- present only when the oidmap decode option is supplied and the OID is found in the map. Ignored by the encoder.
{ ..., name => "sha256WithRSAEncryption" } # when oidmap matches
OCTET_STRING
Keys: type, format, value.
value is binary data. format describes the representation:
format    value                    decode option      example
--------  -----------------------  -----------------  ----------
bytes     raw binary string        (default)          "\x04\x01"
hex       lowercase hex string     bin => 'hex'       "0401"
base64    Base64-encoded string    bin => 'base64'    "BAE="
All three forms are accepted by the encoder. When format is absent the encoder treats value as raw bytes.
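The three representations in the table describe the same bytes. The sketch below derives the hex and Base64 forms from the raw form with core Perl only (unpack and MIME::Base64) and converts each back; it illustrates the relationship and does not call the module.

```perl
use strict;
use warnings;
use MIME::Base64 qw(encode_base64 decode_base64);

my $raw    = "\x04\x01";
my $hex    = unpack 'H*', $raw;         # lowercase hex
my $base64 = encode_base64($raw, '');   # '' suppresses the trailing newline

# Three node hashes describing the same two bytes.
my @nodes = (
    { type => 'OCTET_STRING', format => 'bytes',  value => $raw },
    { type => 'OCTET_STRING', format => 'hex',    value => $hex },
    { type => 'OCTET_STRING', format => 'base64', value => $base64 },
);

# Converting each representation back yields the original bytes.
my %to_raw = (
    bytes  => sub { $_[0] },
    hex    => sub { pack 'H*', $_[0] },
    base64 => sub { decode_base64($_[0]) },
);
for my $node (@nodes) {
    my $bytes = $to_raw{ $node->{format} }->($node->{value});
    printf "%-6s => %s\n", $node->{format}, unpack('H*', $bytes);
}
```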
BIT_STRING
Keys: type, format, value, bits.
value is the packed bit data (MSB-first). format follows the same rules as OCTET_STRING ("bytes", "hex", or "base64"). All three forms are accepted by the encoder.
bits is the exact number of significant bits. The quantity 8 * byte_length(value) - bits gives the number of unused trailing bits in the last byte.
When format is absent the encoder treats value as raw bytes. When bits is absent it defaults to 8 * length(value) (no unused bits).
# default format (raw bytes, 25 significant bits)
{ type => "BIT_STRING", format => "bytes",
  value => "\x03\x02\x01\x00", bits => 25 }
# hex format
{ type => "BIT_STRING", format => "hex",
  value => "03020100", bits => 25 }
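The unused-bit arithmetic above can be checked with a few lines of Perl. unused_bits is a hypothetical helper written for this illustration; it assumes value holds raw bytes (format "bytes" or absent) and applies the documented default for a missing bits key.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Crypt::ASN1): number of unused trailing
# bits for a BIT_STRING node whose value is raw bytes.
sub unused_bits {
    my ($node) = @_;
    my $total = 8 * length $node->{value};
    # When "bits" is absent, every bit counts as significant.
    my $bits  = defined $node->{bits} ? $node->{bits} : $total;
    die "bits ($bits) exceeds the $total bits available\n" if $bits > $total;
    return $total - $bits;
}

my $node = { type => 'BIT_STRING', value => "\x03\x02\x01\x00", bits => 25 };
print unused_bits($node), "\n";   # 8 * 4 - 25 = 7
```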
UTF8_STRING
Keys: type, format, value.
value is a Perl Unicode string (utf8 flag on). format is always "utf8".
{ type => "UTF8_STRING", format => "utf8", value => "caf\x{e9}" }
PRINTABLE_STRING, IA5_STRING, TELETEX_STRING
Keys: type, format, value.
value is a byte string. format is always "string" for all three.
{ type => "PRINTABLE_STRING", format => "string", value => "abc" }
{ type => "IA5_STRING", format => "string", value => "ia5" }
{ type => "TELETEX_STRING", format => "string", value => "tele" }
UTCTIME
Keys: type, format, value.
value is a timestamp. format describes the representation:
format    value                       decode option    example
--------  --------------------------  ---------------  ----------------------
rfc3339   RFC 3339 string             (default)        "2024-01-15T10:30:00Z"
epoch     Unix timestamp (integer)    dt => 'epoch'    1705314600
Both forms are accepted by the encoder. When format is absent, the encoder auto-detects: an all-digit value is treated as epoch, a value matching YYYY- is treated as RFC 3339.
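The detection rule can be approximated as follows. This is a sketch of the rule as stated above, not the encoder's actual code: guess_time_format is a hypothetical name, and the real encoder may apply further checks.

```perl
use strict;
use warnings;

# Hypothetical sketch of the documented rule: all digits means epoch,
# a leading four-digit year followed by "-" means RFC 3339.
sub guess_time_format {
    my ($value) = @_;
    return 'epoch'   if $value =~ /\A[0-9]+\z/;
    return 'rfc3339' if $value =~ /\A[0-9]{4}-/;
    return;   # neither shape recognized
}

for my $value ('1705314600', '2024-01-15T10:30:00Z', '2024') {
    my $format = guess_time_format($value);
    printf "%-22s %s\n", $value, defined $format ? $format : '(unrecognized)';
}
```

Under this rule a bare year such as "2024" is all digits and would be read as an epoch value, so setting format explicitly is the safer choice when the input could be ambiguous.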
For UTCTIME, encoder input must fall within the UTCTime year window 1950..2049; values outside that range are rejected. Fractional seconds are also rejected for UTCTIME.
Time validation in the encoder is currently syntactic, not full calendar validation. The encoder checks the accepted input shape and ASN.1-specific constraints above, but it does not verify that every RFC 3339-looking date and time is semantically valid.
The decoder expands the 2-digit UTCTime year using the RFC 5280 window (YY >= 50 → 19YY, else 20YY). Timezone offsets are preserved (e.g. "2024-01-15T10:30:00+05:30").
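The year window is simple enough to express directly. expand_utctime_year below is a hypothetical helper that mirrors the stated rule; it is not the decoder's own code.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the RFC 5280 window described above:
# YY >= 50 maps to 19YY, anything lower maps to 20YY.
sub expand_utctime_year {
    my ($yy) = @_;
    die "expected a two-digit year\n" unless $yy =~ /\A[0-9]{2}\z/;
    return $yy >= 50 ? 1900 + $yy : 2000 + $yy;
}

printf "%s -> %d\n", $_, expand_utctime_year($_) for qw(00 49 50 99);
```

The two boundary values, 49 and 50, map to 2049 and 1950, which matches the 1950..2049 range the encoder accepts for UTCTIME.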
GENERALIZEDTIME
Keys: type, format, value.
Same format rules as UTCTIME; both forms are accepted by the encoder. Fractional seconds are preserved (e.g. "2024-01-15T10:30:00.125Z"). Validation is likewise syntactic only; semantically invalid calendar values that match the accepted timestamp syntax are not currently rejected.
SEQUENCE
Keys: type, format, value.
value is an arrayref of child node hashrefs (in order). format is always "array".
{ type => "SEQUENCE", format => "array", value => [ ...children... ] }
SET
Keys: type, format, value.
Same structure as SEQUENCE. format is always "array". Both ASN.1 SET and SET OF are represented as type => "SET" (they share the same DER tag 0x31).
CUSTOM
Represents any tag that does not map to one of the built-in type names above. This is commonly used for context-specific implicit/explicit tags ([0], [1], ...) found in X.509 certificates and other ASN.1 schemas, but it can also be emitted by the decoder for unsupported universal tags.
Keys: type, format, value, class, constructed, tag.
class (string) -- "CONTEXT_SPECIFIC", "APPLICATION", "UNIVERSAL", or "PRIVATE"
constructed (integer) -- 1 if constructed, 0 if primitive
tag (integer) -- the tag number (e.g. 0 for [0]). Must be a non-negative integer within the range supported by the current encoder build.
Constructed (constructed => 1): value is an arrayref of child nodes. format is "array".
{ type => "CUSTOM", format => "array",
  class => "CONTEXT_SPECIFIC", constructed => 1, tag => 0,
  value => [ { type => "INTEGER", ... } ] }
Primitive (constructed => 0): value is raw data. format follows the same rules as OCTET_STRING ("bytes", "hex", or "base64" depending on the bin decode option). All three forms are accepted by the encoder. Primitive CUSTOM values must not be references.
# default format
{ type => "CUSTOM", format => "bytes",
  class => "CONTEXT_SPECIFIC", constructed => 0, tag => 1,
  value => "\xAA\xBB" }
# hex format (bin => 'hex')
{ type => "CUSTOM", format => "hex",
  class => "CONTEXT_SPECIFIC", constructed => 0, tag => 1,
  value => "aabb" }
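When working with schemas that use context-specific tags, it is often useful to pick out CUSTOM nodes by class and tag number. The sketch below does this on a hand-built tree; find_tagged is a hypothetical helper and not part of Crypt::ASN1.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Crypt::ASN1): returns every CUSTOM node
# with the given class and tag number, searching nested containers too.
sub find_tagged {
    my ($nodes, $class, $tag) = @_;
    my @found;
    for my $node (@$nodes) {
        push @found, $node
            if $node->{type} eq 'CUSTOM'
            && $node->{class} eq $class
            && $node->{tag} == $tag;
        push @found, find_tagged($node->{value}, $class, $tag)
            if ref $node->{value} eq 'ARRAY';
    }
    return @found;
}

my $tree = [{
    type  => 'SEQUENCE',
    value => [
        { type => 'CUSTOM', class => 'CONTEXT_SPECIFIC',
          constructed => 1, tag => 0,
          value => [ { type => 'INTEGER', value => '2' } ] },
        { type => 'CUSTOM', class => 'CONTEXT_SPECIFIC',
          constructed => 0, tag => 1,
          value => "\xAA\xBB" },
    ],
}];

my @explicit0 = find_tagged($tree, 'CONTEXT_SPECIFIC', 0);
print "found ", scalar(@explicit0), " node(s) tagged [0]\n";
print "inner INTEGER: $explicit0[0]{value}[0]{value}\n";
```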
Re-encoding Decoded Trees
The encoder reads format and converts value back to DER before encoding. When given a tree returned by asn1_decode_der, it does its best to produce the same ASN.1 that was originally parsed, regardless of the decode options used:
my $tree = asn1_decode_der($der, { int=>'hex', bin=>'base64', dt=>'epoch' });
my $der2 = asn1_encode_der($tree);
Building nodes from scratch
When constructing nodes by hand you need type and value (plus the extra keys noted above for CUSTOM and BIT_STRING). You may omit format; the encoder assumes:
Type              default value interpretation
----------------  ------------------------------------------
INTEGER           decimal string or Perl integer
OCTET_STRING      raw bytes
BIT_STRING        raw packed bytes, bits = length(value) * 8
UTCTIME           RFC 3339 string (or all-digit epoch)
GENERALIZEDTIME   RFC 3339 string (or all-digit epoch)
CUSTOM primitive  raw bytes
You may also supply format explicitly if you prefer to work with hex or base64 representations:
# these two produce identical DER
{ type => "OCTET_STRING", value => "\x04\x01" }
{ type => "OCTET_STRING", format => "hex", value => "0401" }
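As a worked illustration, the following sketch builds a small tree by hand, mixing a node without format and one with an explicit hex format, then encodes, decodes, and re-encodes it. It uses only the functions documented here; the tree contents are made up for the example.

```perl
use strict;
use warnings;
use Crypt::ASN1 qw(asn1_encode_der asn1_decode_der);

# INTEGER relies on the default (decimal) interpretation;
# OCTET_STRING states its representation explicitly.
my $tree = [{
    type  => 'SEQUENCE',
    value => [
        { type => 'INTEGER',      value => '42' },
        { type => 'OCTET_STRING', format => 'hex', value => '0401' },
    ],
}];

my $der     = asn1_encode_der($tree);
my $decoded = asn1_decode_der($der);
my $der2    = asn1_encode_der($decoded);

print unpack('H*', $der), "\n";
print $der eq $der2 ? "round trip matches\n" : "round trip differs\n";
```

Printing the DER as hex makes it easy to compare against the output of other tools.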
FUNCTIONS
asn1_decode_der
my $tree = asn1_decode_der($der_bytes);
my $tree = asn1_decode_der($der_bytes, \%opts);
Parses $der_bytes and returns an arrayref of top-level node hashrefs. Croaks on parse error.
The optional %opts hashref controls value formatting:
int => 'hex' | 'bytes'
How to represent INTEGER values. Default is a decimal string (format => "decimal"). 'hex' gives a lowercase hex string (format => "hex"). 'bytes' gives a raw big-endian binary string (format => "bytes") for non-negative INTEGER values only; decoding croaks if the DER INTEGER is negative.
bin => 'hex' | 'base64'
How to represent OCTET_STRING, BIT_STRING, and primitive CUSTOM values. Default is raw binary bytes (format => "bytes"). 'hex' gives a lowercase hex string (format => "hex"). 'base64' gives a Base64-encoded string (format => "base64").
dt => 'epoch'
How to represent UTCTIME and GENERALIZEDTIME values. Default is an RFC 3339 string (format => "rfc3339"). 'epoch' gives a Unix timestamp integer (format => "epoch"). This works reliably only on Perls with 64-bit integers; on 32-bit integer Perls, large timestamps may overflow or lose precision.
oidmap => \%map
A hashref mapping dotted OID strings to friendly names. When a decoded OID node's value exists as a key in %map, the node gets an additional name key with the mapped value. Does not affect encoding.
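The options can be combined in one call. The sketch below encodes a small hand-built tree so the example is self-contained, then decodes it with int => 'hex' and an oidmap; the map contents and tree are illustrative only.

```perl
use strict;
use warnings;
use Crypt::ASN1 qw(asn1_encode_der asn1_decode_der);

# Build some DER to decode, so the example needs no external input.
my $der = asn1_encode_der([{
    type  => 'SEQUENCE',
    value => [
        { type => 'INTEGER', value => '255' },
        { type => 'OID',     value => '1.2.840.113549.1.1.11' },
    ],
}]);

my %oidmap = ('1.2.840.113549.1.1.11' => 'sha256WithRSAEncryption');
my $tree   = asn1_decode_der($der, { int => 'hex', oidmap => \%oidmap });

my ($int, $oid) = @{ $tree->[0]{value} };
print "INTEGER as $int->{format}: $int->{value}\n";
print "OID $oid->{value}";
print " ($oid->{name})" if defined $oid->{name};   # set only on a map hit
print "\n";
```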
asn1_decode_pem
my $tree = asn1_decode_pem($pem_string);
my $tree = asn1_decode_pem($pem_string, \%opts);
Decodes the PEM envelope first (via "pem_to_der" in Crypt::Misc), then parses the resulting DER bytes. Accepts the same %opts as asn1_decode_der.
asn1_decode_der_file
my $tree = asn1_decode_der_file($filename);
my $tree = asn1_decode_der_file($filename, \%opts);
Reads $filename as raw binary and parses it as DER.
asn1_decode_pem_file
my $tree = asn1_decode_pem_file($filename);
my $tree = asn1_decode_pem_file($filename, \%opts);
Reads $filename, decodes the PEM envelope, then parses the DER bytes.
asn1_encode_der
my $der_bytes = asn1_encode_der($tree);
Encodes $tree (an arrayref of node hashrefs) to DER bytes. The input may be a tree previously returned by asn1_decode_der or one built from scratch. Croaks on invalid input.
The encoder normalizes every node before encoding: it reads format (if present) to determine how to interpret value, converts it to the canonical DER form, and encodes it.
The current low-level encoder supports element content lengths up to 0xffffffff bytes; larger values are rejected.
asn1_encode_pem
my $pem_string = asn1_encode_pem($tree, $header);
Encodes $tree to DER, then wraps in a PEM envelope with the given $header (e.g. "CERTIFICATE", "RSA PRIVATE KEY"). Defaults to "DATA" if $header is omitted.
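A minimal in-memory PEM round trip might look like the sketch below. It assumes asn1_decode_pem accepts the envelope that asn1_encode_pem produces for the "DATA" header, which the descriptions above suggest but do not state outright; the tree is made up for the example.

```perl
use strict;
use warnings;
use Crypt::ASN1 qw(asn1_encode_pem asn1_decode_pem);

my $tree = [{
    type  => 'SEQUENCE',
    value => [ { type => 'INTEGER', value => '1' } ],
}];

# Wrap the DER encoding of $tree in a PEM envelope.
my $pem = asn1_encode_pem($tree, 'DATA');
print $pem;

# Decode it again and read the INTEGER back (decimal string by default).
my $back = asn1_decode_pem($pem);
print "INTEGER: $back->[0]{value}[0]{value}\n";
```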
asn1_encode_der_file
asn1_encode_der_file($tree, $filename);
Encodes $tree to DER and writes it to $filename.
asn1_encode_pem_file
asn1_encode_pem_file($tree, $header, $filename);
Encodes $tree to PEM and writes it to $filename.
asn1_to_string
my $text = asn1_to_string($tree);
Returns a human-readable text representation of $tree (an arrayref of node hashrefs as returned by any asn1_decode_* function). Useful for debugging and inspection, similar to openssl asn1parse output.
print asn1_to_string(asn1_decode_pem_file("cert.pem"));
produces output like:
SEQUENCE (3 elem)
SEQUENCE (8 elem)
context_specific [0] cons (1 elem)
INTEGER:2
INTEGER:17923815188543234454
SEQUENCE (2 elem)
OBJECT:1.2.840.113549.1.1.11
NULL:
...
BIT STRING:3082010a0282010100c242299a49420c21dcf9b957afcdc49... (2160 bit)
Binary values (OCTET_STRING, BIT_STRING, primitive CUSTOM) are shown as lowercase hex, truncated to 64 characters with ... for longer values. BIT_STRING additionally shows the bit count in parentheses. OID nodes that have a name key (via oidmap) show the name in parentheses after the dotted value.
The function handles trees decoded with any combination of decode options (int, bin, dt).