Security Advisories (24)
CVE-2011-2728 (2012-12-21)

The bsd_glob function in the File::Glob module for Perl before 5.14.2 allows context-dependent attackers to cause a denial of service (crash) via a glob expression with the GLOB_ALTDIRFUNC flag, which triggers an uninitialized pointer dereference.

CVE-2020-12723 (2020-06-05)

regcomp.c in Perl before 5.30.3 allows a buffer overflow via a crafted regular expression because of recursive S_study_chunk calls.

CVE-2020-10878 (2020-06-05)

Perl before 5.30.3 has an integer overflow related to mishandling of a "PL_regkind[OP(n)] == NOTHING" situation. A crafted regular expression could lead to malformed bytecode with a possibility of instruction injection.

CVE-2020-10543 (2020-06-05)

Perl before 5.30.3 on 32-bit platforms allows a heap-based buffer overflow because nested regular expression quantifiers have an integer overflow.

CVE-2018-6913 (2018-04-17)

Heap-based buffer overflow in the pack function in Perl before 5.26.2 allows context-dependent attackers to execute arbitrary code via a large item count.

CVE-2018-18314 (2018-12-07)

Perl before 5.26.3 has a buffer overflow via a crafted regular expression that triggers invalid write operations.

CVE-2018-18313 (2018-12-07)

Perl before 5.26.3 has a buffer over-read via a crafted regular expression that triggers disclosure of sensitive information from process memory.

CVE-2018-18312 (2018-12-05)

Perl before 5.26.3 and 5.28.0 before 5.28.1 has a buffer overflow via a crafted regular expression that triggers invalid write operations.

CVE-2018-18311 (2018-12-07)

Perl before 5.26.3 and 5.28.x before 5.28.1 has a buffer overflow via a crafted regular expression that triggers invalid write operations.

CVE-2015-8853 (2016-05-25)

The (1) S_reghop3, (2) S_reghop4, and (3) S_reghopmaybe3 functions in regexec.c in Perl before 5.24.0 allow context-dependent attackers to cause a denial of service (infinite loop) via crafted utf-8 data, as demonstrated by "a\x80."

CVE-2013-1667 (2013-03-14)

The rehash mechanism in Perl 5.8.2 through 5.16.x allows context-dependent attackers to cause a denial of service (memory consumption and crash) via a crafted hash key.

CVE-2010-4777 (2014-02-10)

The Perl_reg_numbered_buff_fetch function in Perl 5.10.0, 5.12.0, 5.14.0, and other versions, when running with debugging enabled, allows context-dependent attackers to cause a denial of service (assertion failure and application exit) via crafted input that is not properly handled when using certain regular expressions, as demonstrated by causing SpamAssassin and OCSInventory to crash.

CVE-2010-1158 (2010-04-20)

Integer overflow in the regular expression engine in Perl 5.8.x allows context-dependent attackers to cause a denial of service (stack consumption and application crash) by matching a crafted regular expression against a long string.

CVE-2009-3626 (2009-10-29)

Perl 5.10.1 allows context-dependent attackers to cause a denial of service (application crash) via a UTF-8 character with a large, invalid codepoint, which is not properly handled during a regular-expression match.

CVE-2005-3962 (2005-12-01)

Integer overflow in the format string functionality (Perl_sv_vcatpvfn) in Perl 5.9.2 and 5.8.6 Perl allows attackers to overwrite arbitrary memory and possibly execute arbitrary code via format string specifiers with large values, which causes an integer wrap and leads to a buffer overflow, as demonstrated using format string vulnerabilities in Perl applications.

CVE-2012-5195 (2012-12-18)

Heap-based buffer overflow in the Perl_repeatcpy function in util.c in Perl 5.12.x before 5.12.5, 5.14.x before 5.14.3, and 5.15.x before 15.15.5 allows context-dependent attackers to cause a denial of service (memory consumption and crash) or possibly execute arbitrary code via the 'x' string repeat operator.

CVE-2016-2381 (2016-04-08)

Perl might allow context-dependent attackers to bypass the taint protection mechanism in a child process via duplicate environment variables in envp.

CVE-2013-7422 (2015-08-16)

Integer underflow in regcomp.c in Perl before 5.20, as used in Apple OS X before 10.10.5 and other products, allows context-dependent attackers to execute arbitrary code or cause a denial of service (application crash) via a long digit string associated with an invalid backreference within a regular expression.

CVE-2011-1487 (2011-04-11)

The (1) lc, (2) lcfirst, (3) uc, and (4) ucfirst functions in Perl 5.10.x, 5.11.x, and 5.12.x through 5.12.3, and 5.13.x through 5.13.11, do not apply the taint attribute to the return value upon processing tainted input, which might allow context-dependent attackers to bypass the taint protection mechanism via a crafted string.

CVE-2023-47100

In Perl before 5.38.2, S_parse_uniprop_string in regcomp.c can write to unallocated space because a property name associated with a \p{...} regular expression construct is mishandled. The earliest affected version is 5.30.0.

CVE-2024-56406 (2025-04-13)

A heap buffer overflow vulnerability was discovered in Perl. When there are non-ASCII bytes in the left-hand-side of the `tr` operator, `S_do_trans_invmap` can overflow the destination pointer `d`.    $ perl -e '$_ = "\x{FF}" x 1000000; tr/\xFF/\x{100}/;'    Segmentation fault (core dumped) It is believed that this vulnerability can enable Denial of Service and possibly Code Execution attacks on platforms that lack sufficient defenses.

CVE-2023-47039 (2023-10-30)

Perl for Windows relies on the system path environment variable to find the shell (cmd.exe). When running an executable which uses Windows Perl interpreter, Perl attempts to find and execute cmd.exe within the operating system. However, due to path search order issues, Perl initially looks for cmd.exe in the current working directory. An attacker with limited privileges can exploit this behavior by placing cmd.exe in locations with weak permissions, such as C:\ProgramData. By doing so, when an administrator attempts to use this executable from these compromised locations, arbitrary code can be executed.

CVE-2016-1238 (2016-08-02)

(1) cpan/Archive-Tar/bin/ptar, (2) cpan/Archive-Tar/bin/ptardiff, (3) cpan/Archive-Tar/bin/ptargrep, (4) cpan/CPAN/scripts/cpan, (5) cpan/Digest-SHA/shasum, (6) cpan/Encode/bin/enc2xs, (7) cpan/Encode/bin/encguess, (8) cpan/Encode/bin/piconv, (9) cpan/Encode/bin/ucmlint, (10) cpan/Encode/bin/unidump, (11) cpan/ExtUtils-MakeMaker/bin/instmodsh, (12) cpan/IO-Compress/bin/zipdetails, (13) cpan/JSON-PP/bin/json_pp, (14) cpan/Test-Harness/bin/prove, (15) dist/ExtUtils-ParseXS/lib/ExtUtils/xsubpp, (16) dist/Module-CoreList/corelist, (17) ext/Pod-Html/bin/pod2html, (18) utils/c2ph.PL, (19) utils/h2ph.PL, (20) utils/h2xs.PL, (21) utils/libnetcfg.PL, (22) utils/perlbug.PL, (23) utils/perldoc.PL, (24) utils/perlivp.PL, and (25) utils/splain.PL in Perl 5.x before 5.22.3-RC2 and 5.24 before 5.24.1-RC2 do not properly remove . (period) characters from the end of the includes directory array, which might allow local users to gain privileges via a Trojan horse module under the current working directory.

CVE-2015-8608 (2017-02-07)

The VDir::MapPathA and VDir::MapPathW functions in Perl 5.22 allow remote attackers to cause a denial of service (out-of-bounds read) and possibly execute arbitrary code via a crafted (1) drive letter or (2) pInName argument.

NAME

enc2xs -- Perl Encode Module Generator

SYNOPSIS

enc2xs -[options]
enc2xs -M ModName mapfiles...
enc2xs -C

DESCRIPTION

enc2xs builds a Perl extension for use by Encode from either Unicode Character Mapping files (.ucm) or Tcl Encoding Files (.enc). Besides being used internally during the build process of the Encode module, you can use enc2xs to add your own encoding to perl. No knowledge of XS is necessary.

Quick Guide

If you want to know as little about Perl as possible but need to add a new encoding, just read this chapter and forget the rest.

0.

Have a .ucm file ready. You can get it from somewhere or you can write your own from scratch or you can grab one from the Encode distribution and customize it. For the UCM format, see the next Chapter. In the example below, I'll call my theoretical encoding myascii, defined in my.ucm. $ is a shell prompt.

$ ls -F
my.ucm
1.

Issue a command as follows;

$ enc2xs -M My my.ucm
generating Makefile.PL
generating My.pm
generating README
generating Changes

Now take a look at your current directory. It should look like this.

$ ls -F
Makefile.PL   My.pm         my.ucm        t/

The following files were created.

Makefile.PL - MakeMaker script
My.pm       - Encode submodule
t/My.t      - test file
1.1.

If you want *.ucm installed together with the modules, do as follows;

$ mkdir Encode
$ mv *.ucm Encode
$ enc2xs -M My Encode/*ucm
2.

Edit the files generated. You don't have to if you have no time AND no intention to give it to someone else. But it is a good idea to edit the pod and to add more tests.

3.

Now issue a command all Perl Mongers love:

$ perl Makefile.PL
Writing Makefile for Encode::My
4.

Now all you have to do is make.

$ make
cp My.pm blib/lib/Encode/My.pm
/usr/local/bin/perl /usr/local/bin/enc2xs -Q -O \
  -o encode_t.c -f encode_t.fnm
Reading myascii (myascii)
Writing compiled form
128 bytes in string tables
384 bytes (75%) saved spotting duplicates
1 bytes (0.775%) saved using substrings
....
chmod 644 blib/arch/auto/Encode/My/My.bs
$

The time it takes varies depending on how fast your machine is and how large your encoding is. Unless you are working on something big like euc-tw, it won't take too long.

5.

You can "make install" already but you should test first.

$ make test
PERL_DL_NONLAZY=1 /usr/local/bin/perl -Iblib/arch -Iblib/lib \
  -e 'use Test::Harness  qw(&runtests $verbose); \
  $verbose=0; runtests @ARGV;' t/*.t
t/My....ok
All tests successful.
Files=1, Tests=2,  0 wallclock secs
 ( 0.09 cusr + 0.01 csys = 0.09 CPU)
6.

If you are content with the test result, just "make install"

7.

If you want to add your encoding to Encode's demand-loading list (so you don't have to "use Encode::YourEncoding"), run

enc2xs -C

to update Encode::ConfigLocal, a module that controls local settings. After that, "use Encode;" is enough to load your encodings on demand.

The Unicode Character Map

Encode uses the Unicode Character Map (UCM) format for source character mappings. This format is used by IBM's ICU package and was adopted by Nick Ing-Simmons for use with the Encode module. Since UCM is more flexible than Tcl's Encoding Map and far more user-friendly, this is the recommended formet for Encode now.

A UCM file looks like this.

#
# Comments
#
<code_set_name> "US-ascii" # Required
<code_set_alias> "ascii"   # Optional
<mb_cur_min> 1             # Required; usually 1
<mb_cur_max> 1             # Max. # of bytes/char
<subchar> \x3F             # Substitution char
#
CHARMAP
<U0000> \x00 |0 # <control>
<U0001> \x01 |0 # <control>
<U0002> \x02 |0 # <control>
....
<U007C> \x7C |0 # VERTICAL LINE
<U007D> \x7D |0 # RIGHT CURLY BRACKET
<U007E> \x7E |0 # TILDE
<U007F> \x7F |0 # <control>
END CHARMAP
  • Anything that follows # is treated as a comment.

  • The header section continues until a line containing the word CHARMAP. This section has a form of <keyword> value, one pair per line. Strings used as values must be quoted. Barewords are treated as numbers. \xXX represents a byte.

    Most of the keywords are self-explanatory. subchar means substitution character, not subcharacter. When you decode a Unicode sequence to this encoding but no matching character is found, the byte sequence defined here will be used. For most cases, the value here is \x3F; in ASCII, this is a question mark.

  • CHARMAP starts the character map section. Each line has a form as follows:

    <UXXXX> \xXX.. |0 # comment
      ^     ^      ^
      |     |      +- Fallback flag
      |     +-------- Encoded byte sequence
      +-------------- Unicode Character ID in hex

    The format is roughly the same as a header section except for the fallback flag: | followed by 0..3. The meaning of the possible values is as follows:

    |0

    Round trip safe. A character decoded to Unicode encodes back to the same byte sequence. Most characters have this flag.

    |1

    Fallback for unicode -> encoding. When seen, enc2xs adds this character for the encode map only.

    |2

    Skip sub-char mapping should there be no code point.

    |3

    Fallback for encoding -> unicode. When seen, enc2xs adds this character for the decode map only.

  • And finally, END OF CHARMAP ends the section.

When you are manually creating a UCM file, you should copy ascii.ucm or an existing encoding which is close to yours, rather than write your own from scratch.

When you do so, make sure you leave at least U0000 to U0020 as is, unless your environment is EBCDIC.

CAVEAT: not all features in UCM are implemented. For example, icu:state is not used. Because of that, you need to write a perl module if you want to support algorithmical encodings, notably the ISO-2022 series. Such modules include Encode::JP::2022_JP, Encode::KR::2022_KR, and Encode::TW::HZ.

Coping with duplicate mappings

When you create a map, you SHOULD make your mappings round-trip safe. That is, encode('your-encoding', decode('your-encoding', $data)) eq $data stands for all characters that are marked as |0. Here is how to make sure:

  • Sort your map in Unicode order.

  • When you have a duplicate entry, mark either one with '|1' or '|3'.

  • And make sure the '|1' or '|3' entry FOLLOWS the '|0' entry.

Here is an example from big5-eten.

<U2550> \xF9\xF9 |0
<U2550> \xA2\xA4 |3

Internally Encoding -> Unicode and Unicode -> Encoding Map looks like this;

 E to U               U to E
 --------------------------------------
 \xF9\xF9 => U2550    U2550 => \xF9\xF9
 \xA2\xA4 => U2550

So it is round-trip safe for \xF9\xF9. But if the line above is upside down, here is what happens.

E to U               U to E
--------------------------------------
\xA2\xA4 => U2550    U2550 => \xF9\xF9
(\xF9\xF9 => U2550 is now overwritten!)

The Encode package comes with ucmlint, a crude but sufficient utility to check the integrity of a UCM file. Check under the Encode/bin directory for this.

When in doubt, you can use ucmsort, yet another utility under Encode/bin directory.

Bookmarks

SEE ALSO

Encode, perlmod, perlpod

7 POD Errors

The following errors were encountered while parsing the POD:

Around line 1076:

Expected text after =item, not a number

Around line 1109:

Expected text after =item, not a number

Around line 1115:

Expected text after =item, not a number

Around line 1122:

Expected text after =item, not a number

Around line 1143:

Expected text after =item, not a number

Around line 1156:

Expected text after =item, not a number

Around line 1160:

Expected text after =item, not a number