NAME
Perl::Stripper - Yet another PPI-based Perl source code stripper
VERSION
version 0.02
SYNOPSIS
use Perl::Stripper;
my $stripper = Perl::Stripper->new(
#maintain_linum => 1, # the default, keep line numbers unchanged
#strip_ws => 1, # the default, strip extra whitespace
#strip_comment => 1, # the default
#strip_pod => 1, # the default
strip_log => 1, # default is 0, strip Log::Any log statements
);
$stripped = $stripper->strip($perl);
DESCRIPTION
This module is yet another PPI-based Perl source code stripper. Its focus is on costumization and stripping significant information from source code.
This module uses Moo object system.
This module uses Log::Any logging framework.
ATTRIBUTES
maintain_linum => BOOL (default: 1)
If set to true, stripper will try to maintain line numbers so it does not change between the unstripped and the stripped version. This is useful for debugging.
Respected by other settings.
strip_ws => BOOL (default: 1)
Strip extra whitespace. Under maintain_linum
, will not strip newlines.
Not yet implemented.
strip_comment => BOOL (default: 1) | CODE
If set to true, will strip comments. Under maintain_linum
will replace comment lines with blank lines.
Can also be set to a coderef. Code will be given the PPI comment token object and expected to modify the object (e.g. using set_content()
method). See PPI::Token::Comment for more details. Some usage ideas: translate comment, replace comment with gibberish, etc.
strip_log => BOOL (default: 1)
If set to true, will strip log statements. Useful for removing debugging information. Currently Log::Any-specific and only looks for the default logger $log
. These will be stripped:
$log->METHOD(...);
$log->METHODf(...);
if ($log->is_METHOD) { ... }
Not all methods are stripped. See stripped_log_levels
.
Can also be set to a coderef. Code will be given the PPI::Statement object and expected to modify it.
stripped_log_levels => ARRAY_OF_STR (default: ['debug', 'trace'])
Log levels to strip. By default, only debug
and trace
are stripped. Levels info
and up are considered important for users (instead of for developers only).
strip_pod => BOOL (default: 1)
If set to true, will strip POD. Under maintain_linum
will replace POD with blank lines.
Can also be set to a coderef. Code will be given the PPI POD token object and expected to modify the object (e.g. using set_content()
method). See PPI::Token::Pod for more details.Some usage ideas: translate POD, convert POD to Markdown, replace POD with gibberish, etc.
METHODS
new(%attrs) => OBJ
Constructor.
$stripper->strip($perl) => STR
Strip Perl source code. Return the stripped source code.
TODO/IDEAS
Don't strip shebang line
Option to mangle subroutine names
With exclude and mangling options (dictionary, name mangler sub).
Option to mangle name of lexical variables
With exclude and mangling options (dictionary, name mangler sub).
Option to mangle name of global variables
With exclude and mangling options (dictionary, name mangler sub). And exclude Perl's predefined variables like
@ARGV
,%ENV
, and so on.Option to mangle labels
Option to remove comments and whitespace in /x regexes
Option to remove certain code blocks
Like:
if ($DEBUG) { ... } if ($PRODUCTION) { ... } assert(...)
FAQ
What is the use of this module?
This module can be used to remove debugging information (logging statements, conditional code) from source code.
This module can also be employed as part of source code protection strategy. In theory you cannot hide source code you deploy to users/clients, but you can reduce the usefulness of the deployed source code by removing information such as comments and POD (documentation), or by mangling subroutine/variable names (removing meaningful original subroutine/variable names).
For compressing source code (reducing source code size), you can try Perl::Squish or Perl::Strip.
But isn't hiding/protecting source code immoral/unethical/ungrateful?
Discussing hiding/protecting source code in general is really beyond the scope of this module's documentation. Please consult elsewhere.
How about obfuscating by encoding Perl code?
For example, changing:
foo();
bar();
into:
$src = base64_decode(...); # optionally multiple rounds
eval $src;
This does not really remove significant (meaningful) parts of a source code, so I'm not very interested in this approach. You can send a patch if you want.
How about changing string into hexadecimal characters? How about ...?
Other examples similar in spirit would be adding extra parentheses to expressions, changing constant numbers into mathematical expressions.
Again, this does not remove significant (meaningful) parts of a source code (instead, they just transform stuffs). The effect can be reversed trivially using Perl::Tidy or B::Deparse. So I'm not very interested in doing this, but you can send a patch if you want.
SEE ALSO
There are at least two approaches when analyzing/modifying/producing Perl code: B-based and PPI-based. In general, B-based modules are orders of magnitude faster than PPI-based ones, but each approach has its strengths and weaknesses.
B::Deparse - strips comments and extra newlines
B::Deobfuscate - like B::Deparse, but can also rename variables. Despite its name, if applied to a "normal" Perl code, the effect is obfuscation because it removes the original names (and meaning) of variables.
Perl::Strip - PPI-based, focus on compression.
Perl::Squish - PPI-based, focus on compression.
AUTHOR
Steven Haryanto <stevenharyanto@gmail.com>
COPYRIGHT AND LICENSE
This software is copyright (c) 2012 by Steven Haryanto.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.