NAME
find-secret-leakage-in-git-diff.pl - find secrets leaking in a Git repository
VERSION
version 4.0.0
SYNOPSIS
find-secret-leakage-in-git-diff.pl [FILE]
DESCRIPTION
This script reads the output of a git-diff command containing a patch, either from FILE or from STDIN, and tries to detect secrets in the lines being added. It's intended to be invoked by the Git::Hooks::CheckDiff plugin, which feeds it the output of either git-diff-index or git-diff-tree with the following options:
git diff* -p -U0 --no-color --diff-filter=AM --no-prefix
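To illustrate why these options matter, here is a minimal sketch of how added lines and their line numbers could be recovered from such a patch. It is not the script's actual code: the helper name added_lines is hypothetical, and the sketch ignores the corner case of an added line whose own text begins with "++ ".

```perl
use strict;
use warnings;

# Hypothetical helper: return [path, lineno, text] for every added line
# of a patch produced with -p -U0 --no-color --no-prefix.
sub added_lines {
    my @diff = @_;
    my ($path, $lineno, @added);
    for my $line (@diff) {
        if ($line =~ /^\+\+\+ (.+)$/) {
            $path   = $1;       # --no-prefix: the path has no leading "b/"
            $lineno = undef;    # wait for the next hunk header
        } elsif ($line =~ /^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@/) {
            $lineno = $1;       # first new-file line number of this hunk
        } elsif (defined $lineno && $line =~ /^\+(.*)$/) {
            # With -U0 there are no context lines, so only added
            # lines advance the new-file line counter.
            push @added, [$path, $lineno++, $1];
        }
    }
    return @added;
}
```

Because --diff-filter=AM restricts the patch to added and modified files, every "+++" header in this sketch is assumed to name a real path.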
A "secret" is an API key, an authorization token, or a private key, which shouldn't be leaked by being saved in a versioned file. So, this script should be used in a pre-commit hook to alert the programmer when she is about to commit one.
When it finds a secret in the git-diff output, it prints a line like this:
<path>:<lineno>: Secret Leakage: <secret type> '<secret>'
Meaning:
<path>
The path of the file in which the secret is being added.
<lineno>
The line number in the file where the secret is being added.
<secret type>
The type of the secret found.
<secret>
The specific secret found.
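A tool that consumes these reports could split each line back into its fields. The following is a minimal sketch under that assumption: the helper name parse_leak_line, the example path, and the secret type "AWS Access Key ID" are illustrative and not taken from the script.

```perl
use strict;
use warnings;

# Hypothetical helper: split one report line into its four fields.
# Expected shape: <path>:<lineno>: Secret Leakage: <secret type> '<secret>'
sub parse_leak_line {
    my ($line) = @_;
    if ($line =~ /^(.+?):(\d+): Secret Leakage: (.+?) '(.*)'$/) {
        return { path => $1, lineno => $2, type => $3, secret => $4 };
    }
    return;    # not a report line
}

# Illustrative report line; the secret type name is an assumption.
my $example = "lib/Config.pm:12: Secret Leakage: AWS Access Key ID 'AKIA1234567890ABCDEF'";
if (my $leak = parse_leak_line($example)) {
    printf "%s line %d: %s\n", $leak->{path}, $leak->{lineno}, $leak->{type};
}
```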
Sometimes you need to have a pseudo-secret in a file. Perhaps it's a credential used only in your test environment or as an example. You can mark these secrets so that this script disregards them. If you can, add the following mark on the same line as your pseudo-secret, like this:
my $aws_access_key = 'AKIA1234567890ABCDEF'; ## not a secret leak
The mark is the string "## not a secret leak". The two hashes are part of it!
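As a rough illustration of the same-line mark, the sketch below skips any line carrying it before looking for a secret. The pattern for AWS access key IDs and the helper name leak_in_line are assumptions made for this example; the script's real patterns are not reproduced here.

```perl
use strict;
use warnings;

my $MARK = '## not a secret leak';

# Illustrative pattern only: "AKIA" followed by 16 uppercase letters or digits.
my $AWS_KEY_RE = qr/\b(AKIA[0-9A-Z]{16})\b/;

# Hypothetical helper: return the secret found in a line, or nothing
# if the line is clean or carries the mark.
sub leak_in_line {
    my ($line) = @_;
    return if index($line, $MARK) >= 0;    # marked: disregard the line
    return $1 if $line =~ $AWS_KEY_RE;
    return;
}
```

With this sketch, the marked line shown above yields nothing, while the same line without the trailing mark yields the key.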
Sometimes you can't put the mark on the same line. The lines that begin a private key, for example, have no room for anything else. In these cases you can skip a whole block by marking its beginning and end like this:
## not a secret leak begin
my $rsa_private_key = <<EOS;
-----BEGIN RSA PRIVATE KEY-----
izfrNTmQLnfsLzi2Wb9xPz2Qj9fQYGgeug3N2MkDuVHwpPcgkhHkJgCQuuvT+qZI
MbS2U6wTS24SZk5RunJIUkitRKeWWMS28SLGfkDs1bBYlSPa5smAd3/q1OePi4ae
<...>
8S86b6zEmkser+SDYgGketS2DZ4hB+vh2ujSXmS8Gkwrn+BfHMzkbtio8lWbGw0l
eM1tfdFZ6wMTLkxRhBkBK4JiMiUMvpERyPib6a2L6iXTfH+3RUDS6A==
-----END RSA PRIVATE KEY-----
EOS
## not a secret leak end
None of the lines inside the block will be reported as leaks.
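Block skipping can be pictured as a small state machine over the added lines. This is a sketch of the idea rather than the script's implementation; the helper name lines_to_scan is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical filter: return only the lines that should still be scanned.
sub lines_to_scan {
    my @lines = @_;
    my $skipping = 0;
    my @keep;
    for my $line (@lines) {
        # Test the block marks first, since both contain the same-line mark.
        if ($line =~ /## not a secret leak begin/) { $skipping = 1; next; }
        if ($line =~ /## not a secret leak end/)   { $skipping = 0; next; }
        next if $skipping;                          # inside a marked block
        next if $line =~ /## not a secret leak/;    # same-line mark
        push @keep, $line;
    }
    return @keep;
}
```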
EXIT CODES
The script exits with the number of secrets found. So, it succeeds if no secret is found and fails if it finds at least one.
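A caller written in Perl could interpret the result roughly as sketched below. The helper name describe_status and the file name diff.patch are hypothetical. Since exit statuses are limited to the range 0 to 255 on POSIX systems, it seems safer to treat any nonzero value as "secrets found" than to rely on the exact count.

```perl
use strict;
use warnings;

# Hypothetical helper: interpret a wait status as left in $? by system().
sub describe_status {
    my ($status) = @_;
    return 'could not run the script' if $status == -1;
    return 'killed by a signal'       if $status & 127;
    my $exit = $status >> 8;          # the script's exit code
    return $exit == 0
        ? 'no secrets found'
        : "secrets found (exit code $exit)";
}

# Usage sketch (diff.patch is a hypothetical file holding git-diff output):
#   system('find-secret-leakage-in-git-diff.pl', 'diff.patch');
#   print describe_status($?), "\n";
```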
SEE ALSO
How Bad Can It Git? Characterizing Secret Leakage in Public GitHub Repositories
This blog post summarizes a paper of the same name, which studies how secrets such as API keys, authorization tokens, and private keys are commonly leaked by being inadvertently pushed to GitHub repositories. The study found that this is much more common than one would think, and it identifies which kinds of secrets are most commonly leaked this way. Moreover, it shows specific regular expressions which can be used to detect such secrets in text. This is the main source of inspiration for this script.
AUTHOR
Gustavo L. de M. Chaves <gnustavo@cpan.org>
COPYRIGHT AND LICENSE
This software is copyright (c) 2024 by CPQD <www.cpqd.com.br>.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.