NAME

Locale::Simple::Scraper - scraper to find translation tokens in a directory

VERSION

version 0.109

SYNOPSIS

use Locale::Simple::Scraper 'scrape';

# Parses @ARGV-style options and prints a .pot template to STDOUT.
scrape(
    '--ignore', 'node_modules',
    '--ignore', 'build',
    '--output', 'po',
);

DESCRIPTION

Locale::Simple::Scraper walks the current working directory, parses recognised source files with Parser::MGC, extracts every l()/ln()/lp()/lnp()/ld()/ldn()/ldp()/ldnp() call it can resolve statically, and writes a .pot (gettext template) or one of a few alternative output formats.

The scraper is usually invoked through the bin/locale_simple_scraper script; the scrape function is exported so that build tools can call it directly.

EXPORTED FUNCTIONS

scrape(@argv)

Runs the scraper. @argv is parsed with Getopt::Long. The function reads files recursively from the current working directory and prints the result to STDOUT.
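Because scrape works on the current working directory and prints to STDOUT, a build script usually has to change directory and redirect STDOUT around the call. The following is a minimal sketch of that wrapping, not part of the module: capture_in_dir is a hypothetical helper, and a stand-in code reference replaces the real scrape call so the example runs without the module installed.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Spec;
use File::Temp qw(tempdir);

# Hypothetical helper (not part of Locale::Simple::Scraper): run $code with
# the working directory set to $dir and STDOUT redirected to $outfile.
sub capture_in_dir {
    my ( $dir, $outfile, $code ) = @_;
    my $old_cwd = getcwd();

    # Open the target before chdir so a relative $outfile resolves against
    # the caller's directory.
    open( my $out, '>', $outfile ) or die "Cannot write $outfile: $!";

    chdir $dir or die "Cannot chdir to $dir: $!";
    my $ok = eval {
        local *STDOUT = $out;    # Perl-level prints to STDOUT now land in $out
        $code->();
        1;
    };
    my $err = $@;

    chdir $old_cwd or die "Cannot chdir back to $old_cwd: $!";
    close $out or die "Cannot close $outfile: $!";
    die $err unless $ok;
    return $outfile;
}

my $project = tempdir( CLEANUP => 1 );    # directory to be scanned
my $outdir  = tempdir( CLEANUP => 1 );    # where the template is written
my $pot     = File::Spec->catfile( $outdir, 'messages.pot' );

# Stand-in for the real call. In a build script this would be:
#   sub { scrape( '--ignore', 'node_modules' ) }
capture_in_dir( $project, $pot, sub { print qq{msgid ""\n} } );
```

The helper restores the original working directory even when the wrapped code dies, so a failed scrape does not leave the rest of the build running in the wrong place.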

OPTIONS

--js=EXT, --pl=EXT, --py=EXT, --tx=EXT

Comma-separated list of extra file extensions to recognise as the given language. The built-in defaults (.js; .pl/.pm/.t; .py; .tx) are always included.
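As an illustration of "extras are added to the defaults", the sketch below merges comma-separated values into the built-in extension lists. It is not the module's implementation: extension_map is a hypothetical helper, and accepting an optional leading dot in EXT is an assumption.

```perl
use strict;
use warnings;

# Built-in defaults as listed above, keyed by option name.
my %DEFAULT_EXT = (
    js => ['js'],
    pl => [ 'pl', 'pm', 't' ],
    py => ['py'],
    tx => ['tx'],
);

# Hypothetical helper: merge comma-separated extras into the defaults and
# return a lookup table from file extension to language.
sub extension_map {
    my (%extra) = @_;
    my %lang_for;
    for my $lang ( keys %DEFAULT_EXT ) {
        my @extras = grep { length } split /\s*,\s*/, ( $extra{$lang} // '' );
        for my $ext ( @{ $DEFAULT_EXT{$lang} }, @extras ) {
            ( my $bare = $ext ) =~ s/^\.//;    # assumption: ".jsx" and "jsx" both allowed
            $lang_for{$bare} = $lang;
        }
    }
    return \%lang_for;
}

# Roughly what --js=jsx,mjs --pl=psgi is described as doing.
my $map = extension_map( js => 'jsx,mjs', pl => 'psgi' );
```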

--ignore=REGEX

Skip any file whose path matches this regex. Repeatable.

--only=REGEX

Process only files whose path matches one of the given regexes. Repeatable, and can be used together with --ignore.
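One plausible reading of how the two filters interact is sketched below: a path is kept when it matches at least one --only pattern (or none were given) and matches no --ignore pattern. This is an assumption for illustration, not the module's code; wanted_path is a hypothetical name, and the exact form of the paths the regexes are matched against (for example with or without a leading "./") is not specified here.

```perl
use strict;
use warnings;

# Hypothetical predicate modelling --only and --ignore together.
sub wanted_path {
    my ( $path, $only, $ignore ) = @_;

    # With --only patterns present, at least one has to match.
    return 0 if @$only && !grep { $path =~ /$_/ } @$only;

    # Any matching --ignore pattern excludes the path.
    return 0 if grep { $path =~ /$_/ } @$ignore;

    return 1;
}

my @only   = ( '^lib/', '^templates/' );
my @ignore = ( 'node_modules', '\.min\.js$' );

my @kept = grep { wanted_path( $_, \@only, \@ignore ) } qw(
    lib/App.pm
    lib/vendor/node_modules/x.js
    templates/index.tx
    templates/app.min.js
    t/basic.t
);
# @kept now holds lib/App.pm and templates/index.tx
```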

--output=FORMAT

Output format: po (default), perl (Data::Dumper), json (requires JSON) or yaml (requires YAML).
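Since json and yaml depend on optional modules, a build script may want to check that the module is loadable before choosing a format. A minimal sketch, assuming the module names given above; format_available is a hypothetical helper and not part of the scraper.

```perl
use strict;
use warnings;

# Module needed per output format, taken from the option description above.
my %REQUIRES = (
    po   => undef,
    perl => 'Data::Dumper',
    json => 'JSON',
    yaml => 'YAML',
);

# Hypothetical helper: true when the module behind $format can be loaded.
sub format_available {
    my ($format) = @_;
    return 0 unless exists $REQUIRES{$format};
    my $module = $REQUIRES{$format} or return 1;    # po needs nothing extra
    ( my $file = "$module.pm" ) =~ s{::}{/}g;
    return eval { require $file; 1 } ? 1 : 0;
}

# Prefer JSON and fall back to the default po output.
my $format = format_available('json') ? 'json' : 'po';
my @argv   = ( '--output', $format );    # ready to pass on to scrape(@argv)
```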

--md5

Hash filenames with MD5 before writing them, which is useful when paths may contain sensitive information. Requires Digest::MD5.
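To give a sense of what a hashed filename looks like, the snippet below runs a path through Digest::MD5. It only illustrates the digest itself; exactly which string the scraper hashes and how the result appears in the output is not shown here, and the path is a made-up example.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Made-up path; the digest is always 32 lowercase hex characters,
# regardless of how long the input is.
my $path   = 'lib/Secret/Customer.pm';
my $hashed = md5_hex($path);

print "$path => $hashed\n";
```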

--no_line_numbers

Omit #: source-location comments from the .pot.

SUPPORTED CALLS

Recognised call shapes (identical across Perl, Python and JavaScript):

l( msgid, ... )
ln( msgid, msgid_plural, n, ... )
lp( msgctxt, msgid, ... )
lnp( msgctxt, msgid, msgid_plural, n, ... )
ld( domain, msgid, ... )
ldn( domain, msgid, msgid_plural, n, ... )
ldp( domain, msgctxt, msgid, ... )
ldnp( domain, msgctxt, msgid, msgid_plural, n, ... )

Strings must be statically resolvable: only literal strings and static concatenation ("a" . "b" in Perl/Python, "a" + "b" in JS, "a" ~ "b" in Xslate Kolon) are supported. Strings built with runtime interpolation are not extracted, so keep msgids constant and pass dynamic data as sprintf arguments.
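The difference between extractable and non-extractable calls can be sketched in Perl as follows. The l and ln subs defined here are simplified stand-ins so the file runs without Locale::Simple installed (passing n on to sprintf is an assumption); real code would import the functions from Locale::Simple instead.

```perl
use strict;
use warnings;

# Stand-ins only, not the Locale::Simple implementation.
sub l { my ( $id, @args ) = @_; return sprintf( $id, @args ) }

sub ln {
    my ( $id, $plural, $n, @args ) = @_;
    return sprintf( $n == 1 ? $id : $plural, $n, @args );
}

my $name  = 'Ada';
my $count = 3;

# Extractable: literal msgid, dynamic data passed as a sprintf argument.
my $greeting = l( 'Hello, %s!', $name );

# Extractable: static concatenation of two literals.
my $notice = l( 'Your session has expired. ' . 'Please log in again.' );

# Extractable: both plural forms are literals; $count is only the selector.
my $inbox = ln( 'You have %d message', 'You have %d messages', $count );

# Not extractable: the msgid depends on a runtime value.
my $skipped = l("Hello, $name!");
```

The first and last calls produce the same text at run time, but only the first gives the scraper a constant msgid to put into the template.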

SEE ALSO

Locale::Simple, Locale::Simple::Scraper::Parser.

SUPPORT

Issues

Please report bugs and feature requests on GitHub at https://github.com/Getty/locale-simple/issues.

CONTRIBUTING

Contributions are welcome! Please fork the repository and submit a pull request.

AUTHOR

Torsten Raudssus <getty@cpan.org>

COPYRIGHT AND LICENSE

This software is Copyright (c) 2026 by Torsten Raudssus https://raudssus.de/.

This is free software, licensed under:

The MIT (X11) License