NAME

greple - extensible grep with lexical expression and region control

VERSION

Version 10.02

SYNOPSIS

greple [-Mmodule] [ -options ] pattern [ file... ]

PATTERN
  pattern              'and +must -not ?optional &function'
  -x, --le   pattern   lexical expression (same as bare pattern)
  -e, --and  pattern   pattern match across line boundary
  -r, --must pattern   pattern cannot be compromised
  -t, --may  pattern   pattern may exist
  -v, --not  pattern   pattern not to be matched
  -E, --re   pattern   regular expression
      --fe   pattern   fixed expression
  -f, --file file      file contains search pattern
  --select index       select indexed pattern from -f file
MATCH
  -i, --ignore-case    ignore case
  -G, --capture-group  match capture groups rather than the whole pattern
  -S, --stretch        stretch the matched area to the enclosing block
  --need=[+-]n         required positive match count
  --allow=[+-]n        acceptable negative match count
  --matchcount=n[,m]   required match count for each block
STYLE
  -l                   list filename only
  -c                   print count of matched block only
  -n                   print line number
  -b                   print block number
  -H, -h               do or do not display filenames
  -o                   print only the matching part
  --all                print entire data
  -m, --max=n[,m]      max count of blocks to be shown
  -A,-B,-C [n]         after/before/both match context
  --join               remove newline in the matched part
  --joinby=string      replace newline in the matched text with a string
  --nonewline          do not add newline character at the end of block
  --filestyle=style    how filenames are printed (once, separate, line)
  --linestyle=style    how line numbers are printed (separate, line)
  --blockstyle=style   how block numbers are printed (separate, line)
  --separate           set filestyle, linestyle, blockstyle "separate"
  --format LABEL=...   define the format for line number and file name
  --frame-top          top frame line
  --frame-middle       middle frame line
  --frame-bottom       bottom frame line
FILE
  --glob=glob          glob target files
  --chdir=dir          change directory before search
  --readlist           get filenames from stdin
COLOR
  --color=when         use terminal colors (auto, always, never)
  --nocolor            same as --color=never
  --colormap=color     R, G, B, C, M, Y, etc.
  --colorsub=...       shortcut for --colormap="sub{...}"
  --colorful           use default multiple colors
  --colorindex=flags   color index method: Ascend/Descend/Block/Random/Unique/Group/GP
  --random             use a random color each time (--colorindex=R)
  --uniqcolor          use a different color for each unique string (--colorindex=U)
  --uniqsub=func       preprocess function to check uniqueness
  --ansicolor=s        ANSI color 16, 256 or 24bit
  --[no]256            same as --ansicolor 256 or 16
  --regioncolor        use different color for inside and outside regions
  --face               enable or disable visual effects
BLOCK
  -p, --paragraph      enable paragraph mode
  --border=pattern     specify a border pattern
  --block=pattern      specify a block of records
  --blockend=s         block-end mark (Default: "--")
  --join-blocks        join consecutive blocks that are back-to-back
REGION
  --inside=pattern     select matches inside of pattern
  --outside=pattern    select matches outside of pattern
  --include=pattern    limit matches to the area
  --exclude=pattern    limit matches to outside of the area
  --strict             enable strict mode for --inside/outside --block
CHARACTER CODE
  --icode=name         input file encoding
  --ocode=name         output file encoding
FILTER
  --if,--of=filter     input/output filter command
  --pf=filter          post-process filter command
  --noif               disable the default input filter
RUNTIME FUNCTION
  --begin=func         call a function before starting the search
  --end=func           call a function after completing the search
  --prologue=func      call a function before executing the command
  --epilogue=func      call a function after executing the command
  --postgrep=func      call a function after each grep operation
  --callback=func      callback function for each matched string
OTHER
  --usage[=expand]     show this help message
  --exit=n             set the command exit status
  --norc               skip reading startup file
  --man                display the manual page for the command or module
  --show               display the module file contents
  --path               display the path to the module file
  --error=action       action to take after a read error occurs
  --warn=type          runtime error handling type
  --alert [name=#]     set alert parameters (size/time)
  -d flags             display info (f:file d:dir c:color m:misc s:stat)

INSTALL

CPANMINUS

$ cpanm App::Greple

SUMMARY

greple is a grep-like tool designed for searching structured text such as source code and documents.

While it can be used for general text search, greple excels at searching source code, structured documents, and multi-byte text where context and precision matter.

DESCRIPTION

MULTIPLE KEYWORDS

AND

greple can take multiple search patterns with the -e option, but unlike the egrep(1) command, it searches for them in AND context. For example, the next command prints lines that contain all of foo, bar and baz.

greple -e foo -e bar -e baz ...

Each word can appear in any order and any place in the string. So this command finds all of the following lines.

foo bar baz
baz bar foo
the foo, bar and baz

OR

If you want to use OR syntax, use a regular expression.

greple -e foo -e bar -e baz -e 'yabba|dabba|doo'

This command will print lines that contain all of foo, bar and baz and one or more of yabba, dabba or doo.

Multiple patterns may be described in a file line-by-line, and specified with the -f option. In that case, the blocks matching any of the patterns in that file will be displayed.

greple -f patterns ...

If multiple files are specified, the patterns within each file are evaluated in OR context, and the files themselves in AND context.

The two commands below work the same way.

greple -f <(echo $'foo\nbar\nbaz') -f <(echo $'yabba\ndabba\ndoo')

greple -e 'foo|bar|baz' -e 'yabba|dabba|doo'
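The AND/OR combination above can be illustrated with a small Perl sketch. This is not greple's implementation; match_all is a hypothetical helper that only models "every pattern must match", with OR expressed as alternation inside a single pattern.

```perl
use strict;
use warnings;

# Hypothetical helper: true when the line matches every pattern (AND context).
sub match_all {
    my ($line, @patterns) = @_;
    for my $re (@patterns) {
        return 0 unless $line =~ $re;
    }
    return 1;
}

# foo AND bar AND baz AND (yabba OR dabba OR doo)
my @patterns = (qr/foo/, qr/bar/, qr/baz/, qr/yabba|dabba|doo/);

my @lines = (
    "foo bar baz",
    "baz bar foo and dabba",
    "the foo, bar and baz, yabba doo",
);

my @hits = grep { match_all($_, @patterns) } @lines;
print "$_\n" for @hits;
```

The first sample line is rejected because it satisfies the three AND keywords but none of the OR alternatives.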

NOT

Use the -v option to specify a keyword that should not be found in the data record. The next example shows lines that contain both foo and bar but none of yabba, dabba or doo.

greple -e foo -e bar -v yabba -v dabba -v doo
greple -e foo -e bar -v 'yabba|dabba|doo'

MAY

When you are focusing on multiple words, there may be words that are not necessary but would be of interest if they were present.

Use the --may or -t (tentative) option to specify that kind of word. Such words are searched for and highlighted if they exist, but they are optional.

The next command prints all lines including foo and bar, and highlights baz as well.

greple -e foo -e bar -t baz

MUST

Option --must or -r is another way to make keywords optional. If a required keyword exists, all other positive-match keywords become optional. The next command is equivalent to the above example.

greple -r foo -r bar -e baz

LEXICAL EXPRESSION

greple takes the first argument as a search pattern specified by the --le option. In the --le pattern, you can set multiple keywords in a single parameter. Keywords are separated by spaces, and the first character of each describes its type.

none  And pattern            : --and  -e
+     Required pattern       : --must -r
-     Negative match pattern : --not  -v
?     Optional pattern       : --may  -t

Just like internet search engines, you can simply provide foo bar baz to search for lines including all of them.

greple 'foo bar baz'

The next command shows lines that include foo but not bar, and highlights baz if it exists.

greple 'foo -bar ?baz'
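As a rough illustration of the prefix table above, the following Perl sketch sorts the tokens of a lexical expression into the four categories. parse_le is a hypothetical helper, not part of greple, and it deliberately ignores the &function form listed in the synopsis.

```perl
use strict;
use warnings;

# Hypothetical helper: classify each space-separated token of a lexical
# expression by its first character.
sub parse_le {
    my ($expr) = @_;
    my %type = ('+' => 'must', '-' => 'not', '?' => 'may');
    my %kw   = (and => [], must => [], not => [], may => []);
    for my $token (split ' ', $expr) {
        if ($token =~ /^([-+?])(.+)/) {
            push @{ $kw{ $type{$1} } }, $2;   # prefixed keyword
        } else {
            push @{ $kw{and} }, $token;       # bare keyword: AND pattern
        }
    }
    return \%kw;
}

my $kw = parse_le('foo -bar ?baz');
printf "%-4s: %s\n", $_, join(',', @{ $kw->{$_} }) for qw(and must not may);
```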

greple searches for a given pattern across line boundaries. This is especially useful for handling Asian multi-byte text, Japanese in particular. Japanese text can be broken by a newline almost anywhere, so the text matching a search pattern may be spread over multiple lines.

For a list of ASCII words, the space character in the pattern matches any kind of whitespace, including newlines. The next example searches for the word sequence foo, bar and baz, even if the words are spread over multiple lines.

greple -e 'foo bar baz'

Option -e is necessary because space is taken as a token separator in the bare or --le pattern.

FLEXIBLE BLOCKS

The default data block that greple searches and prints is a line. With the --paragraph (or -p for short) option, a series of text lines delimited by empty lines is treated as a record block. So the next command prints whole paragraphs that contain the words foo, bar and baz.

greple -p 'foo bar baz'

A block can also be defined by a pattern. The next command treats the data as a series of 10-line units.

greple -n --border='(.*\n){1,10}' pattern
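To see why that regex yields 10-line units, here is a minimal Perl sketch. It assumes that a block border falls at the start of each match, which is an interpretation for illustration rather than a description of greple's internals; the sample text is made up.

```perl
use strict;
use warnings;

# 25 sample lines
my $text = join '', map("line $_\n", 1 .. 25);

# Assumption: every match start is a block border. Each match greedily
# consumes up to 10 lines, so borders land every 10 lines.
my @borders;
push @borders, $-[0] while $text =~ /(?:.*\n){1,10}/g;
push @borders, length $text;

# Count the lines between consecutive borders.
my @line_counts;
for my $i (0 .. $#borders - 1) {
    my $block = substr $text, $borders[$i], $borders[$i + 1] - $borders[$i];
    push @line_counts, scalar(() = $block =~ /\n/g);
}
print "block sizes: @line_counts\n";
```

The last block is shorter because only five lines remain after two full units.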

You can also define arbitrarily complex blocks by writing a module or script.

greple -Myour_module --block '&your_function' ...

MATCH AREA CONTROL

Using the --inside and --outside options, you can specify the text area to be matched. The next commands search only in the mail header and body area, respectively. In these cases the data block is not changed, so lines that contain the pattern in the specified area are printed.

greple --inside '\A(.+\n)+' pattern

greple --outside '\A(.+\n)+' pattern

The --inside/--outside options can be used repeatedly to expand the area to be matched. There are similar options, --include/--exclude, but they are used to trim down the area.

These four options also accept a user-defined function, so arbitrarily complex regions can be used.
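The header/body split used in the --inside and --outside examples can be tried in plain Perl. This sketch only models the idea of classifying matches by region; the sample message and variable names are made up, and greple's actual region handling is more general.

```perl
use strict;
use warnings;

my $mail = "From: alice\nSubject: hello\n\nhello from the body\n";

# Header area: the run of non-empty lines at the very top of the data,
# the same regex as in the --inside example.
$mail =~ /\A(.+\n)+/ or die "no header found";
my ($hstart, $hend) = ($-[0], $+[0]);

# Classify each match of the search pattern by where it falls.
my (@inside, @outside);
while ($mail =~ /hello/g) {
    my ($s, $e) = ($-[0], $+[0]);
    if ($s >= $hstart && $e <= $hend) {
        push @inside, $s;
    } else {
        push @outside, $s;
    }
}
print "inside header:  @inside\n";
print "outside header: @outside\n";
```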

MODULE AND CUSTOMIZATION

Users can define default options and their own options in ~/.greplerc. The next example always enables colored output and defines a new option using macro processing.

option default --color=always

define :re1 complex-regex-1
define :re2 complex-regex-2
define :re3 complex-regex-3
option --newopt --inside :re1 --exclude :re2 --re :re3

A specific set of functions and option interfaces can be implemented as a module. Modules are invoked by the -M option immediately after the command name.

For example, greple does not have a recursive search option, but one can be implemented with the --readlist option, which accepts a list of target files from standard input. Using the find module, it can be written like this:

greple -Mfind . -type f -- pattern

The dig module also implements a more complex search. It can be used as simply as this:

greple -Mdig pattern --dig .

but this command is ultimately translated into the following option list.

greple -Mfind . ( -name .git -o -name .svn -o -name RCS ) -prune -o
    -type f ! -name .* ! -name *,v ! -name *~
    ! -iname *.jpg ! -iname *.jpeg ! -iname *.gif ! -iname *.png
    ! -iname *.tar ! -iname *.tbz  ! -iname *.tgz ! -iname *.pdf
    -print -- pattern

INCLUDED MODULES

The distribution includes some sample modules. Read the documentation in each module for details. You can read it with the --man option or the perldoc command.

greple -Mdig --man

perldoc App::Greple::dig

If the --man option does not work, use perldoc App::Greple::dig.

Other modules are available on CPAN, or in the git repositories under https://github.com/kaz-utashiro/.

OPTIONS

PATTERNS

If no positive pattern option is given (i.e. other than --not and --may), greple takes the first argument as a search pattern specified by the --le option. All of these patterns can be specified multiple times.

The command itself is written in Perl, and any kind of Perl-style regular expression can be used in patterns. See perlre(1) for details.

Note that the multi-line modifier (m) is set when patterns are executed, so put (?-m) at the beginning of the regex if you want to explicitly disable it.

The order of capture groups in the pattern is not guaranteed. Avoid absolute group indexes, and use relative or named capture groups instead. For example, to search for repeated characters, use (\w)\g{-1} or (?<c>\w)\g{c} rather than (\w)\1.
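The following Perl sketch shows why an absolute index is fragile. It assumes, purely for illustration, that the user's pattern ends up embedded in a larger regex that already contains a capture group; the $prefix group is hypothetical and does not reflect how greple actually combines patterns.

```perl
use strict;
use warnings;

# Hypothetical extra group placed in front of the user's pattern.
my $prefix = qr/(x?)/;

my %user = (
    absolute => qr/(\w)\1/,            # \1 now points at the prefix group
    relative => qr/(\w)\g{-1}/,        # always the group just before
    named    => qr/(?<c>\w)\g{c}/,     # always the group named "c"
);

my %found;
for my $kind (sort keys %user) {
    my $combined = qr/$prefix$user{$kind}/;
    $found{$kind} = "hello" =~ $combined ? $& : undef;
}

printf "%-8s => %s\n", $_, $found{$_} // '(no match)' for sort keys %found;
```

With the extra group in front, the absolute form no longer finds the doubled letter, while the relative and named forms still match ll.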

Extended Bracketed Character Classes ((?[...])) and Variable Length Lookbehind can be used without warnings. See "Extended Bracketed Character Classes" in perlrecharclass and "(?<=pattern)" in perlre.

In the above pattern options, space characters are treated specially. They are replaced by a pattern that matches any number of whitespace characters, including newlines, so the pattern can extend over multiple lines. The next command searches for the word sequence foo bar baz even if the words are separated by newlines.

greple -e 'foo bar baz'

This is done by converting the pattern foo bar baz to foo\s+bar\s+baz, so that the word separator can match one or more whitespace characters.

For Asian wide characters, the pattern is cooked so that zero or more whitespace characters are allowed between any two characters. So the Japanese string pattern 日本語 is converted to 日\s*本\s*語.

If you don't want these conversions, use the -E (or --re) option.
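The two conversions described above can be approximated in a few lines of Perl. cook_pattern is a hypothetical helper written for this illustration, and treating "wide characters" as Han, Hiragana and Katakana is an assumption; greple's own conversion may cover a different character set.

```perl
use strict;
use warnings;
use utf8;

# Assumption: approximate "Asian wide characters" with these scripts.
my $wide = qr/[\p{Han}\p{Hiragana}\p{Katakana}]/;

# Hypothetical helper mimicking the conversion described above.
sub cook_pattern {
    my ($pat) = @_;
    # Allow optional whitespace between two adjacent wide characters.
    $pat =~ s/(?<=$wide)(?=$wide)/\\s*/g;
    # A run of literal spaces matches one or more whitespace characters.
    $pat =~ s/ +/\\s+/g;
    return $pat;
}

binmode STDOUT, ':encoding(UTF-8)';
print cook_pattern('foo bar baz'), "\n";
print cook_pattern('日本語'), "\n";

# The cooked pattern matches across a line break.
my $re = cook_pattern('foo bar baz');
print "matched across newline\n" if "foo\nbar  baz" =~ /$re/;
```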

Related options: --inside/--outside/--include/--exclude ("REGIONS"), --block ("BLOCKS")

STYLES

Related options: --block/-p ("BLOCKS"), --color/--colormap ("COLORS")

FILES

COLORS

Related options: -o ("STYLES"), --inside/--outside/--include/--exclude ("REGIONS")

BLOCKS

Related options: -b/--block-number ("STYLES"), -A/-B/-C ("STYLES"), --inside/--outside/--include/--exclude ("REGIONS")

REGIONS

Related options: --block ("BLOCKS"), --regioncolor ("COLORS"), -e/-v ("PATTERNS")

CHARACTER CODE

FILTER

RUNTIME FUNCTIONS

For these run-time functions, an optional argument list can be given in the form of key or key=value, separated by commas. These arguments are passed to the function as a key => value list. A sole key gets the value 1. The name of the file being processed is also passed, under the key given by the FILELABEL constant. As a result, options in the following forms:

--begin function(key1,key2=val2)
--begin function=key1,key2=val2

will be transformed into the following function call:

function(&FILELABEL => "filename", key1 => 1, key2 => "val2")

As described earlier, the FILELABEL parameter is not given to a function specified with the module option. So

-Mmodule::function(key1,key2=val2)
-Mmodule::function=key1,key2=val2

simply becomes:

function(key1 => 1, key2 => "val2")

The function can be defined in .greplerc or in modules. Assign the arguments to a hash, and you can then access them as members of the hash. It is safe to delete the FILELABEL key if you expect arbitrary parameters to be given. The content of the target file can be accessed through $_. The ampersand (&) is required to keep the hash key from being interpreted as a bareword.

sub function {
    my %arg = @_;
    my $filename = delete $arg{&FILELABEL};
    $arg{key1};             # 1
    $arg{key2};             # "val2"
    $_;                     # contents
}
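The translation from option text to argument list can be sketched as follows. parse_call is a hypothetical parser written for this illustration; it handles only the two forms shown above and leaves out the FILELABEL entry that greple adds for run-time functions.

```perl
use strict;
use warnings;

# Hypothetical parser: accepts "name(k1,k2=v2)" or "name=k1,k2=v2" and
# returns the function name followed by a key => value list.
sub parse_call {
    my ($spec) = @_;
    my ($name, $args) = $spec =~ /^(\w+)(?|\((.*)\)|=(.*))?$/
        or die "cannot parse: $spec";
    my @kv = map {
        my ($k, $v) = split /=/, $_, 2;
        ($k, $v // 1);                  # a sole key gets the value 1
    } split /,/, $args // '';
    return ($name, @kv);
}

for my $spec ('function(key1,key2=val2)', 'function=key1,key2=val2') {
    my ($name, %arg) = parse_call($spec);
    print "$name: key1=$arg{key1} key2=$arg{key2}\n";
}
```

Both forms produce the same list, which is why they are interchangeable on the command line.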

OTHERS

ENVIRONMENT and STARTUP FILE

Before starting execution, greple reads the file named .greplerc in the user's home directory. The following directives can be used.

Environment variable substitution is performed on strings specified by the option and define directives. Use the Perl syntax $ENV{NAME} for this purpose. You can use this to make a portable module.

When greple finds a __PERL__ line in the .greplerc file, the rest of the file is evaluated as a Perl program. You can define your own subroutines there for use with the --inside/--outside, --include/--exclude and --block options.

For those subroutines, the file content is provided in the global variable $_. The subroutine is expected to return a list of array references, each holding a start and end offset pair.

For example, suppose that the following function is defined in your .greplerc file. The start and end offsets of each pattern match are available as the array elements $-[0] and $+[0].

__PERL__
sub odd_line {
    my @list;
    my $i;
    while (/.*\n/g) {
        push(@list, [ $-[0], $+[0] ]) if ++$i % 2;
    }
    @list;
}

You can use the next command to search for the pattern in odd-numbered lines.

% greple --inside '&odd_line' pattern files...
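To check what such a region function returns, it can be exercised outside greple. The sketch below uses the same logic as odd_line but takes the text as an argument; odd_line_regions and the sample text are names made up for this illustration.

```perl
use strict;
use warnings;

# Same logic as odd_line, but the text is passed in so the function can
# run standalone (inside greple the content arrives in $_).
sub odd_line_regions {
    local $_ = shift;
    my (@list, $i);
    while (/.*\n/g) {
        push @list, [ $-[0], $+[0] ] if ++$i % 2;
    }
    return @list;
}

my $text    = "one\ntwo\nthree\nfour\n";
my @regions = odd_line_regions($text);

# Each region is a [start, end] offset pair into the text.
for my $r (@regions) {
    my ($start, $end) = @$r;
    printf "%2d-%2d: %s", $start, $end, substr($text, $start, $end - $start);
}
```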

MODULE

You can extend the greple command using modules. Module files are placed in the App/Greple/ directory in the Perl library, and therefore have the package name App::Greple::module.

On the command line, a module has to be specified before any other options, in the form -Mmodule. However, it can also be specified at the beginning of an option expansion.

If the package name is declared properly, the __DATA__ section in the module file is interpreted the same way as .greplerc file content, so you can declare module-specific options there. Functions declared in the module can be used from those options, which makes highly extensible option/programming interaction possible.

Using -M without a module argument prints the list of available modules. Option --man displays the module documentation when used with the -M option. Use the --show option to see the module itself. Option --path prints the path of the module file.

See this sample module code. This sample defines options to search in the pod, comment and other segments of a Perl script. These capabilities can be implemented both as functions and as macros.

package App::Greple::perl;

use Exporter 'import';
our @EXPORT      = qw(pod comment podcomment);
our %EXPORT_TAGS = ( );
our @EXPORT_OK   = qw();

use App::Greple::Common;
use App::Greple::Regions;

my $pod_re = qr{^=\w+(?s:.*?)(?:\Z|^=cut\s*\n)}m;
my $comment_re = qr{^(?:\h*#.*\n)+}m;

sub pod {
    match_regions(pattern => $pod_re);
}
sub comment {
    match_regions(pattern => $comment_re);
}
sub podcomment {
    match_regions(pattern => qr/$pod_re|$comment_re/);
}

1;

__DATA__

define :comment: ^(\s*#.*\n)+
define :pod: ^=(?s:.*?)(?:\Z|^=cut\s*\n)

#option --pod --inside :pod:
#option --comment --inside :comment:
#option --code --outside :pod:|:comment:

option --pod --inside '&pod'
option --comment --inside '&comment'
option --code --outside '&podcomment'

You can use the module like this:

greple -Mperl --pod default greple

greple -Mperl --colorful --code --comment --pod default greple

If the special subroutines initialize() and finalize() are defined in the module, they are called at the beginning, with the Getopt::EX::Module object as the first argument. The second argument is a reference to @ARGV, and you can modify the actual @ARGV through it. See the App::Greple::find module for an example.

The calling sequence is as follows. See Getopt::EX::Module for details.

1) Call initialize()
2) Call function given in -Mmod::func() style
3) Call finalize()
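As a minimal sketch of the two hooks, the module below consumes its own arguments up to "--" in finalize() and leaves the rest for the main command. The package name App::Greple::sample is hypothetical, the hooks are called by hand here with undef in place of the Getopt::EX::Module object, and the real App::Greple::find code may be organized differently.

```perl
package App::Greple::sample;    # hypothetical module name

use strict;
use warnings;

sub initialize {
    my ($mod, $argv) = @_;      # $mod: module object, $argv: reference to @ARGV
    # Nothing to prepare in this sketch.
    return;
}

sub finalize {
    my ($mod, $argv) = @_;
    # Take everything up to "--" as module-private arguments and remove
    # it from the caller's argument list.
    my @own;
    while (@$argv) {
        my $arg = shift @$argv;
        last if $arg eq '--';
        push @own, $arg;
    }
    return @own;
}

package main;

# Simulate the calling sequence by hand.
my @argv = qw(. -type f -- pattern);
App::Greple::sample::initialize(undef, \@argv);
my @own = App::Greple::sample::finalize(undef, \@argv);

print "module args: @own\n";
print "greple args: @argv\n";
```

Because the hooks receive a reference, the shift calls modify the caller's array directly, which is what makes rewriting @ARGV possible.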

HISTORY

Most capabilities of greple are derived from the mg command, which the same author has been developing since the early 1990s. Because the modern standard grep family of commands has come to have similar capabilities, it is time to clean up the entire functionality, totally remodel the option interface, and change the command name. (2013.11)

SEE ALSO

grep(1), perl(1)

App::Greple, App::Greple::Grep

https://github.com/kaz-utashiro/greple

Getopt::EX, https://github.com/kaz-utashiro/Getopt-EX

AUTHOR

Kazumasa Utashiro

LICENSE

Copyright 1991-2026 Kazumasa Utashiro

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.