=for Pod::Coverage empty
=head1 NAME
File::Globstar::ListMatch - Match File Names Against List Of Patterns
=head1 SYNOPSIS
    use File::Globstar::ListMatch;
    # Parse from file.
    $matcher = File::Globstar::ListMatch->new('.gitignore',
                                              ignoreCase => 1,
                                              isExclude => 1);
    # Parse from file handle.
    $matcher = File::Globstar::ListMatch->new(\*STDIN, ignoreCase => 0);
    # Parse list of patterns.  Comments and blank lines are not
    # stripped!
    $matcher = File::Globstar::ListMatch->new([
        'src/**/*.o',
        '.*',
        '!.gitignore'
    ], filename => 'exclude.txt');
    # Parse string.
    $patterns = <<EOF;
    # Ignore all compiled object files.
    src/**/*.o
    # Ignore all hidden files.
    .*
    # But not this one.
    !.gitignore
    EOF
    $matcher = File::Globstar::ListMatch->new(\$patterns);
    $filename = 'path/to/hello.o';
    if ($matcher->match($filename)) {
        print "Ignore '$filename'.\n";
    }
=head1 DESCRIPTION
Files containing lists of file name patterns are very common as arguments to
command-line options such as "--ignore", "--exclude" or "--include".  A
well-known example is the syntax used by L<git|https://git-scm.com/> as
described in L<https://git-scm.com/docs/gitignore>.
This module implements the same functionality in Perl.
While the module will normally be used for matching against filenames, no
filesystem operations are done.  Only strings are compared.  The reason for
this is that it should be possible to match names of deleted files as
well as existing files.
When you read the documentation below, you may come to the conclusion that
using this module is really complicated given that there are so many special
rules about slashes, exclamation marks, asterisks/stars and so on.  In fact,
these rules are only hard to describe precisely.  They pretty much do
exactly what you expect from them and what you have been used to ever since
you first entered a "*" in a terminal window.
That being said, do not spend too much time thinking about what the difference
between "/src/backup" and "src/backup" in a F<.gitignore> file is.  Just try
it out and leave it as it is once it works.  Only if you want to know
why "/src/backup" and "src/backup" do exactly the same thing do you need this
documentation as a reference.
Similarly, the description of the "ignore mode" and "include mode" below
sounds pretty convoluted.  But the mere presence of the word "ignore" in the
name of the F<.gitignore> file switches your brain to "ignore mode" and the
subtle difference between the two modes just reflects what your brain expects.
=head2 INPUT FILE FORMAT
Unless you pass a reference to an array of patterns as input, comments and
blank lines are discarded.
=over 4
=item B<Comments>
Comments are lines that start with a hash-sign.  You can escape hash-signs that
are part of the pattern with a backslash: "\#".
=item B<Blank Lines>
Blank lines are empty lines or lines consisting of whitespace only.  Whitespace
is a sequence of US-ASCII whitespace characters:  horizontal tabs
(ASCII 9), line feeds (ASCII 10), vertical tabs (ASCII 11), form feeds
(ASCII 12), carriage returns (ASCII 13), and space (ASCII 32).  Other
characters with the Unicode property "WSpace=Y" are not considered whitespace.
=back
Leading whitespace (see above) is I<not> removed!  Likewise, whitespace between
the leading negation "!" and the pattern is not removed!  On the other hand,
in order to ignore a file named "   " you have to backslash-escape
the first space character ("\   ").  Otherwise, the pattern will be interpreted
as a blank line and ignored.  This is consistent with the behavior of Git.
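The filtering rules above can be sketched in plain Perl.  This is only an
illustration of the documented behavior, not the module's actual
implementation; the helper name C<significant_lines> is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper that illustrates the documented input rules.
# It is NOT part of File::Globstar::ListMatch.
sub significant_lines {
    my ($text) = @_;

    my @keep;
    foreach my $line (split /\n/, $text) {
        # A comment starts with a hash-sign in the very first column.
        next if $line =~ /^#/;
        # A blank line is empty or made up of US-ASCII whitespace only
        # (ASCII 9 to 13 and ASCII 32).
        next if $line =~ /^[\x09-\x0d\x20]*\z/;
        push @keep, $line;
    }

    return @keep;
}

my $input = join "\n",
    '# a comment',
    '',
    '   ',
    'src/**/*.o',
    '\#not-a-comment',
    '\   ',
    ' !kept-with-leading-space',
    '';

print "[$_]\n" foreach significant_lines($input);
```

Only the last four lines of the sample input survive: the pattern, the two
backslash-escaped lines, and the line with leading whitespace, which is kept
verbatim because leading whitespace is not removed.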
=head2 PATTERNS
Patterns undergo a little preprocessing:
=over 4
=item B<A leading exclamation mark is stripped off>
But the pattern is now negated.  It produces a match for all files that do
I<not> match the pattern.  A literal exclamation mark can be escaped with
a backslash.
=item B<A leading slash is stripped off>
But the pattern must match the entire significant "path" (actually string).
For example "/foobar" matches for "/foobar" but not for "/sub/sub/foobar".
=item B<A trailing slash is stripped off>
But the pattern can only match "directories".  This is why the method
B<match()> below has an optional second argument that lets you specify
whether the string to be matched is considered a directory or not.
=back
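As a rough sketch, the three preprocessing steps could look like this in plain
Perl.  The helper C<preprocess_pattern> and its hash keys are hypothetical and
only record which rules applied; the order of the steps is an assumption, and
the module's real implementation may differ.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT the module's implementation: it only records
# which of the three preprocessing rules applied to a pattern.
sub preprocess_pattern {
    my ($pattern) = @_;

    my %rule = (negated => 0, anchored => 0, directory_only => 0);

    # Leading "!": negate.  An escaped "\!" does not start with "!" and is
    # therefore left alone (this sketch does not remove the backslash).
    $rule{negated} = 1 if $pattern =~ s{\A!}{};
    # Leading "/": the pattern must match the entire path.
    $rule{anchored} = 1 if $pattern =~ s{\A/}{};
    # Trailing "/": the pattern can only match directories.
    $rule{directory_only} = 1 if $pattern =~ s{/\z}{};

    $rule{pattern} = $pattern;

    return \%rule;
}

foreach my $pattern ('!/build/', 'docs/', '/foobar', '\!important') {
    my $rule = preprocess_pattern($pattern);
    printf "%-12s => pattern=%s negated=%d anchored=%d directory_only=%d\n",
        $pattern, @{$rule}{qw(pattern negated anchored directory_only)};
}
```

For "!/build/" all three flags are set and the remaining pattern is "build",
while "\!important" is not negated because its first character is the
backslash.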
=head2 MATCHING ALGORITHM
The string (normally a filename) passed to the matcher is compared
to all patterns in turn.  If none of the patterns match or if the last
match was against a negated pattern, the overall result is false.  Otherwise it
is true.
If a pattern contains a slash ("/"), the match is performed against the full
path name, otherwise just against the basename of the file
with a leading "directory" part stripped off.
If a pattern starts with a leading slash, that slash is stripped off for
the purpose of comparison but the string must match relative to the
base "directory".
The semantics of a match are the same as for B<fnmatchstar()> in
L<File::Globstar>.  But keep in mind that a leading exclamation mark (for
negation), a leading "directory" part, a leading slash, or a trailing slash
may be stripped off according to the rules outlined above.
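The "last match wins" evaluation described in this section can be sketched as
follows.  The function C<last_match_wins>, the rule hashes, and the
hand-written regular expressions standing in for compiled patterns are all
assumptions made for illustration; the module itself uses the semantics of
B<fnmatchstar()>.

```perl
use strict;
use warnings;

# Hypothetical sketch of the "last match wins" evaluation.  Each rule is a
# hash with a hand-written regex standing in for the compiled pattern.
sub last_match_wins {
    my ($string, @rules) = @_;

    my $result = 0;
    foreach my $rule (@rules) {
        my $subject = $string;
        # Patterns without a slash only see the basename.
        $subject =~ s{.*/}{} unless $rule->{has_slash};
        next unless $subject =~ $rule->{regex};
        # A later match overrides every earlier one.
        $result = $rule->{negated} ? 0 : 1;
    }

    return $result;
}

my @rules = (
    # Stand-in for ".*"
    { regex => qr{\A\.[^/]*\z}, negated => 0, has_slash => 0 },
    # Stand-in for "!.gitignore"
    { regex => qr{\A\.gitignore\z}, negated => 1, has_slash => 0 },
);

foreach my $name ('sub/.hidden', '.gitignore', 'README') {
    printf "%-12s %s\n", $name,
        last_match_wins($name, @rules) ? 'match' : 'no match';
}
```

Here F<sub/.hidden> matches, F<.gitignore> does not because the negated
pattern is the last one that matches it, and F<README> does not because no
pattern matches at all.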
=head1 METHODS
=over 4
=item B<new INPUT[, %OPTIONS]>
Creates a new B<File::Globstar::ListMatch> object.  B<INPUT> can be:
=over 8
=item B<FILE>
B<FILE> can be a filename, an open file handle, or a reference to an open
file handle.
=item B<STRINGREF>
B<STRINGREF> is a reference to a string containing the patterns.
=item B<ARRAYREF>
You can also pass a list of patterns as an array reference.  Leading
exclamation marks ("!") followed by possible whitespace for negating patterns
are stripped off, but otherwise all patterns are taken as is, even blank
ones and patterns starting with a hash-sign ("#").
=back
The input source can be followed by optional named arguments passed as
key-value pairs.  Currently recognized:
=over 4
=item B<ignoreCase =E<gt> 0|1|undef>
Controls whether to ignore case when matching.  A true value will cause
case to be ignored.  The default value is false, so that matches are done
in a case-sensitive manner.  This default is appropriate for both
case-sensitive and case-preserving file systems.
=item B<filename =E<gt> FILENAME>
Use B<FILENAME> in messages for I/O errors.
=back
=item B<match STRING[, IS_DIRECTORY]>
Returns true if B<STRING> matches.  If you pass a true value for the
optional argument B<IS_DIRECTORY>, B<STRING> is considered to be the name
of a directory.
A leading slash in B<STRING> is stripped off and is ignored.
A trailing slash is also ignored but B<STRING> is then considered to
be a directory name and B<IS_DIRECTORY> is ignored.
When comparing against a pattern that contains a slash (except for a trailing
slash), the full string is taken into account.  Otherwise, only the part
after the last (non-trailing) slash is taken into account.
Note that B<match()> assumes that you are excluding or ignoring files.  This
I<exclude> mode implies that certain negations do not make sense and are
ignored.  Take this example:
     /node_modules
     !/node_modules/foobar
The second line gets ignored.  In exclude mode it is assumed that you
recurse a directory calling B<match()> for every file you visit.  If a file
matches it is always ignored.  And if it is a directory, it is not only
ignored but the recursion also stops here.
The exact rule is: You cannot re-include a file by negating a pattern if one
of the file's parent directories would be excluded.
This is the behavior of Git when evaluating ignore lists.  If you want to
avoid that behavior, see below for B<matchInclude()>.
On the other hand, the following is possible:
    docs/_*
    !docs/_posts
The only "parent directory" for the negation is F<docs> and that is not
excluded.  This is different from this example:
    docs/_*
    !docs/_posts/recent
The "parent directories" are F<docs> and F<docs/_posts> and F<docs/_posts>
matches against F<docs/_*>.  The negation is therefore invalid and ignored.
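The parent-directory rule can be illustrated with a small plain-Perl sketch.
It is one possible way to model exclude mode, not the module's code: the
functions C<list_match> and C<is_excluded> are hypothetical, and the regular
expressions are hand-written stand-ins for the patterns from the two examples
above.

```perl
use strict;
use warnings;

# Plain "last match wins" over hand-written stand-in regexes.
sub list_match {
    my ($path, $rules) = @_;
    my $result = 0;
    foreach my $rule (@$rules) {
        next unless $path =~ $rule->{regex};
        $result = $rule->{negated} ? 0 : 1;
    }
    return $result;
}

# Hypothetical sketch of exclude mode: walk from the top-most parent
# directory down to the path itself and stop at the first excluded level.
sub is_excluded {
    my ($path, $rules) = @_;
    my @parts = split m{/}, $path;
    foreach my $depth (0 .. $#parts) {
        my $prefix = join '/', @parts[0 .. $depth];
        return 1 if list_match($prefix, $rules);
    }
    return 0;
}

# Stand-ins for "docs/_*" followed by a negation.
my $star = { regex => qr{\Adocs/_[^/]*\z}, negated => 0 };
my @valid   = ($star, { regex => qr{\Adocs/_posts\z},        negated => 1 });
my @invalid = ($star, { regex => qr{\Adocs/_posts/recent\z}, negated => 1 });

print is_excluded('docs/_posts', \@valid) ? "excluded\n" : "kept\n";
print is_excluded('docs/_posts/recent', \@invalid) ? "excluded\n" : "kept\n";
```

The first call prints "kept" because the negation applies at the same level
as the match.  The second prints "excluded" because the walk already stops at
F<docs/_posts>, so the negated pattern is never consulted.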
=item B<matchExclude STRING[, IS_DIRECTORY]>
This is an alias for B<match()>.
=item B<matchInclude STRING[, IS_DIRECTORY]>
Does the same as B<match()> but all negations are valid.
The metaphor for include mode is different from the exclude mode that is
assumed by B<match()> and its alias B<matchExclude()>.  In include mode you
interpret all patterns as globbing patterns.  A positive, non-negated
pattern causes the matching files to be added to the result list.  A
negated pattern will remove the matching files from the result list.  The
operation happens recursively if a directory gets added or removed.
Take almost the same example as for B<match()> above:
    docs/_*
    !docs/_posts/archive
Imagine line 1 would produce the following list:
=over 4
=item * F<docs/_views/>
=item * F<docs/_views/main.html>
=item * F<docs/_views/head/>
=item * F<docs/_views/head/meta.html>
=item * F<docs/_posts/new/>
=item * F<docs/_posts/new/post4321.html>
=item * F<docs/_posts/archive/>
=item * F<docs/_posts/archive/post1.html>
=item * F<docs/_posts/archive/post2.html>
=back
Line 2 would then remove the last three matches from the result list.
=item B<patterns>
Returns the patterns as a list of compiled regular expressions.  This is
probably only useful for testing the module itself.
=back
=head1 BUGS AND CAVEATS
The module interprets backslashes only for escaping.  It does not assume any
path separator semantics for them.  This should normally not be a problem.
Git ignores all hidden files by default.  If you want the same behavior for
B<File::Globstar::ListMatch>, put a ".*" in front of the patterns.
=head1 COPYRIGHT
Copyright (C) 2016-2023 Guido Flohr <guido.flohr@cantanea.com>,
all rights reserved.
=head1 SEE ALSO
L<File::Globstar>, L<File::Glob>, glob(3), glob(7), fnmatch(3), glob(1),
perl(1)