#!/usr/bin/env perl
# PODNAME: sets
# ABSTRACT: set operations in Perl
use Log::Log4perl::Tiny qw< :easy >;
Log::Log4perl->easy_init(
   {
      layout => '[%d] [%-5p] %m%n',
      level  => $INFO,
   }
);
use App::Sets;
App::Sets->run(@ARGV);
__END__
=pod
=head1 NAME
sets - set operations in Perl
=head1 VERSION
version 0.978
=pod
=encoding utf-8
=head1 USAGE
   sets [--usage] [--help] [--man] [--version]
   sets [--binmode|-b <string>]
        [--cache-sorted|-S <suffix>]
        [--internal-sort|-I]
        [--loglevel|-l <level>]
        [--sorted|-s]
        [--trim|-t]
        expression...
=head1 EXAMPLES
   # intersect two files
   sets file1 ^ file2
   # things are speedier when files are sorted
   sets -s sorted-file1 ^ sorted-file2
   # you can use a bit of caching if needed, generating sorted files
   # automatically for multiple or later reuse. For example, the
   # following is the symmetric difference, where the input files are
   # sorted only twice in total (once per file)
   sets -S .sorted '(file1 - file2) + (file2 - file1)'
   # In the example above, note that expressions with grouping need to be
   # specified in a single string.
   # leading and trailing whitespace sometimes only leads to trouble, so
   # you can trim data on the fly
   sets -t file1-unix - file2-dos
=head1 DESCRIPTION
This program lets you perform set operations on input files.
The set operations that can be performed are the following:
=over
=item B<< intersection >>
the binary operation that selects all the elements that are in both
the left-hand and the right-hand operands. This operation can be specified with
any of the following operators:
=over
=item B<< intersect >>
=item B<< i >>
=item B<< I >>
=item B<< & >>
=item B<< ^ >>
=back
=item B<< union >>
the binary operation that selects all the elements that are in either
the left-hand or the right-hand operand. This operation can be specified with
any of the following operators:
=over
=item B<< union >>
=item B<< u >>
=item B<< U >>
=item B<< v >>
=item B<< V >>
=item B<< | >>
=item B<< + >>
=back
=item B<< difference >>
the binary operation that selects all the elements that are in
the left-hand but not in the right-hand operand. This operation can be
specified with any of the following operators:
=over
=item B<< minus >>
=item B<< less >>
=item B<< \ >>
=item B<< - >>
=back
=back
Expressions can be grouped with parentheses, so that you can set the
precedence of the operations and create complex aggregations. For
example, the following expression computes the symmetric difference
between the two sets:
   (set1 - set2) + (set2 - set1)
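As a rough illustration of what these operators compute, the sketch
below reproduces the three operations and the symmetric difference with
the standard C<sort> and C<comm> utilities. It assumes that each line of
a file is one element and that duplicate lines collapse; it is not how
C<sets> itself is implemented, and the sample files C<set1> and C<set2>
are made up for the example.

```shell
# Illustration only: plain sort/comm, not the sets program itself.
# Assumption: one element per line; set1 and set2 are throwaway sample files.
export LC_ALL=C                      # byte-wise ordering, so sort and comm agree
workdir=$(mktemp -d)
printf '%s\n' b a c > "$workdir/set1"
printf '%s\n' c d b > "$workdir/set2"

# comm needs sorted input; -u also drops duplicate lines
sort -u "$workdir/set1" > "$workdir/set1.sorted"
sort -u "$workdir/set2" > "$workdir/set2.sorted"

# set1 ^ set2: lines present in both files
intersection=$(comm -12 "$workdir/set1.sorted" "$workdir/set2.sorted")
# set1 + set2: lines present in either file
union=$(sort -u "$workdir/set1" "$workdir/set2")
# set1 - set2: lines present only in the left-hand file
left_only=$(comm -23 "$workdir/set1.sorted" "$workdir/set2.sorted")
# (set1 - set2) + (set2 - set1): the symmetric difference
symdiff=$({ comm -23 "$workdir/set1.sorted" "$workdir/set2.sorted"
            comm -13 "$workdir/set1.sorted" "$workdir/set2.sorted"; } | sort -u)

printf 'symmetric difference:\n%s\n' "$symdiff"
rm -rf "$workdir"
```

The symmetric difference is built exactly as the grouped expression
reads: the two one-sided differences are computed first and then merged.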
Expressions should normally be entered as a single string that is then
parsed. In case of I<simple> operations (e.g. one operation on two
sets) you can also provide multiple arguments. In other words, the
following invocations should be equivalent:
   sets 'set1 - set2'
   sets set1 - set2
Options can be specified only as the first parameters. If your first
set begins with a dash, use a double dash to explicitly terminate the
list of options, e.g.:
   sets -- -first-set ^ -second-set
In general, though, the first non-option argument also terminates the
list of options, so the example above would work without the C<--> as
well. In the pathological case that your file is named C<-s>, however,
you need to terminate the options explicitly with C<-->. You get
the idea.
Files with spaces and other weird stuff can be specified by means
of quotes or escapes. The following are all valid methods of subtracting
C<to remove> from C<input file>:
   sets "'input file' - 'to remove'"
   sets '"input file" - "to remove"'
   sets 'input\ file - to\ remove'
   sets "input\\ file - to\\ remove"
   sets input\ file - to\ remove
The first two examples use single and double quoting. The third example
uses a backslash to escape the spaces, as does the fourth, where the
backslash is doubled because the shell processes escapes inside double
quotes. The last example relies on the shell's own escaping rules AND on
the fact that simple expressions like this can be specified as multiple
arguments instead of a single string.
=head1 OPTIONS
=over
=item --binmode | -b I<string>
set a string for calling C<binmode> on STDOUT. By default,
C<:raw:encoding(UTF-8)> is set, to normalize newline handling and
expect UTF-8 data in.
=item --cache-sorted | -S I<suffix>
input files are sorted and saved into a file with the same name and the
I<suffix> appended, so that if this file exists it is used instead of
the input file. In this way it is possible to generate sorted files on
the fly and reuse them if available. For example, suppose that you want
to remove the items in C<removeme> from files C<file1> and C<file2>; in
the following invocations:
   sets file1 - removeme > file1.filtered
   sets file2 - removeme > file2.filtered
the file C<removeme> would be sorted in both calls, while in the
following ones:
   sets -S .sorted file1 - removeme > file1.filtered
   sets -S .sorted file2 - removeme > file2.filtered
it would be sorted only in the first call, which generates
C<removeme.sorted>; the second call then reuses it. Of course you're
trading disk space for speed here, but most of the time that is exactly
what you want when you have disk space to spare but little time to wait.
This means that most of the time you'll want to use this option,
I<unless> you're willing to wait longer or you already know that the
input files are sorted (in which case you would use
C<< --sorted | -s >> instead).
=item --help
print a somewhat more verbose help, showing usage, this description of
the options and some examples from the synopsis.
=item --internal-sort | -I
force using the internal sorting facility even if external C<sort> is
available. Some rough benchmarks showed that this is about 7% slower
than using the external C<sort>, so avoid this if you can.
=item --loglevel | -l I<level>
set the verbosity of the logging subsystem. Allowed values (in decreasing
verbosity): C<TRACE>, C<DEBUG>, C<INFO>, C<WARN>, C<ERROR> and C<FATAL>.
=item --man
print out the full documentation for the script.
=item --sorted | -s
in normal mode, input files are sorted on the fly before being used. If you
know that I<all> your input files are already sorted, you can spare the
extra sorting operation by using this option:
   sets -s file1.sorted ^ file2.sorted
=item --trim | -t
if you happen to have leading and/or trailing whitespace (including
tabs, carriage returns, etc.) that you want to get rid of, you can turn
this option on. This is particularly useful if some files come from the
UNIX world and others from the DOS world, because they have different
ideas about terminating a line.
=item --usage
print a concise usage line and exit.
=item --version
print the version of the script.
=back
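The reuse rule described for C<--cache-sorted> can be pictured with the
following shell sketch. The function name C<sorted_copy> is hypothetical,
and the sketch only captures the documented idea (sort once, save the
result next to the input with the suffix appended, reuse the saved file
if it exists). Whether C<sets> also checks that a cached file is still up
to date is not stated here, so the sketch does not attempt it.

```shell
# Hypothetical helper illustrating the documented caching idea; not code from sets.
# Usage: sorted_copy <input-file> <suffix>; prints the name of the sorted file.
sorted_copy() {
   input=$1
   cache="$1$2"
   if [ ! -e "$cache" ]; then
      # first use: sort the input and keep the result alongside it
      LC_ALL=C sort "$input" > "$cache"
   fi
   # later uses find the cached file and skip the sort
   printf '%s\n' "$cache"
}

workdir=$(mktemp -d)
printf '%s\n' pear apple fig > "$workdir/removeme"
first=$(sorted_copy "$workdir/removeme" .sorted)    # sorts, writes removeme.sorted
second=$(sorted_copy "$workdir/removeme" .sorted)   # finds and reuses removeme.sorted
cached_content=$(cat "$second")
printf '%s\n' "$cached_content"
rm -rf "$workdir"
```

Both calls return the same file name, but only the first one pays for
the sort, which is the disk-space-for-speed trade described above.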
=head1 ENVIRONMENT
Some options can be set from the environment:
=over
=item C<SETS_CACHE>
the same as specifying C<< --cache-sorted | -S I<suffix> >> on the
command line. The content of C<SETS_CACHE> is used as the I<suffix>.
=item C<SETS_INTERNAL_SORT>
the same as specifying C<< --internal-sort | -I >> on the command line.
=item C<SETS_MAX_FILES>
maximum number of (temporary) files to keep when using the internal sorting
facility.
=item C<SETS_MAX_RECORDS>
maximum number of input records to keep in memory when using the internal
sorting facility.
=item C<SETS_SORTED>
the same as specifying C<< --sorted | -s >> on the command line.
=item C<SETS_TRIM>
the same as specifying C<< --trim | -t >> on the command line.
=back
=cut
=head1 AUTHOR
Flavio Poletti <polettix@cpan.org>
=head1 COPYRIGHT AND LICENSE
Copyright (C) 2011-2016 by Flavio Poletti polettix@cpan.org.
This module is free software.  You can redistribute it and/or
modify it under the terms of the Artistic License 2.0.
This program is distributed in the hope that it will be useful,
but without any warranty; without even the implied warranty of
merchantability or fitness for a particular purpose.
=cut