=head1 TITLE

Apocalypse 3: Operators

=head1 AUTHOR

Larry Wall <larry@wall.org>

=head1 VERSION

    Maintainer: Larry Wall <larry@wall.org>
    Date: 2 Oct 2001
    Last Modified: 24 Sep 2004
    Number: 3
    Version: 2
To me, one of the most agonizing aspects of language design is coming
up with a useful system of operators. To other language designers, this
may seem like a silly thing to agonize over. After all, you can view
all operators as mere syntactic sugar -- operators are just funny
looking function calls. Some languages make a feature of leveling all
function calls into one syntax. As a result, the so-called functional
languages tend to wear out your parenthesis keys, while OO languages
tend to wear out your dot key.

But while your computer really likes it when everything looks the same,
most people don't think like computers. People prefer different things
to look different. They also prefer to have shortcuts for common tasks.
(Even the mathematicians don't go for complete orthogonality. Many of
the shortcuts we typically use for operators were, in fact, invented by
mathematicians in the first place.)

So let me enumerate some of the principles that I weigh against each
other when designing a system of operators.
=over

=item * Different classes of operators should look different. That's
why filetest operators look different from string or numeric operators.

=item * Similar classes of operators should look similar. That's why
the filetest operators look like each other.

=item * Common operations should be "Huffman coded." That is,
frequently used operators should be shorter than infrequently used
ones. For how often it's used, the C<scalar> operator of Perl 5 is too
long, in my estimation.

=item * Preserving your culture is important. So Perl borrowed many of
its operators from other familiar languages. For instance, we used
Fortran's C<**> operator for exponentiation. As we go on to Perl 6,
most of the operators will be "borrowed" directly from Perl 5.

=item * Breaking out of your culture is also important, because that is
how we understand other cultures. As an explicitly multicultural
language, Perl has generally done OK in this area, though we can always
do better. Examples of cross-cultural exchange among computer cultures
include XML and Unicode. (Not surprisingly, these features also enable
better cross-cultural exchange among human cultures -- we sincerely
hope.)

=item * Sometimes operators should respond to their context. Perl has
many operators that do different but related things in scalar versus
list context.

=item * Sometimes operators should propagate context to their
arguments. The C<x> operator currently does this for its left argument,
while the short-circuit operators do this for their right argument.

=item * Sometimes operators should force context on their arguments.
Historically, the scalar mathematical operators of Perl have forced
scalar context on their arguments. One of the RFCs discussed below
proposes to revise this.

=item * Sometimes operators should respond polymorphically to the types
of their arguments. Method calls and overloading work this way.

=item * Operator precedence should be designed to minimize the need for
parentheses. You can think of the precedence of operators as a partial
ordering of the operators such that it minimizes the number of
"unnatural" pairings that require parentheses in typical code.

=item * Operator precedence should be as simple as possible. Perl's
precedence table currently has 24 levels in it. This might or might not
be too many. We could probably reduce it to about 18 levels, if we
abandon strict C compatibility of the C-like operators.

=item * People don't actually want to think about precedence much, so
precedence should be designed to match expectations. Unfortunately, the
expectations of someone who knows the precedence table won't match the
expectations of someone who doesn't. And Perl has always catered to the
expectations of C programmers, at least up till now. There's not much
one can do up front about differing cultural expectations.

=back
It would be easy to drive any one of these principles into the ground,
at the expense of other principles. In fact, various languages have
done precisely that.

My overriding design principle has always been that the complexity of
the solution space should map well onto the complexity of the problem
space. Simplification good! Oversimplification bad! Placing artificial
constraints on the solution space produces an impedance mismatch with
the problem space, with the result that using a language that is
artificially simple induces artificial complexity in all solutions
written in that language.
One artificial constraint that all computer languages must deal with is
the number of symbols available on the keyboard, corresponding roughly
to the number of symbols in ASCII. Most computer languages have
compensated by defining systems of operators that include digraphs,
trigraphs, and worse. This works pretty well, up to a point. But it
means that certain common unary operators cannot be used as the end of
a digraph operator. Early versions of C had assignment operators in the
wrong order. For instance, there used to be a C<=-> operator. Nowadays
that's spelled C<-=>, to avoid conflict with unary minus.

By the same token (no pun intended), you can't easily define a unary
C<=> operator without requiring a space before it most of the time,
since so many binary operators end with the C<=> character.

Perl gets around some of these problems by keeping track of whether it
is expecting an operator or a term. As it happens, a unary operator is
simply one that occurs when Perl is expecting a term. So Perl could
keep track of a unary C<=> operator, even if the human programmer might
be confused. So I'd place a unary C<=> operator in the category of
"OK, but don't use it for anything that will cause widespread
confusion." Mind you, I'm not proposing a specific use for a unary
C<=> at this point. I'm just telling you how I think. If we ever do get
a unary C<=> operator, we will hopefully have taken these issues into
account.
While we can disambiguate operators based on whether an operator or a
term is expected, this implies some syntactic constraints as well. For
instance, you can't use the same symbol for both a postfix operator and
a binary operator. So you'll never see a binary C<++> operator in Perl,
because Perl wouldn't know whether to expect a term or operator after
that. It also implies that we can't use the "juxtaposition" operator.
That is, you can't just put two terms next to each other, and expect
something to happen (such as string concatenation, as in I<awk>). What
if the second term started with something that looked like an operator?
It would be misconstrued as a binary operator.
Well, enough of these vague generalities. On to the vague specifics.
The RFCs for this apocalypse are (as usual) all over the map, but don't
cover the map. I'll talk first about what the RFCs do cover, and then
about what they don't. Here are the RFCs that happened to get
themselves classified into chapter 3:

    RFC  PSA  Title
    ---  ---  -----
    024  rr   Data types: Semi-finite (lazy) lists
    025  dba  Operators: Multiway comparisons
    039  rr   Perl should have a print operator
    045  bbb  C<||> and C<&&> should propagate result context to both sides
    054  cdr  Operators: Polymorphic comparisons
    081  abc  Lazily evaluated list generation functions
    082  abc  Arrays: Apply operators element-wise in a list context
    084  abb  Replace => (stringifying comma) with => (pair constructor)
    104  ccr  Backtracking
    138  rr   Eliminate =~ operator.
    143  dcr  Case ignoring eq and cmp operators
    170  ccr  Generalize =~ to a special "apply-to" assignment operator
    283  ccc  C<tr///> in array context should return a histogram
    285  acb  Lazy Input / Context-sensitive Input
    290  bbc  Better english names for -X
    320  ccc  Allow grouping of -X file tests and add C<filetest> builtin

Note that you can click on the following RFC titles to view a copy of
the RFC in question. The discussion sometimes assumes that you've read
the RFC.
=head2 RFC 025: Operators: Multiway comparisons

This RFC proposes that expressions involving multiple chained
comparisons should act like a mathematician would expect. That is, if
you say this:

    0 <= $x < 10

it really means something like:

    0 <= $x && $x < 10

The C<$x> would only be evaluated once, however. (This is very much
like the rewrite rule we use to explain assignment operators such as
C<$x += 3>.)
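For readers who want to see the rewrite spelled out, here is a short
Perl 5 sketch of the intended semantics (the chained syntax itself is
not valid Perl 5, so a temporary holds the middle term to show the
"evaluated once" behavior):

    use strict;
    use warnings;

    # 0 <= EXPR < 10  behaves like  0 <= $tmp && $tmp < 10,
    # where $tmp holds EXPR, evaluated exactly once.
    for my $v (-3, 0, 5, 10, 42) {
        my $tmp = $v;    # the middle term, evaluated once
        my $in_range = (0 <= $tmp && $tmp < 10);
        printf "%3d => %s\n", $v, $in_range ? "in range" : "out of range";
    }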
I started with this RFC simply because it's not of any earthshaking
importance whether I accept it or not. The tradeoff is whether to put
some slight complexity into the grammar in order to save some slight
complexity in some Perl programs. The complexity in the grammar is not
much of a problem here, since it's amortized over all possible uses of
it, and it already matches the known psychology of a great number of
people.
There is a potential interaction with precedence levels, however. If we
choose to allow an expression like:

    0 <= $x == $y < 20

then we'll have to unify the precedence levels of the comparison
operators with the equality operators. I don't see a great problem with
this, since the main reason for having them different was (I believe)
so that you could write an exclusive or of two comparisons, like this:

    $x < 10 != $y < 10

However, Perl has a built-in C<xor> operator, so this isn't really much
of an issue. And there's a lot to be said for forcing parentheses in
that last expression anyway, just for clarity. So unless anyone comes
up with a large objection that I'm not seeing, this RFC is accepted.
=head2 RFC 320: Allow grouping of -X file tests and add C<filetest> builtin

This RFC proposes to allow clustering of file test operators much like
some Unix utilities allow bundling of single character switches. That
is, if you say:

    -drwx $file

it really means something like:

    -d $file && -r $file && -w $file && -x $file

Unfortunately, as proposed, this syntax will simply be too confusing.
We have to be able to negate named operators and subroutines. The
proposed workaround of putting a space after a unary minus is much too
onerous and counterintuitive, or at least countercultural.
The only way to rescue the proposal would be to say that such operators
are autoloaded in some fashion; any negated but I<unrecognized>
operator would then be assumed to be a clustered filetest. This would
be risky in that it would prevent Perl from catching misspelled
subroutine names at compile time when negated, and the error might well
not get caught at run time either, if all the characters in the name
are valid filetests, and if the argument can be interpreted as a
filename or filehandle (which it usually can be). Perhaps it would be
naturally disallowed under C<use strict>, since we'd basically be
treating C<-xyz> as a bareword. On the other hand, in Perl 5, I<all>
method names are essentially in the unrecognized category until run
time, so it would be impossible to tell whether to parse the minus sign
as a real negation. Optional type declarations in Perl 6 would only
help the compiler with variables that are actually declared to have a
type. Fortunately, a negated 1 is still true, so even if we parsed the
negation as a real negation, it might still end up doing the right
thing. But it's all very tacky.
So I'm thinking of a different tack. Instead of bundling the letters:

    -drwx $file

let's think about the trick of returning the value of C<$file> for a
true value. Then we'd write nested unary operators like this:

    -d -r -w -x $file

One tricky thing about that is that the operators are applied right to
left. And they don't really short circuit the way stacked C<&&> would
(though the optimizer could probably fix that). So I expect we could do
this for the default, and if you want the C<-drwx> as an autoloaded
backstop, you can explicitly declare that.
In any event, the proposed C<filetest> built-in need not be built in.
It can just be a universal method. (Or maybe just common to strings and
filehandles?)

My one hesitation in making cascading operators work like that is that
people might be tempted to get cute with the returned filename:

    $handle = open -r -w -x $file or die;

That might be terribly confusing to a lot of people. The solution to
this conundrum is presented at the end of the next section.
=head2 RFC 290: Better english names for -X

This RFC proposes long names as aliases for the various filetest
operators, so that instead of saying:

    -r $file

you might say something like:

    freadable($file)

Actually, there's no need for the C<use english>, I expect. These names
could merely be universal (or nearly universal) methods. In any case,
we should start getting used to the idea that C<mumble($foo)> is
equivalent to C<$foo.mumble()>, at least in the absence of a local
subroutine definition to the contrary. So I expect that we'll see both:

    is_readable($file)

and:

    $file.is_readable
Similar to the cascaded filetest ops in the previous section, one
approach might be that the boolean methods return the object in
question for success so that method calls could be stacked without
repeating the object:

    if ($file.is_dir.is_readable.is_writable.is_executable) {

But C<-drwx $file> could still be construed as more readable, for some
definition of readability. And cascading methods aren't really
short-circuited. Plus, the value returned would have to be something
like "$file is true," to prevent confusion over filename "0."
There is also the question of whether this really saves us anything
other than a little notational convenience. If each of those methods
has to do a I<stat> on the filename, it will be rather slow. To fix
that, what we'd actually have to return would be not the filename, but
some object containing the stat buffer (represented in Perl 5 by the
C<_> character). If we did that, we wouldn't have to play C<$file is
true> games, because a valid stat buffer object would (presumably)
always be true (at least until it's false).
The same argument would apply to cascaded filetest operators we talked
about earlier. An autoloaded C<-drwx> handler would presumably be smart
enough to do a single stat. But we'd likely lose the speed gain by
invoking the autoload mechanism. So cascaded operators (either C<-X>
style or C<.is_XXX> style) are the way to go. They just return objects
that know how to be either boolean or stat buffer objects in context.
This implies you could even say

    $statbuf = -f $file or die "Not a regular file: $file";
    if (-r -w $statbuf) { ... }

This allows us to simplify the special case in Perl 5 represented by
the C<_> token, which was always rather difficult to explain. And
returning a stat buffer instead of C<$file> prevents the confusing:

    $handle = open -r -w -x $file or die;

Unless, of course, we decide to make a stat buffer object return the
filename in a string context. C<:-)>
=head2 RFC 283: C<tr///> in array context should return a histogram

Yes, but ...

While it's true that I put that item into the Todo list ages ago, I
think that histograms should probably have their own interface, since
the histogram should probably be returned as a complete hash in scalar
context, but we can't guess that they'll want a histogram for an
ordinary scalar C<tr///>. On the other hand, it could just be a C</h>
modifier. But we've already done violence to C<tr///> to make it do
character counting without transliterating, so maybe this isn't so far
fetched.
One problem with this RFC is that it does the histogram over the input
rather than the output string. The original Todo entry did not specify
this, but it was what I had intended. But it's more useful to do it on
the resulting characters because then you can use the C<tr///> itself
to categorize characters into, say, vowels and consonants, and then
count the resulting V's and C's.
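To make the idea concrete, here is a small Perl 5 sketch that counts
the I<output> of a transliteration by hand (the proposed histogram
feature would do this for you; the code below is just ordinary Perl 5):

    use strict;
    use warnings;

    # Categorize characters with tr///, then count the categories.
    my $text = "histogram";
    (my $cats = lc $text) =~ tr/aeiou/V/;   # vowels become V
    $cats =~ tr/a-z/C/;                     # remaining letters become C

    my %count;
    $count{$_}++ for split //, $cats;
    print "$_ => $count{$_}\n" for sort keys %count;   # C => 6, V => 3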
On the other hand, I'm thinking that the C<tr///> interface is really
rather lousy, and getting lousier every day. The whole C<tr///>
interface is kind of sucky for any sort of dynamically generated data.
But even without dynamic data, there are serious problems. It was bad
enough when the character set was just ASCII. The basic problem is that
the notation is inside out from what it should be, in the sense that it
doesn't actually show which characters correspond, so you have to count
characters. We made some progress on that in Perl 5 when, instead of:

    tr/abcdefghijklmnopqrstuvwxyz/VCCCVCCCVCCCCCVCCCCCVCCCCC/

we allowed you to say:

    tr[abcdefghijklmnopqrstuvwxyz]
      [VCCCVCCCVCCCCCVCCCCCVCCCCC]

There are also shenanigans you can play if you know that duplicates on
the left side prefer the first mention to subsequent mentions:

    tr/aeioua-z/VVVVVC/

But you're still working against the notation. We need a more explicit
way to put character classes into correspondence.
More problems show up when we extend the character set beyond ASCII.
The use of C<tr///> for case translations has long been
semi-deprecated, because a range like C<tr/a-z/A-Z/> leaves out
characters with diacritics. And now with Unicode, the whole notion of
what is a character is becoming more susceptible to interpretation, and
the C<tr///> interface doesn't tell Perl whether to treat character
modifiers as part of the base character. For some of the double-wide
characters it's even hard to just I<look> at the character and tell if
it's one character or two. Counted character lists are about as modern
as Hollerith strings in Fortran.

So I suspect the C<tr///> syntax will be relegated to being just one
quote-like interface to the actual transliteration module, whose main
interface will be specified in terms of translation pairs, the left
side of which will give a pattern to match (typically a character
class), and the right side will say what to translate anything matching
to. Think of it as a series of coordinated parallel C<s///> operations.
Syntax is still open for negotiation till apocalypse 5. But there can
certainly be a histogram option in there somewhere.
=head2 RFC 084: Replace C<< => >> (stringifying comma) with C<< => >> (pair constructor)

I like the basic idea of pairs because it generalizes to more than just
hash values. Named parameters will almost certainly be implemented
using pairs as well.

I do have some quibbles with the RFC. The proposed C<key> and C<value>
built-ins should simply be lvalue methods on pair objects. And if we
use pair objects to implement entries in hashes, the key must be
immutable, or there must be some way of re-hashing the key if it
changes.

The stuff about using pairs for mumble-but-false is bogus. We'll use
properties for that sort of chicanery. (And multiway comparisons won't
rely on such chicanery in any event. See above.)
=head2 RFC 081: Lazily evaluated list generation functions

Sorry, you can't have the colon--at least, not without sharing it.
Colon will be a kind of "supercomma" that supplies an adverbial list
to some previous operator, which in this case would be the prior colon
or dotdot.

(We can't quite implement C<?:> as a C<:> modifier on C<?>, because the
precedence would be screwy, unless we limit C<:> to a single argument,
which would preclude its being used to disambiguate indirect objects.
More on that later.)

The RFC's proposal concerning C<attributes::get(@a)> stuff is
superseded by value properties. So, C<@a.method()> should just pull out
the variable's properties directly, if the variable is of a type that
supports the methods in question. A lazy list object should certainly
have such methods.
Assignment of a lazy list to a tied array is a problem unless the tie
implementation handles laziness. By default a tied array is likely to
enforce immediate list evaluation. Immediate list evaluation doesn't
work on infinite lists. That means it's gonna fill up your disk drive
if you try to say something like:

    @my_tied_file = 1..Inf;
Laziness should be possible, but not necessarily the norm. It's all
very well to delay the evaluation of "pure" functions in the realm of
math, since presumably you get the same result no matter when you
evaluate. But a lot of Perl programming is done with real world data
that changes over time. Saying C<somefunc($a..$b)> can get terribly
fouled up if C<$b> can change, and the lazy function still refers to
the variable rather than its instantaneous value. On the other hand,
there is overhead in taking snapshots of the current state.

On the gripping hand, the lazy list object I<is> the snapshot of the
values, so that's not a problem in this case. Forget I mentioned it.
The tricky thing about lazy lists is not the lazy lists themselves, but
how they interact with the rest of the language. For instance, what
happens if you say:

    @lazy = 1..Inf;
    @lazy[5] = 42;

Is C<@lazy> still lazy after it is modified? Do we remember that
C<@lazy[5]> is an "exception", and continue to generate the rest of the
values by the original rule? What if C<@lazy> is going to be generated
by a recursive function? Does it matter whether we've already generated
C<@lazy[5]>?
And how do we explain this simply to people so that they can
understand? We will have to be very clear about the distinction between
the abstraction and the concrete value. I'm of the opinion that a lazy
list is a definition of the I<default> values of an array, and that the
actual values of the array override any default values. Assigning to a
previously memoized element overrides the memoized value.
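As an illustration of that "defaults overridden by actual values"
model, here is a Perl 5 sketch using a closure as the generator and a
hash of explicit assignments as the overrides (none of this is Perl 6
syntax; the names are made up for the example):

    use strict;
    use warnings;

    my %override;
    my $generator = sub { my $i = shift; $i + 1 };    # stands in for 1..Inf

    sub lazy_get {
        my $i = shift;
        return exists $override{$i} ? $override{$i} : $generator->($i);
    }

    $override{5} = 42;    # like @lazy[5] = 42
    print join(" ", map { lazy_get($_) } 0 .. 9), "\n";   # 1 2 3 4 5 42 7 8 9 10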
It would help the optimizer to have a way to declare "pure" array
definitions that can't be overridden.

Also consider this:

    @array = (1..100, 100..10000:100);

A single flat array can have multiple lazy lists as part of its default
definition. We'll have to keep track of that, which could get
especially tricky if the definitions start overlapping via slice
definitions.

In practice, people will treat the default values as real values. If
you pass a lazy list into a function as an array argument, the function
will probably not know or care whether the values it's getting from the
array are being generated on the fly or were there in the first place.

I can think of other cans of worms this opens, and I'm quite certain
I'm too stupid to think of them all. Nevertheless, my gut feeling is
that we can make things work more like people expect rather than less.
And I was always a little bit jealous that REXX could have arrays with
default values. C<:-)>

[Update: Turns out that all lists are lazy by default. Use unary C<**>
to force a non-lazy list evaluation immediately.]
=head2 RFC 285: Lazy Input / Context-sensitive Input

Solving this with C<want()> is the wrong approach, but I think the
basic idea is sound because it's what people expect. And the C<want()>
should in fact be unnecessary. Essentially, if the right side of a list
assignment produces a lazy list, and the left side requests a finite
number of elements, the list generator will only produce enough to
satisfy the demand. It doesn't need to know how many in advance. It
just produces another scalar value when requested. The generator
doesn't have to be smart about its context. The motto of a lazy list
generator should be, "Ours is not to question why, ours is but to do
(the next one) or die."

It will be tricky to make this one work right:

    ($first, @rest) = 1 .. Inf;
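The demand-driven behavior itself is easy to picture in Perl 5 terms:
a generator only ever has to hand out "the next one." Here is a small
sketch with a closure standing in for the lazy C<1 .. Inf> (the helper
names are invented for the example):

    use strict;
    use warnings;

    sub counter_from {
        my $n = shift;
        return sub { return $n++ };    # produce one more value on demand
    }

    my $gen   = counter_from(1);             # stands in for 1 .. Inf
    my $first = $gen->();                    # one element was requested
    my @some  = map { $gen->() } 1 .. 4;     # ...and a few more
    print "$first; @some\n";                 # 1; 2 3 4 5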
=head2 RFC 082: Arrays: Apply operators element-wise in a list context

APL, here we come... :-)

This is by far the most difficult of these RFCs to decide, so I'm going
to be doing a lot of thinking out loud here. This is research--or at
least, a search. Please bear with me.

I expect that there are two classes of Perl programmers--those that
would find these "hyper" operators natural, and those that wouldn't.
Turning this feature on by default would cause a lot of heartburn for
people who (from Perl 5 experience) expect arrays to always return
their length under scalar operators even in list context. It can
reasonably be argued that we need to make the scalar operators default,
but make it easy to turn on hyper operators within a lexical scope. In
any event, both sets of operators need to be visible from
anywhere--we're just arguing over who gets the short, traditional
names. All operators will presumably have longer names for use as
function calls anyway. Instead of just naming an operator with long
names like:

    operator:+
    operator:/

the longer names could distinguish "hyperness" like this:

    @a scalar:+ @b
    @a list:/ @b
That implies they could also be called like this:

    scalar:+(@a, @b)
    list:/(@a, @b)

We might find some short prefix character that stands in for "list" or
"scalar". The obvious candidates are C<@> and C<$>:

    @a $+ @b
    @a @/ @b

Unfortunately, in this case, "obvious" is synonymous with "wrong".
These operators would be completely confusing from a visual point of
view. If the main psychological point of putting noun markers on the
nouns is so that they stand out from the verbs, then you don't want to
put the same markers on the verbs. It would be like the Germans
starting to capitalize all their words instead of just their nouns.

Instead, we could borrow a singular/plural memelet from shell globbing,
where C<*> means multiple characters, and C<?> means one character:

    @a ?+ @b
    @a */ @b

But that has a bad ambiguity. How do you tell whether C<**> is an
exponentiation or a list multiplication? So if we went that route, we'd
probably have to say:

    @a ?:+ @b
    @a *:/ @b

Or some such. But if we're going that far in the direction of
gobbledygook, perhaps there are prefix characters that wouldn't be so
ambiguous. The colon and the dot also have a visual singular/plural
value:

    @a .+ @b
    @a :/ @b
We're already changing the old meaning of dot (and I'm planning to
rescue colon from the C<?:> operator), so perhaps that could be made to
work. You could almost think of dot and colon as complementary method
calls, where you could say:

    $len = @a . length;
    @len = @a : length;

But that would interfere with other desirable uses of colon. Plus, it's
actually going to be confusing to think of these as singular and plural
operators because, while we're specifying that we want a "plural"
operator, we're not specifying how to treat the plurality. Consider
this:

    @len = list:length(@a);
Anyone would naively think that returns the length of the list, not the
length of each element of the list. To make it work in English, we'd
actually have to say something like this:

    @len = each:length(@a);
    $len = the:length(@a);

That would be equivalent to the method calls:

    @len = @a.each:length;
    $len = @a.the:length;
But does this really mean that there are two array methods with those
weird names? I don't think so. We've reached a result here that is
spectacularly close to a I<reductio ad absurdum>. It seems to me that
the whole point of this RFC is that the "eachness" is most simply
specified by the list context, together with the knowledge that
C<length()> is a function/method that maps one scalar value to another.
The distribution of that function over an array value is not something
the scalar function should be concerned with, except insofar as it must
make sure its type signature is correct.

And there's the rub. We're really talking about enforced strong typing
for this to work right. When we say:

    @foo = @bar.mumble

How do we know whether C<mumble> has the type signature that magically
enables iteration over C<@bar>? That definition is off in some other
file that we may not have memorized quite yet. We need some more
explicit syntax that says that auto-iteration is expected, regardless
of whether the definition of the operator is well specified. Magical
auto-iteration is not going to work well in a language with optional
typing.
So the resolution of this is that the unmarked forms of operators will
force scalar context as they do in Perl 5, and we'll need a special
marker that says an operator is to be auto-iterated. That special
marker turns out to be an uparrow, with a tip o' the hat to
higher-order functions. That is, the hyper-operator:

    @a ^* @b

is equivalent to this:

    parallel { $^a * $^b } @a, @b

(where C<parallel> is a hypothetical function that iterates through
multiple arrays in parallel.)
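If it helps to see the element-wise idea in familiar terms, here is a
Perl 5 sketch of what such a hyper multiplication would compute
(assuming, for simplicity, two arrays of the same length):

    use strict;
    use warnings;

    my @a = (1, 2, 3);
    my @b = (10, 20, 30);

    # Apply the scalar * operator element-wise, in parallel.
    my @c = map { $a[$_] * $b[$_] } 0 .. $#a;
    print "@c\n";    # 10 40 90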
[Update: These days hyper operators are marked with German quotes:
C<»*«>. We stole C<^> for exclusive-or junctions.]

Hyper operators will also intuit where a dimension is missing from one
of its arguments, and replicate a scalar value to a list value in that
dimension. That means you can say:

    @a ^+ 1

to get a value with one added to each element of C<@a>. (C<@a> is
unchanged.)
I don't believe there are any insurmountable ambiguities with the
uparrow notation. There is currently an uparrow operator meaning
exclusive-or, but that is rarely used in practice, and is not typically
followed by other operators when it is used. We can represent
exclusive-or with C<~> instead. (I like that idea anyway, because the
unary C<~> is a 1's complement, and the binary C<~> would simply be
doing a 1's complement on the second argument of the set bits in the
first argument. On the other hand, there's destructive interference
with other cultural meanings of tilde, so it's not completely obvious
that it's the right thing to do. Nevertheless, that's what we're
doing.)

[Update: Except we're not. Unary and binary C<~> are now string
operators, and C's bitwise ops have been demoted to longer operators
with a prefix.]

Anyway, in essence, I'm rejecting the underlying premise of this RFC,
that we'll have strong enough typing to intuit the right behavior
without confusing people. Nevertheless, we'll still have easy-to-use
(and more importantly, easy-to-recognize) hyper-operators.
This RFC also asks about how return values for functions like C<abs()>
might be specified. I expect sub declarations to (optionally) include a
return type, so this would be sufficient to figure out which functions
would know how to map a scalar to a scalar. And we should point out
again that even though the base language will not try to intuit which
operators should be hyperoperators, there's no reason in principle that
someone couldn't invent a dialect that does. All is fair if you
predeclare.
=head2 RFC 045: C<||> and C<&&> should propagate result context to both sides

Yes. The thing that makes this work in Perl 6, where it was almost
impossible in Perl 5, is that in Perl 6, list context doesn't imply
immediate list flattening. More precisely, it specifies immediate list
flattening in a notional sense, but the implementation is free to delay
that flattening until it's actually required. Internally, a flattened
list is still an object. So when C<@a || @b> evaluates the arrays,
they're evaluated as objects that can return either a boolean value or
a list, depending on the context. And it will be possible to apply both
contexts to the first argument simultaneously. (Of course, the computer
actually looks at it in the boolean context first.)

There is no conflict with RFC 81 because the hyper versions of these
operators will be spelled:

    @a ^|| @b
    @a ^&& @b
[Update: That'd be C<»||«> and C<»&&«> now.]
=head2 RFC 054: Operators: Polymorphic comparisons

I'm not sure of the performance hit of backstopping numeric equality
with string equality. Maybe vtables help with this. But I think this
RFC is proposing something that is too specific. The more general
problem is how you allow variants of built-ins, not just for C<==>, but
for other operators like C<< <=> >> and C<cmp>, not to mention all the
other operators that have scalar and list variants.

A generic equality operator could potentially be supplied by operator
definition. I expect that a similar mechanism would allow us to define
how abstract a comparison C<cmp> would do, so we could sort and collate
according to the various defined levels of Unicode.
The argument that you can't do generic programming is somewhat
specious. The problem in Perl 5 is that you can't name operators, so
you couldn't pass in a generic operator in place of a specific one even
if you wanted to. I think it's more important to make sure all
operators have real function names in Perl 6:

    operator:+($a, $b);
    operator:^+(@a, @b);

    my sub operator:<?> ($a, $b) { ... }
    if ($a <?> $b) { ... }
    @sorted = collate \&operator:<?>, @unicode;
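The Perl 5 workaround for this, of course, is to wrap the operator in a
named subroutine and pass a reference to that. A quick sketch of the
kind of C<collate> call shown above, written in plain Perl 5 (the sub
names are invented for the example):

    use strict;
    use warnings;

    sub caselessly { lc($_[0]) cmp lc($_[1]) }   # a named comparison "operator"

    sub collate {
        my ($cmp, @list) = @_;
        return sort { $cmp->($a, $b) } @list;
    }

    my @sorted = collate(\&caselessly, qw(banana Apple cherry));
    print "@sorted\n";    # Apple banana cherry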
[Update: This role is now filled in part by the C<~~> smartmatch
operator. Also, there's no need to name hyper operators--they're always
constructed artificially.]
=head2 RFC 104: Backtracking

As proposed, this can easily be done with an operator definition to
call a sequence of closures. I wonder whether the proposal is complete,
however. There should probably be more make-it-didn't-happen semantics
to a backtracking engine. If Prolog unification is emulated with an
assignment, how do you later unassign a variable if you backtrack past
it?

Ordinarily, temporary values are scoped to a block, but we're using
blocks differently here, much like parens are used in a regex. Later
parens don't undo the "unifications" of earlier parens.

In normal imperative programming these temporary determinations are
remembered in ordinary scoped variables and the current hypothesis is
extended via recursion. An C<andthen> operator would need to have a way
of keeping BLOCK1's scope around until BLOCK2 succeeds or fails. That
is, in terms of lexical scoping:

    {BLOCK1} andthen {BLOCK2}

needs to work more like

    {BLOCK1 andthen {BLOCK2}}

This might be difficult to arrange as a mere module. However, with
rewriting rules it might be possible to install the requisite scoping
semantics within BLOCK1 to make it work like that. So I don't think
this is a primitive in the same sense that continuations would be. For
now let's assume we can build backtracking operators from
continuations. Those will be covered in a future apocalypse.

[Update: Also, the fact that Perl 6 regexes can call closures with
backtracking covers most of this functionality. See A5 and S5.]
=head2 RFC 143: Case ignoring C<eq> and C<cmp> operators

This is another RFC that proposes a specific feature that can be
handled by a more generic feature, in this case, an operator
definition:

    my sub operator:EQ { lc($^a) eq lc($^b) }

Incidentally, I notice that the RFC normalizes to uppercase. I suspect
it's better these days to normalize to lowercase, because Unicode
distinguishes titlecase from uppercase, and provides mappings for both
to lowercase.
=head2 RFC 170: Generalize C<=~> to a special "apply-to" assignment operator

I don't think the argument should come in on the right. I think it
would be more natural to treat it as an object, since all Perl
variables will essentially be objects anyway, if you scratch them
right. Er, left.

I do wonder whether we could generalize C<=~> to a list operator that
calls a given method on multiple objects, so that

    ($a, $b) =~ s/foo/bar/;

would be equivalent to

    for ($a, $b) { s/foo/bar/ }
But then maybe it's redundant, except that you could say

    @foo =~ s/foo/bar/

in the middle of an expression. But by and large, I think I'd rather
see:

    @foo.grep {!m/\s/}

instead of using C<=~> for what is essentially a method call. In line
with what we discussed before, the list version could be a
hyperoperator:

    @foo . ^s/foo/bar/;

or possibly:

    @foo ^. s/foo/bar/;

Note that in the general case this all implies that there is some
interplay between how you declare method calls and how you declare
quote-like operators. It seems as though it would be dangerous to let a
quote-like declaration out of a lexical scope, but then it's also not
clear how a method call declaration could be lexically scoped. So we
probably can't do away with C<=~> as an explicit marker that the thing
on the left is a string, and the thing on the right is a quoted
construct. That means that a hypersubstitution is really spelled:

    @foo ^=~ s/foo/bar/;

Admittedly, that's not the prettiest thing in the world.
[Update: The C<~~> smartmatch operator subsumes all C<=~> functionality.]
=head1 Non-RFC considerations

The RFCs propose various specific features, but don't give a systematic
view of the operators as a whole. In this section I'll try to give a
more cohesive picture of where I see things going.
=head2 Binary C<.> (dot)

This is now the method call operator, in line with industry-wide
practice. It also has ramifications for how we declare object attribute
variables. I'm anticipating that, within a class module, saying

    my int $.counter;

would declare both a C<$.counter> instance variable and a C<counter>
accessor method for use within the class. (If marked as public, it
would also declare a C<counter> accessor method for use outside the
class.)

[Update: The keyword is C<has> rather than C<my>, and a read-only
public accessor is generated by default. See A12.]
=head2 Unary C<.> (dot)

It's possible that a unary C<.> would call a method on the current
object within a class. That is, it would be the same as a binary C<.>
with C<$self> (or equivalent) on the left:

    method foowrapper ($a, $b) {
        .reallyfoo($a, $b, $c)
    }

On the other hand, it might be considered better style to be explicit:

    method foowrapper ($self: $a, $b) {
        $self.reallyfoo($a, $b, $c)
    }

(Don't take that declaration syntax as final just yet, however.)

[Update: Unary dot turns out to be a method call on the current topic.
See A4 and S4.]
=head2 Binary C<_>

Since C<.> is taken for method calls, we need a new way to concatenate
strings. We'll use a solitary underscore for that. So, instead of:

    $a . $b . $c

you'll say:

    $a _ $b _ $c

The only downside to that is that the space between a variable name and
the operator is required. This is to be construed as a feature.
[Update: Nowadays concatenation is C<~>.]
=head2 Unary C<_>

Since the C<_> token indicating the stat buffer is going away, a unary
underscore operator will force stringification, just as interpolation
does, only without the quotes.
[Update: That's unary C<~> now.]
=head2 Unary C<+>
Similarly, a unary C<+> will force numification in Perl 6, unlike in
Perl 5. If that fails, NaN (not a number) is returned.
=head2 Binary C<:=>

We need to distinguish two different forms of assignment. The standard
assignment operator, C<=>, works just as it does in Perl 5, as much as
possible. That is, it tries to make it look like a value assignment.
This is our cultural heritage.

But we also need an operator that works like assignment but is more
definitional. If you're familiar with Prolog, you can think of it as a
sort of unification operator (though without the implicit backtracking
semantics). In human terms, it treats the left side as a set of formal
arguments exactly as if they were in the declaration of a function, and
binds a set of arguments on the right hand side as though they were
being passed to a function. This is what the new C<:=> operator does.
More below.
=head2 Unary C<*>

Unary C<*> is the list flattening operator. (See Ruby for prior art.)
When used on an rvalue, it turns off function signature matching for
the rest of the arguments, so that, for instance:

    @args = (\@foo, @bar);
    push *@args;

would be equivalent to:

    push @foo, @bar;

In this respect, it serves as a replacement for the prototype-disabling
C<&foo(@bar)> syntax of Perl 5. That would be translated to:

    foo(*@bar)
In an lvalue, the unary C<*> indicates that subsequent array names
slurp all the rest of the values. So this would swap two arrays:

    (@a, @b) := (@b, @a);

whereas this would assign all the array elements of C<@c> and C<@d> to
C<@a>.

    (*@a, @b) := (@c, @d);

An ordinary flattening list assignment:

    @a = (@b, @c);

is equivalent to:

    *@a := (@b, @c);

That's not the same as

    @a := *(@b, @c);

which would take the first element of C<@b> as the new definition of
C<@a>, and throw away the rest, exactly as if you passed too many
arguments to a function. It could optionally be made to blow up at run
time. (It can't be made to blow up at compile time, since we don't know
how many elements are in C<@b> and C<@c> combined. There could be
exactly one element, which is what the left side wants.)
=head2 List context

The whole notion of list context is somewhat modified in Perl 6. Since
lists can be lazy, the interpretation of list flattening is also by
necessity lazy. This means that, in the absence of the C<*> list
flattening operator (or an equivalent old-fashioned list assignment),
lists in Perl 6 are object lists. That is to say, they are parsed as if
they were a list of objects in scalar context. When you see a function
call like:

    foo @a, @b, @c;

you should generally assume that three discrete arrays are being passed
to the function, unless you happen to know that the signature of C<foo>
includes a list flattening C<*>. (If a subroutine doesn't have a
signature, it is assumed to have a signature of C<(*@_)> for old times'
sake.) Note that this is really nothing new to Perl, which has always
made this distinction for builtins, and extended it to user-defined
functions in Perl 5 via prototypes like C<\@> and C<\%>. We're just
changing the syntax in Perl 6 so that the unmarked form of formal
argument expects a scalar value, and you optionally declare the final
formal argument to expect a list. It's a matter of Huffman coding
again, not to mention saving wear and tear on the backslash key.
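For readers who haven't used them, here is the Perl 5 prototype
distinction being referred to, in a small self-contained sketch: a
C<\@> prototype hands the sub a discrete array (by reference), while an
unprototyped sub sees one flattened list.

    use strict;
    use warnings;

    sub discrete (\@\@) {    # receives references to the two arrays
        my ($x, $y) = @_;
        return scalar(@$x) . " and " . scalar(@$y) . " elements";
    }

    sub flattened {          # receives one flat list
        return scalar(@_) . " elements total";
    }

    my @a = (1, 2, 3);
    my @b = (4, 5);
    print discrete(@a, @b), "\n";     # 3 and 2 elements
    print flattened(@a, @b), "\n";    # 5 elements total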
=head2 Binary C<:>

As I pointed out in an earlier apocalypse, the first rule of computer
language design is that everybody wants the colon. I think that means
that we should do our best to give the colon to as many features as
possible.

Hence, this operator modifies a preceding operator adverbially. That
is, it can turn any operator into a trinary operator (provided a
suitable definition is declared). It can be used to supply a "step"
to a range operator, for instance. It can also be used as a kind of
super-comma separating an indirect object from the subsequent argument
list:

    print $handle[2]: @args;

[Update: binary C<:> as an invocant separator is now distinguished from
adverbs that start with C<:>, so the "step" of a range is specified
using C<:by($x)> rather than a bare colon.]
Of course, this conflicts with the old definition of the C<?:>
operator. See below.

In a method type signature, this operator indicates that a previous
argument (or arguments) is to be considered the "self" of a method
call. (Putting it after multiple arguments could indicate a desire for
multimethod dispatch!)
=head2 Trinary C<??::>

The old C<?:> operator is now spelled C<??::>. That is to say, since
it's really a kind of short-circuit operator, we just double both
characters like the C<&&> and C<||> operator. This makes it easy to
remember for C programmers. Just change:

    $a ? $b : $c

to

    $a ?? $b :: $c

The basic problem is that the old C<?:> operator wastes two very useful
single characters for an operator that is not used often enough to
justify the waste of two characters. It's bad Huffman coding, in other
words. Every proposed use of colon in the RFCs conflicted with the
C<?:> operator. I think that says something.
I can't list here all the possible spellings of C<?:> that I
considered. I just think C<??::> is the most visually appealing and
mnemonic of the lot of them.
=head2 Binary C<//>

A binary C<//> operator is the defaulting operator. That is:

    $a // $b

is short for:

    defined($a) ?? $a :: $b

except that the left side is evaluated only once. It will work on
arrays and hashes as well as scalars. It also has a corresponding
assignment operator, which only does the assignment if the left side is
undefined:

    $pi //= 3;
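As it happens, C<//> and C<//=> later made it back into Perl 5 itself
(in 5.10), so the behavior described here can be tried directly:

    use strict;
    use warnings;
    use feature 'say';    # requires Perl 5.10 or later

    my $zero  = 0;        # false, but defined
    my $undef = undef;

    say $zero  // "default";    # 0 -- defined, so the left side wins
    say $undef // "default";    # default

    my $pi;
    $pi //= 3;                  # assigns only because $pi was undefined
    say $pi;                    # 3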
=head2 Binary C<;>

The binary C<;> operator separates two expressions in a list, much like
the expressions within a C-style C<for> loop. Obviously the expressions
need to be in some kind of bracketing structure to avoid ambiguity with
the end of the statement. Depending on the context, these expressions
may be interpreted as arguments to a C<for> loop, or slices of a
multi-dimensional array, or whatever. In the absence of other context,
the default is simply to make a list of lists. That is,

    [1,2,3;4,5,6]

is a shorthand for:

    [[1,2,3],[4,5,6]]

But usually there will be other context, such as a multidimensional
array that wants to be sliced, or a syntactic construct that wants to
emulate some kind of control structure. A construct emulating a
3-argument C<for> loop might force all the expressions to be closures,
for instance, so that they can be evaluated each time through the loop.
User-defined syntax will be discussed in apocalypse 18, if not sooner.
=head2 Unary C<^>

Unary C<^> is now reserved for hyper operators. Note that it works on
assignment operators as well:

    @a ^+= 1;
[Update: That'd be C<»+=«> now.]
=head2 Unary C<?>

Reserved for future use.

[Update: This is now the boolean context operator, the opposite of C<!>.]

=head2 Binary C<?>

Reserved for future use.
=head2 Binary C<~>

This is now the bitwise XOR operator. Recall that unary C<~> (1's
complement) is simply an XOR with a value containing all 1 bits.

[Update: C<~> is now string concatenation. Bitwise XOR is C<+^> or
C<~^> depending on whether you're doing numeric xor or stringwise.]
=head2 Binary C<~~>

This is a logical XOR operator. It's a high precedence version of the
low precedence C<xor> operator.
[Update: C<~~> is now the smartmatch operator. Logical XOR is C<^^>.
Junctive XOR is C<^>.]
=head2 User-defined operators

The declaration syntax of user-defined operators is still up for grabs,
but we can say a few things about it. First, we can differentiate unary
from binary declarations simply by the number of arguments.
(Declaration of a return type may also be useful for disambiguating
subsequent parsing. One place it won't be needed is for operators
wanting to know whether they should behave as hyperoperators. The
pressure to do that is relieved by the explicit C<^> hypermarker.)
We also need to think how these operator definitions relate to
overloading. We can treat an operator as a method on the first object,
but sometimes it's the second object that should control the action.
(Or with multimethod dispatch, both objects.) These will have to be
thrashed out under ordinary method dispatch policy. The important thing
is to realize that an operator is just a funny looking method call.
When you say:

    $man bites $dog

The infrastructure will need to untangle whether the man is biting the
dog or the dog is getting bitten by the man. The actual biting could be
implemented in either the C<Man> class or the C<Dog> class, or even
somewhere else, in the case of multimethods.

[Update: Unary and binary operators are now distinguished by prefixing
with either C<prefix:> or C<infix:>. There are many other syntactic
categories as well.]
=head2 Unicode operators

Rather than using longer and longer strings of ASCII characters to
represent user-defined operators, it will be much more readable to
allow the (judicious) use of Unicode operators.

In the short term, we won't see much of this. As screen resolutions
increase over the next 20 years, we'll all become much more comfortable
with the richer symbol set. I see no reason (other than fear of
obfuscation (and fear of fear of obfuscation)) why Unicode operators
should not be allowed.

Note that, unlike APL, we won't be hardware dependent, in the sense
that any Perl implementation will always be able to parse Unicode, even
if you can't display it very well. (But note that Vim 6.0 just came out
with Unicode support.)
=head2 Precedence

We will at least unify the precedence levels of the equality and
relational operators. Other unifications are possible. For instance,
the C<not> logical operator could be combined with list operators in
precedence. There's only so much simplification that you can do,
however, since you can't mix right association with left association.

By and large, the precedence table will be what you expect, if you
expect it to remain largely the same.

[Update: We also got rid of the special levels for bitwise operators,
shifts, binding operators, and range operators. On the other hand,
we added levels for junctive operators and non-chaining binaries.
Still, we managed to reduce it from 24 to 22 precedence levels. See S3.]

And that still goes for Perl 6 in general. We talk a lot here about
what we're changing, but there's a lot more that we're not changing.
Perl 5 does a lot of things right, and we're not terribly interested in
"fixing" that.