NAME

Apocalypse_01 - The Ugly, the Bad, and the Good

AUTHOR

Larry Wall <larry@wall.org>

VERSION

Maintainer: Larry Wall <larry@wall.org>
Date: 2 Apr 2001
Last Modified: 27 Sep 2004
Number: 1
Version: 2

People get scared when they hear the word Apocalypse, but here I mean it in the good sense: a Revealing. An Apocalypse is supposed to reveal good news to good people. (And if it also happens to reveal bad news to bad people, so be it. Just don't be bad.)

What I will be revealing in these columns will be the design of Perl 6. Or more accurately, the beginnings of that design, since the design process will certainly continue after I've had my initial say in the matter. I'm not omniscient, rumors to the contrary notwithstanding. This job of playing God is a little too big for me. Nevertheless, someone has to do it, so I'll try my best to fake it. And I'll expect all of you to help me out with the process of creating history. We all have to do our bit with free will.

If you look at the history of Perl 6 up to this point, you will see why this column is subtitled The Ugly, the Bad, and the Good. The RFC process of last year was ugly, in a good sense. It was a brainstorming process, and that means it was deliberately ugly--not in the sense of incivility, since the RFC process was in fact surprisingly civil, but in the sense that there was little coherent design to the suggestions in the RFCs. Frankly, the RFCs are all over the map, without actually covering the map. There are contradictory RFCs, and there are missing RFCs. Many of the RFCs identify real problems but go off at funny angles in trying to propose solutions. Many of them patch symptoms without curing the underlying ailments.

I also discovered Larry's First Law of Language Redesign: Everyone wants the colon.

That was the Ugly part. The Bad part was that I was supposed to take these RFCs and produce a coherent design in two weeks. I started out thinking I could just classify the RFCs into the good, bad, and ugly categories, but somehow most of them ended up in the ugly category, because the good ones typically had something wrong with them, and even the bad ones typically indicated a problem that could use some thought, even if the solution was totally bogus.

It is now five months later, and I've been mulling over coherence the whole time, for some definition of mulling. Many of you know what happens when the size of your Perl process exceeds the size of your physical memory--you start thrashing. Well, that's basically what happened to me. I couldn't get enough of the problem into my head at once to make good progress, and I'm not actually very good at subdividing problems. My forte is synthesis, not analysis. It didn't help that I had a number of distractions in my life, some of them self-inflicted, and some of them not. I won't go into all that. Save it for my unauthorized autobiography.

But now we come to the Good part. (I hope.) After thinking lots and lots about many of the individual RFCs, and not knowing how to start thinking about them as a whole, it occurred to me (finally!) that the proper order to think about things was, more or less, the order of the chapters in the Camel Book. That is, the Camel Book's order is designed to minimize forward references in the explanation of Perl, so considering Perl 6 in roughly the same order will tend to reduce the number of things that I have to decide before I've decided them.

So I've merrily classified all the RFCs by chapter number, and they look much more manageable now. (I also restructured my email so that I can look at a slice of all the messages that ever talked about a particular RFC, regardless of which mailing list the message was on. That's also a big help.) I intend to produce one Apocalypse for each Chapter, so Apocalypse 1 corresponds to Chapter 1: An Overview of Perl. (Of course, in the book, the Overview is more like a small tutorial, not really a complete analysis of the philosophical underpinnings of Perl. Nevertheless, it was a convenient place to classify those RFCs that talk about Perl 6 on that level.)

So today I'm talking about the following RFCs:

RFC  PSA  Title
---  ---  -----
 16  bdb  Keep default Perl free of constraints such as warnings and strict.
 26  ccb  Named operators versus functions
 28  acc  Perl should stay Perl.
 73  adb  All Perl core functions should return objects
141  abr  This Is The Last Major Revision

The PSA rating stands for "Problem, Solution, Acceptance". The problem and solution are graded on an a-f scale, and very often you'll find I grade the problem higher than the solution. The acceptance rating is one of

a  Accepted wholeheartedly
b  Accepted with a few "buts"
c  Accepted with some major caveats
r  Rejected

I might at some point add a "d" for Deferred, if I really think it's too soon to decide something.

RFC 141: This Is The Last Major Revision

I was initially inclined to accept this RFC, but decided to reject it on theological grounds. In apocalyptic literature, 7 is the number representing perfection, while 6 is the number representing imperfection. In fact, we probably wouldn't end up converging on a version number of 2*PI as the RFC suggests, but rather on 6.6.6, which would be rather unfortunate.

So Perl 7 will be the last major revision. In fact, Perl 7 will be so perfect, it will need no revision at all. Perl 6 is merely the prototype for Perl 7. :-)

Actually, I agree with the underlying sentiment of the RFC--I only rejected it for the entertainment value. I want Perl to be a language that can continue to evolve to better fit the problems people want to solve with it. To that end, I have several design goals that will tend to be obscured if you just peruse the RFCs.

First, Perl will support multiple syntaxes that map onto a single semantic model. Second, that single semantic model will in turn map to multiple platforms.

Multiple syntaxes sound like an evil thing, but they're really necessary for the evolution of the language. To some extent we already have a multi-syntax model in Perl 5; every time you use a pragma or module, you are warping the language you're using. As long as it's clear from the declarations at the top of the module which version of the language you're using, this causes little problem.
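A small Perl 5 sketch of that warping, for illustration only: the same expression is refused in one scope and allowed in another, depending solely on which pragma declarations are in force there. The variable and sub names are hypothetical.

```perl
use strict;
use warnings;

our $total = 0;    # a package variable, reachable by name

sub strict_part {
    # With "strict refs" in force, using a string as a variable name dies.
    my $name = 'total';
    my $ok = eval { my $v = ${$name}; 1 };
    return $ok ? 'allowed' : 'refused';
}

sub relaxed_part {
    # The pragma is lexically scoped: only this block speaks the looser dialect.
    no strict 'refs';
    my $name = 'total';
    my $v = ${$name};    # symbolic reference to $main::total
    return defined $v ? 'allowed' : 'refused';
}
```

Calling strict_part() and relaxed_part() runs the identical expression under two slightly different languages, and the declarations near the top of each scope say which one applies.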

A particularly strong example of how support of multiple syntaxes will allow continued evolution is the migration from Perl 5 to Perl 6 itself. See the discussion of RFC 16 below.

Multiple backends are a necessity of the world we live in today. Perl 6 must not be limited to running only on platforms that can be programmed in C. It must be able to run in other kinds of virtual machines, such as those supported by Java and C#.

RFC 28: Perl should stay Perl.

It is my fond hope that those who are fond of Perl 5 will be fonder still of Perl 6. That being said, it's also my hope that Perl will continue trying to be all things to all people, because that's part of Perl too.

While I accept the RFC in principle (that is, I don't intend to go raving mad), I have some major caveats with it, because I think it is needlessly fearful that any of several programming paradigms will "take over" the design. This is not going to happen. Part of what makes Perl Perl is that it is intentionally multi-paradigmatic. You might say that Perl allows you to be paradigmatic without being "paradogmatic".

The essence of Perl is really context sensitivity, not just to syntactic context, but also to semantic, pragmatic, and cultural context. This overall philosophy is not going to change in Perl 6, although specific context sensitivities may come and go. Some of the current context sensitivities actually prevent us from doing a better job of it in other areas. By intentionally breaking a few things, we can make Perl understand what we mean even better than it does now.

As a specific example, there are various ways things could improve if we muster the courage to break the "weird" relationship between @foo and $foo[]. True, we'd lose the current slice notation (it can be replaced with something better, I expect). But by consistently treating @foo as an utterance that in scalar context returns an array reference, we can make subscripts always take an array reference, which among other things fixes the botch that in Perl 5 requires us to distinguish $foo[] from $foo->[]. There will be more discussion of this in Apocalypse 2, when we'll dissect ideas like RFC 9: Highlander Variable Types.
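For reference, here is the Perl 5 behavior that paragraph describes, as a minimal sketch. The variable names are made up for illustration, and nothing here is Perl 6 syntax.

```perl
use strict;
use warnings;

my @foo = (10, 20, 30);    # an array
my $foo = [40, 50, 60];    # an unrelated scalar holding an array reference

my $from_array = $foo[1];       # element of @foo, despite the $ sigil
my $from_ref   = $foo->[1];     # element reached through the reference in $foo
my @slice      = @foo[0, 1];    # the current slice notation
my $count      = @foo;          # scalar context yields the length, not a reference
my $ref        = \@foo;         # a reference needs an explicit backslash
```

In Perl 5, $foo[1] and $foo->[1] consult two different variables, which is the distinction the proposed change would remove.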

RFC 16: Keep default Perl free of constraints such as warnings and strict.

I am of two minds about this debate--there are good arguments for both sides. And if you read through the discussions, all those arguments were forcefully made, repeatedly. The specific discussion centered around the issue of strictness, of course, but the title of the RFC claims a more general philosophical position, and so it ended up in this Apocalypse.

I'll talk about strictness and warnings in a moment, and I'll also talk about constraints in general, but I'd like to take a detour through some more esoteric design issues first. To my mind, this RFC (and the ones it is reacting against) are examples of why some language designer like me has to be the one to judge them, because they're all right, and they're all wrong, simultaneously. Many of the RFCs stake out polar positions and defend them ably, but fail to point out possible areas of compromise. To be sure, it is right for an RFC to focus in on a particular area and not try to do everything. But because all these RFCs are written with (mostly) the design of Perl 5 in mind, they cannot synthesize compromise even where the design of Perl 6 will make it mandatory.

To me, one of the overriding issues is whether it's possible to translate Perl 5 code into Perl 6 code. One particular place of concern is in the many one-liners embedded in shell scripts here and there. There's no really good way to translate those invocations, so requiring a new command line switch to set "no strict" is not going to fly.

A closely related question is how Perl is going to recognize when it has accidentally been fed Perl 5 code rather than Perl 6 code. It would be rather bad to suddenly give working code a brand new set of semantics. The answer, I believe, is that it has to be impossible by definition to accidentally feed Perl 5 code to Perl 6. That is, Perl 6 must assume it is being fed Perl 5 code until it knows otherwise. And that implies that we must have some declaration that unambiguously declares the code to be Perl 6.

Now, there are right ways to do this, and wrong ways. I was peeved by the approach taken by DEC when they upgraded BASIC/PLUS to handle long variable names. Their solution was to require every program using long variable names to use the command EXTEND at the top. So henceforth and forevermore, every BASIC/PLUS program had EXTEND at the top of it. I don't know whether to call it Bad or Ugly, but it certainly wasn't Good.

A better approach is to modify something that would have to be there anyway. If you go out to CPAN and look at every single module out there, what do you see at the top? Answer: a "package" declaration. So we break that.

I hereby declare that a package declaration at the front of a file unambiguously indicates you are parsing Perl 5 code. If you want to write a Perl 6 module or class, it'll start with the keyword module or class. I don't know yet what the exact syntax of a module or a class declaration will be, but one thing I do know is that it'll set the current global namespace much like a package declaration does.

Now with one fell swoop, much of the problem of programming in the large can be dealt with simply by making modules and classes default to strict, with warnings. But note that the default in the main program (and in one-liners) is Perl 5, which is non-strict by definition. We still have to figure out how Perl 6 main programs should distinguish themselves from Perl 5 (with a "use 6.0" maybe?), and whether Perl 6 main programs should default to strict or not (I think not), but you can already see that a course instructor could threaten to flunk anyone who doesn't put "module Main" at the front of each program, and never actually tell their pupils that they want that because it turns on strictures and warnings.

Other approaches are possible, but that leads us to a deeper issue, which is the issue of project policy and site policy. People are always hankering for various files to be automatically read in from various locations, and I've always steadfastly resisted that because it makes scripts implicitly non-portable. However, explicit non-portability is okay, so there's no reason our hypothetical class instructor could not insist that programs start with a "use Policy;" or some such.

But now again we see how this leads to an even deeper language design issue. The real problem is that it's difficult to write such a Policy module in Perl 5, because it's really not a module but a meta-module. It wants to do "use strict" and "use warnings" on behalf of the student, but it cannot do so. Therefore one thing we must implement in Perl 6 is the ability to write meta-use statements that look like ordinary use statements but turn around and declare other things on behalf of the user, for the good of the user, or of the project, or of the site. (Whatever. I'm not a policy wonk.)

So whether I agree with this RFC really depends on what it means by "default". And like Humpty Dumpty, I'll just make it mean whatever I think is most convenient. That's context sensitivity at work.

I also happen to agree with this RFC because it's my philosophical position that morality works best when chosen, not when mandated. Nevertheless, there are times when morality should be strongly suggested, and I think modules and classes are a good place for that.

[Update: Nowadays the main program is also strict by default, as soon as we know it's Perl 6 and not Perl 5. If we say something like "use 6.0;" as the first thing, we know it's Perl 6, and thus we know it's strict. (Likewise saying #!/usr/bin/perl6 would default to strict.) I think we could recognize a shorter, less formal form without the use which would default to not being strict:

v6;
$x = 1; # global is legal

(Without the v6 it would default to Perl 5.) Scripts supplied with a -e would also not be strict. It's possible we can allow argumentless -e as the last thing on the shebang line to suppress strict in the code below. Of course, there's always "no strict".]

RFC 73: All Perl core functions should return objects

I'm not sure this belongs in the overview, but here it is nonetheless. In principle, I agree with the RFC. Of course, if all Perl variables are really objects underneath, this RFC is trivially true. But the real question is how interesting of an object you can return for a given level of performance. Perl 5's objects are relatively heavyweight, and if all of Perl 6's objects are as heavy, things might bog down.

I'm thinking that the solution is better abstract type support for data values that happen to be represented internally by C structs. We get bogged down when we try to translate a C struct such as a struct tm into an actual hash value. On the other hand, it's rather efficient to translate a struct tm to a struct tm, since it's a no-op. We can make such a struct look like a Perl object, and access it efficiently with attribute methods as if it were a "real" object. And the typology will (hopefully) mostly only impose an abstract overhead. The biggest overhead will likely be memory management of a struct over an int (say), and that overhead could go away much of the time with some amount of contextually aware optimization.

In any event, I just want to point out that nobody should panic when we talk about making things return objects that didn't used to return them. Remember that any object can define its stringify and numify overloadings to do whatever the class likes, so old code that looks like

print scalar localtime;

can continue to run unchanged, even though localtime might be returning an object in scalar context.
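The mechanism can already be sketched in Perl 5 with the overload pragma. The class below, My::Time, is purely hypothetical and is not how localtime is or will be implemented; it only shows an object that behaves like a plain string or number when old code treats it as one.

```perl
package My::Time;    # hypothetical class, for illustration only
use strict;
use warnings;
use overload
    '""'     => sub { $_[0]->as_string },    # stringify
    '0+'     => sub { $_[0]{epoch} },        # numify
    fallback => 1;

sub new {
    my ($class, $epoch) = @_;
    return bless { epoch => $epoch }, $class;
}

sub epoch     { $_[0]{epoch} }                  # attribute method
sub as_string { scalar gmtime $_[0]{epoch} }    # same format scalar gmtime gives

package main;

my $t     = My::Time->new(0);
my $text  = "$t";       # old-style use as a string
my $later = $t + 60;    # old-style use as a number
```

Code that interpolates or does arithmetic on $t never notices it is holding an object, while newer code can call $t->epoch directly.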

RFC 26: Named operators versus functions

Here's another RFC that's here because I couldn't think of a better place for it.

I find this RFC somewhat confusing because the abstract seems to suggest something more radical than the description describes. If you ignore the abstract, I pretty much agree with it. It's already the case in Perl 5 that we distinguish operators from functions primarily by how they are called, not by how they are defined. One place where the RFC could be clarified is that Perl 5 distinguishes two classes of named operators: named unary operators vs list operators. They are distinguished because they have different precedence. We'll discuss precedence reform under Apocalypse 3, but I doubt we'll combine the two kinds of named operators. (As a teaser, I do see ways of simplifying Perl's precedence table from 24 levels down to 18 levels, albeit with some damage to C compatibility in the less frequently used ops. More on that later.)
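To make the two classes of named operator concrete, here is a short Perl 5 sketch using lc (a named unary operator) and join (a list operator). The variable names are hypothetical.

```perl
use strict;
use warnings;

# lc is a named unary operator: it takes one argument and binds
# tighter than the comma, so "B" is a separate list element.
my @unary = (lc "A", "B");          # ("a", "B")

# join is a list operator: it binds looser than the comma and
# takes everything to its right as its arguments.
my @list = (join "-", "A", "B");    # ("A-B")
```

Both are called without parentheses in exactly the same way, and only their precedence decides where the argument list stops.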

Do you begin to see why my self-appointed job here is much larger than just voting RFCs up or down? There are many big issues to face that simply aren't covered by the RFCs. We have to decide how much of our culture is just baggage to be thrown overboard, and how much of it is who we are. We have to smooth out the migration from Perl 5 to Perl 6 to prevent people from using that as an excuse not to adopt Perl 6. And we have to stare at all those deep issues until we see through them down to the underlying deeper issues, and the issues below that. And then in our depths of understanding, we have to keep Perl simple enough for anyone to pick up and start using to get their job done right now.

Stay tuned for Apocalypse 2, wherein we will attempt to vary our variables, question our quotes, recontextualize our contexts, and in general set the lexical stage for everything that follows.