Writing PIR
PIR (Parrot Intermediate Representation) is a way to program the parrot virtual machine that is easier to use than PASM (Parrot Assembler). PASM, like any other assembler-like format, can be used directly, but it is more verbose and leaves every low-level detail in the hands of the programmer. PIR abstracts common operations and conventions into a syntax that more closely resembles a high-level language. It lets programmers write code that expresses their intent more naturally, without having to set up the exact details that PASM requires to function properly.
This article covers the basics of programming in PIR. More advanced topics will appear in later articles.
Getting Parrot
To test the PIR and PASM code in this article, you need a parrot virtual machine (henceforth just "parrot"). Parrot is available from http://parrotcode.org. Just download the latest release, or check out the current development version from the SVN tree. The programs in this article were tested with Parrot 0.4.3.
Parrot is very easy to compile on Unix-like operating systems: just run perl Configure.pl && make in the root directory of the parrot source and, if everything works correctly, a parrot executable should appear. Normally, I use it directly from the root directory of the parrot source instead of installing it into the system.
Parrot Virtual Machine overview
Before we get started with the examples, here's a quick overview of parrot's architecture.
Parrot is a register-based virtual machine. It provides 4 types of registers, and each type has 32 registers. The register types are:
* I: integer
* N: floating-point number
* S: string
* P: PMC
Registers are designated by the type of register followed by a number. For instance, the integer registers are I0, I1, and so on up to I31. The PMC registers hold arbitrary data objects and are parrot's mechanism for implementing more complex behavior than can be expressed using the other 3 register types alone. PMCs will be covered in more detail in a future article. Examples in this article will focus on the first 3 register types.
Simple Operators
Let me start with a simple and typical example:
.sub main :main
print "hello world\n"
.end
To run it, save the code in a hello.pir file and pass it to the parrot virtual machine:
./parrot hello.pir
Note that I am using a relative path to parrot given that I didn't install it into the system.
The keywords starting with a dot (.sub and .end) are PIR directives. They are used together to define subroutines. The name of the subroutine follows the .sub keyword. The keyword that starts with a colon (:main) is an attribute (more on those later) that tells parrot that this is the main body of the program and that it should start by executing this subroutine. By the way, I could write .sub foo :main and parrot would use the foo subroutine as the main body of the program. The actual name of the subroutine does not matter as long as it has the :main attribute.
Before going into more details about subroutines and calling conventions, let's compare some PIR syntax to the equivalent PASM.
If I want to add two integer registers using PASM, I would use the Parrot set opcode to put values into registers and the add opcode to add them, like this:
set I1, 5
set I2, 3
add I0, I1, I2 # I0 yields 5+3
PIR includes infix operators for these common opcodes. I could write this same code as:
I1 = 5
I2 = 3
I0 = I1 + I2
As you would expect, PIR provides the four basic arithmetic operators (+, -, *, /), as well as six comparison operators (==, !=, <, <=, >, >=), which yield 1 for true and 0 for false:
I1 = 5
I2 = 3
I0 = I1 <= I2 # I0 yields 0 (false)
I can also use the short assignment operators that combine an operation with an assignment, like +=.
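As a small sketch of how these operators combine, the following program uses the same plain integer registers as the fragments above. The particular values are only illustrative.

```pir
.sub main :main
    I0 = 10
    I0 += 5        # I0 is now 15
    I0 *= 2        # I0 is now 30
    I0 -= 4        # I0 is now 26
    I1 = I0 > 20   # I1 is 1 (true)
    print I0
    print "\n"
    print I1
    print "\n"
.end
```

This should print 26 followed by 1, each on its own line.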
Also, PIR allows an extended syntax for registers. If the register name is prefixed with a dollar sign, like $I1, the parrot PIR compiler will automatically assign these "virtual registers" to actual registers as needed and handle whatever manipulations are needed for optimization.
Another PIR perk is that local variable names may be declared and used instead of register names. For that I just need to declare the variable using the .local keyword:
.local int size
size = 5
The parrot compiler will choose one register and associate it with my variable name. I can declare local variables for any of the four data types available in PIR: int, string, num and pmc.
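To see virtual registers and local variables working together, here is a minimal sketch. The variable names size and greeting are my own choices for illustration.

```pir
.sub main :main
    .local int size
    .local string greeting

    size = 5
    greeting = "doubled size: "

    $I0 = size * 2    # virtual register; the compiler picks the real one
    print greeting
    print $I0
    print "\n"
.end
```

This should print "doubled size: 10". Notice that I never had to decide which of the 32 integer registers holds size or $I0.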
Branching
Branches are another area where PIR simplifies PASM. When I want to test a condition and jump to another place in the code, I would write the following PASM code:
le I1, I2, LESS_EQ
This means: if I1 is less than or equal to I2, jump to the label LESS_EQ. In PIR I would write it in a more legible way:
if $I1 <= $I2 goto LESS_EQ
PIR includes the unless keyword as well, which branches when the condition is false.
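Since PIR has no loop statements of its own in what has been shown so far, a loop is simply a label plus a conditional branch. Here is a small sketch that counts from 1 to 3. The label name again is arbitrary.

```pir
.sub main :main
    .local int i
    i = 1
  again:
    print i
    print "\n"
    i += 1
    unless i > 3 goto again   # loop back while "i > 3" is still false
    print "done\n"
.end
```

This should print 1, 2 and 3 on separate lines, followed by "done". Writing the branch as if i <= 3 goto again would be equivalent.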
Calling Functions
Subroutines can easily be created using the .sub keyword shown before. If you do not need parameters, it is just as simple as the following code:
.sub main :main
hello()
.end
.sub hello
print "Hello World\n"
.end
Now, I want to make my hello subroutine a little more useful, such that I can greet other people. For that I will use the .param keyword to define the parameters hello can handle:
.sub main :main
hello("leo")
hello("chip")
.end
.sub hello
.param string person
print "Hello "
print person
print "\n"
.end
If I need more parameters I just need to add more .param lines.
To return values from PIR subroutines I use the .return keyword, followed by one or more arguments, just like this:
.return (10, 20, 30)
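On the calling side, several return values can be received by putting a parenthesized list of registers on the left of the call. The following is a minimal sketch; min_max is a hypothetical subroutine I made up for the example.

```pir
.sub main :main
    ($I0, $I1) = min_max(7, 3)
    print $I0
    print " "
    print $I1
    print "\n"
.end

.sub min_max
    .param int a
    .param int b
    if a <= b goto ordered
    .return (b, a)            # a is larger, so swap the order
  ordered:
    .return (a, b)
.end
```

This should print "3 7".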
Factorial Example
Now, for a slightly more complicated example, let me show how I would code a factorial subroutine:
.sub main :main
$I1 = factorial(5)
print $I1
print "\n"
.end
.sub factorial
.param int i
if i > 1 goto recurse
.return (1)
recurse:
$I1 = i - 1
$I2 = factorial($I1)
$I2 *= i
.return ($I2)
.end
This example also shows that PIR subroutines may be recursive just as in a high-level language.
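For comparison, the same computation can be sketched without recursion, using only the branching constructs shown earlier. This is just one possible way to write it, and the label and variable names are my own.

```pir
.sub main :main
    $I1 = factorial(5)
    print $I1
    print "\n"
.end

.sub factorial
    .param int n
    .local int result
    result = 1
  next_factor:
    if n <= 1 goto finished
    result *= n               # multiply in the current factor
    n -= 1
    goto next_factor
  finished:
    .return (result)
.end
```

Like the recursive version, this should print 120.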
Named Arguments
Like some other languages, such as Python and Perl, PIR supports named arguments.
As before, I need to use .param for each named argument, but with one of the two following syntaxes:
.sub func
.param int a :named("foo") # or
.param int "bar" => b
The two lines use equivalent syntaxes. With the first, the subroutine will receive an integer named "foo", and inside the subroutine that integer will be known as a. With the second, it will receive an integer named "bar", known inside the subroutine as b.
When calling the function, I need to pass the names of the arguments. For that there are two syntaxes as well:
func( 10 :named("foo") ) # or
func( "foo" => 10 )
Note that with named arguments, you may rearrange the order of your arguments at will. For example, take this subroutine:
.sub foo
.param string "name" => a
.param int "age" => b
.param string "gender" => c
# ...
.end
This subroutine may be called in any of the following ways:
foo( "Fred", 35, "m" )
foo( "gender" => "m", "name" => "Fred", "age" => 35 )
foo( "age" => 35, "gender" => "m", "name" => "Fred" )
foo( "m" :named("gender"), 35 :named("age"), "name" => "Fred" )
and any other permutation you can think of as long as you use the named argument syntax.
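Putting the pieces together, here is a minimal complete sketch of a subroutine with named parameters, called with both argument syntaxes. The subroutine describe and its parameter names are hypothetical, and I only use the :named form on the .param lines.

```pir
.sub main :main
    describe("age" => 35, "who" => "Fred")
    describe("Wilma" :named("who"), 33 :named("age"))
.end

.sub describe
    .param string who   :named("who")
    .param int    years :named("age")
    print who
    print " is "
    print years
    print "\n"
.end
```

This should print "Fred is 35" and then "Wilma is 33", regardless of the order in which the arguments are passed.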
It's also possible to use named syntax when returning values from subroutines. In the .return statement I use:
.return ( "bar" => 20, "foo" => 10)
and when calling the function, I will do:
("foo" => $I0, "bar" => $I1) = func()
After the call, $I0 will hold 10 and $I1 will hold 20, as expected.
Concluding
To conclude this first article on PIR and to let you test what you learned, let me show you how to read input in PASM (and hence also in PIR). There is a read opcode to read from standard input. Just pass it a string register or variable where you wish the characters read to be placed and the number of characters you wish to read:
read $S1, 100
This line will read up to 100 characters (or until the end of the line) and put the resulting string into $S1. In case you need a number, just assign the string to the correct register type:
read $S1, 100
$I1 = $S1
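As a starting point for your own experiments, here is a small sketch that combines input with the arithmetic shown earlier. It assumes that read behaves as described above on the parrot version used for this article, and that the user types a whole number.

```pir
.sub main :main
    print "Enter a number: "
    read $S1, 100             # read up to 100 characters from standard input
    $I1 = $S1                 # convert the string to an integer
    $I1 *= 2
    print "Twice that is "
    print $I1
    print "\n"
.end
```

From here, a natural exercise is to pass the number to the factorial subroutine instead of doubling it.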
With the PIR syntax shown in this article you should be able to start writing simple programs. The next article will look at the available PMCs and how they can be used.
Thanks
* Jonathan Scott Duff