NAME

IMCC - operation

VERSION

0.1 intital
0.2 uninitialized warning; optimizations

OVERVIEW

This document describes the principles of imcc operation.

DESCRIPTION

The main features of imcc are:

Source file parsing
Register allocation
Optimization
Code generation
Running the code

Source file parsing

S. parsing.pod, macros.pod and syntax.pod for more.

Register allocation

Register allocation is done per compilation unit.

Imcc identifiers and temporary variables e.g. $I0 are assigned a physical parrot register depending on the life range of these variables. If the life range of one variable doesn't overlap the range of another variable, these might get the same parrot register.

$I0 = 10
$I1 = 20

will translate to

set I0, 10
set I0, 20

when $I0 is not used after these lines.

The first redundant assignment will be optimized away with -O2.

PASM registers keep their register. During the usage of a PASM register this register will be not get assigned to. So don't use these over long code pieces and only when really needed.

Basic blocks

To determine the life range of variables, the code gets separated into pieces, called basic blocks. A basic block starts at a label, which can get jumped to, and ends at a branch instruction.

Special case: subroutine calls (bsr)

Currently, imcc assumes, that a subroutine saves all registers on entry, as P6C does per saveall, and thus a bsr doesn't count as a branch.

But there is one exception to this rule: When the target of the subroutine is known (the label is in the same compilation unit, and when no saveall instruction is found in the first 5 instructions after the label, then the control flow from the bsr call to the label and from the ret to the next basic block is calculated.

Call graph

All connection between the basic blocks are calculated. This allows for:

Loop detection

where the range and depth of loops is calculated.

Life analysis

Whenever an operand is marked as an OUT argument, this operand starts with a new value. This means, that at this point the life range of the symbol ends and a new life range is started, which allows the allocation of a different register to the same variable or the same register to a different variable.

Variables used as IN parameters must keep their parrot register over their usage range.

When imcc detects a register usage, where the first operation is using (reading) a register (and warnings are enabled), imcc emits an appropriate message.

Consider these to code snippets (block numbers are attached):

 .sub _main
0      $I0 = 0		# initialized
0      if $I0 goto l1
1      $I1 = 1		# init in block 1
1      goto l2
2  l1:
2      $I1 = 2		# init in block 2
3  l2:
3      print $I0
3      print $I1	# all paths leading here do init
3      print "\n"
3      end
 .end

and:

 .sub _main
0      $I0 = 0		# initialized
0      if $I0 goto l1	# branch to bb 1 or 2
1      $I1 = 1		# init only in block 1
2  l1:
2      print $I0
2      print $I1	# no init in code path from block 0
2      print "\n"
2      end
 .end

The latter of these emits the warning:

warning:imcc:propagate_need: '$I1' might be used \
uninitialized in _main:7

Interference graph

With above information calculated, the next step is to look, which variable does interfer with which other. Non interfering variables can be given the same parrot register.

Register allocation

imcc then starts allocating registers according to a variables score. Variables deeply nested inside loops have the highest score and get a parrot register first. Variables with a long life range (i.e. with many interferences) get allocated last.

Spilling

When there are too many interferences, so that not all variables can be allocated to a parrot register, variables with lowest score are spilled, or stored in a PerlArray when they are not used.

set $I1,1
set $I2,2
...
set $I33, 33
...
print $I1
...
print $I33

The usage of all these variables makes it necessary, to spill.

new P31, .PerlArray
...
set I0, 1
set P31[0], I0	# store $I1
set I0, 2
set P31[1], I0	# store $I2
...
set I0, P31[0]	# fetch $I1
print I0

When $I1 or $I2 are accessed then, code is generated to fetch the value from the spill array. The life range of such a spilled variable is then only one fetch and/or store operation and the actual usage. Therefore the variable has minimal interferences with other variables.

Optimization

Optimizations are only done, when enabled with the -O switch. Please consult t/imcpasm/*.t for examples. They occur between various stages and may be repeatedly done: E.g. after converting a conditional branch to an absolute one, unreachable code will be removed then, which might cause unused labels ...

Optimizations with -O1

Constants substitution

Constant arguments to many ops are evaluated. Conditional branches with constant conditions are converted to unconditional branches. Integer arguments to float operations are converted to float operands.

If branch optimization

A sequence of code:

   if cond, L1
   branch L2
L1:
...
L2:

will be converted to

   unless cond, L2
   ...
L2:

The same is done for other conditional branches gt, ge, eq and their reverse meanings.

Branches to branches

Unconditional branch sequences get optimized to jump to the final label.

Unused labels

Labels not referenced somewhere get deleted.

Dead code removal

Code not reachable after an unconditional branch instruction and basic blocks, that are not entered from somewhere get removed.

Optimizations with -O2

Note: These are currently experimental and might not do the Right Thing.

Used LHS once

For a sequence of code

$I0 = 10
$I1 = 20

where $I0 is not used beyond, the first assignment will be tossed, resulting in code like:

set I0, 20

Loop optimization.

Instructions which are invariant to a loop are pulled out of the loop and inserted in front of the loop entry.

Code generation

Imcc either generates PASM output ready to get assembled by assemble.pl or directly a PBC file, for running with parrot.

Running code

Additionally the generated code can be run immediately inside imcc. All parrot runtime options like -j or -t are available.

FILES

imc.c, cfg.c, optimizer.c, pbc.c

AUTHOR

Leopold Toetsch <lt@toetsch.at>

2 POD Errors

The following errors were encountered while parsing the POD:

Around line 23:

'=item' outside of any '=over'

Around line 33:

You forgot a '=back' before '=head1'