NAME

[DRAFT] Apocalypse 20 - Debugging [DRAFT]

VERSION

Maintainer:		pugs committers
Date:			10 Apr 2005
Last modified:	10 Apr 2005

This document proposes how debugging and AST introspection hooks might look in perl 6, in the context of MMD and macros.

PURPOSE

An attempt at drafting a callback API that should allow useful and simple implementation of:

  • the perl 6 debugger

  • infectious traits, enabling the implementation of taint mode and more complex variations on the theme.

EXAMPLE

With taint mode generalized into a debugging system, this perl statement will yield:

$x = ($a ?? $b !! $c); # assuming $a is true

Assuming:

Xs is the whole statement
Xe is the single expression in Xs, it is apply(=, $x, ...);
E is the ( ?? !! ) expression, it is apply(??!!, A, B, C);
A, B, C are expressions for the vars $a, $b, $c ($x expr omitted for brevity)

The following callbacks:

eval(Xs);
	eval(Xe);
		eval(E);
			apply(??!!, A, B, C);
				eval(A);
				# evaluated
				participated(A, $a); # $a is a value, perhaps is copy
				participated(E, $a);
				reduced(A, $a is copy); # also a value, even more so
				eval(B); # because A reduced to a true value
				participated(E, $a, $b);
				participated(B, $b);
				participated(E, $b);
				reduced(B, $b);
		reduced(E, $b);
		participated(Xe, $b); # by the reduction, we now know that $b interacts in Xe
		participated(E, $x, $a, $b); # the order is from largest permutation to smallest
		participated(E, $x, $a);
		participated(E, $x, $b);
		participated(Xe, $x, $b);
		participated(E, $x);
		participated(Xe, $x);
		apply(=, $x, $b); # the same $b that E was reduced to
	participated(Xe, $x, $b);
	reduced(Xe, $x); # but really $b by now
participated(Xs, $a, $b, $x); # a sort of catchall summary of the permuitations called therein

Participation is noted for every combination of values in an expression, as soon as it can be determined.

participated(A, $a) is called once, as soon as possible. By that time we know that
participated(E, $a) should also be called, so we do that
participated(B, $b) there after. We now know that
participated(E, $a, $b) should also happen, so we do that too

Note that the reduced values are also participating:

substr($x, 3, 5);

...
	reduced(X, $substr); # $substr is a new value containing the retun value
	participated(X, $substr);
	participated(X, $substr, $x);
...

And literals are too. participated is called on 3 and 5 in that example as well.

OVERHEAD VS FEASABILITY

You use interaction between interaction whoring values and variable container values, and then apply that to expressions at the AST level, and then you can figure out from there the subset of suspected variables/values who might care about being told they participated in the expr.

This is just like pre-dispatching MMD at the compiler level.

Basically, you use the system to determine where it will be necessary. Isn't that cool?

If it's unused it should be zero overhead.

THEORETICAL USAGE

DEBUGGING TRAPS

multi sub trap_eval (Statement $x where { $x == $trapped }) {
	...
}

(i don't know how trap_eval will be named)

THE TAINTED TRAIT

A tainted trait should work by trapping

participated(_, $x is tainted, $y isn't tainted);

and then setting tainted on $y. This requires that the MMD dispatch doesn't care if

participated(X, $a, $b is tainted);

becomes

participated(X, $b, $a);

Similarly, file handles and system calls are supplemented with more specific MMD dispatches, whose prototype contains 'does tainted', to disallow tainted content, or wrapped to mark taintedness.

Untainting is either explicit, by use of functions which just remove the trait, or implicitly:

an trap apply of the match operator (or perhaps 'reduced' on an expression containting the application of a rule match... We'll see how the macro AST for perl expressions will look) will untaint the captures inside the rule match object, after they were infected by being derived.

You could also define expressions which taint using macros:

taints {
	...
}

the macro will take the AST for the block it encompasses, and taint everything that goes on inside the block.

DATA SEGREGATION

my $user1_dbh does segregated :parent($user1);

my $user2_dbh does segregated :parent($user2);

data coming from $user1_dbh will be segregated too, belonging to the object $user1.

when data from $user1 and $user2 meets, and exception is thrown.

If i'm implementing an IRC channel, for example, all input from a user's socket will be segregated.

Non commands will be unsegregated.

In this case a fatal error is thrown if due to a bug in the server the user's password as input by a command will be sent to a chatroom, which will try to make that data interact with other users sockets.

CODE COVERAGE METRICS

Just make a trap on statements, and make the callback record.

This model allows rather deep introspection, useful for detecting dead code.

PROFILING (OR NOT)

Just like you could do coverage analysis for code, you could profile it, if you can calculate the overhead of the hook MMDs out of the way. Perhaps counters on the ast also knowing time are in order. This might require parrotting though, but could definately be wrangled by these callbacks later.

We'll see.