Zero

Test

A minimal assembler and emulator for the Zero programming language.

The goal is to implement N-Way trees in Zero assembler code, optimize the assembler code assiduously through exhaustive testing, then realize the algorithm as an integrated circuit rather than as software on a conventional CPU so that large, extremely fast associative memories can be manufactured on an industrial scale.

Open the Actions tab to see the code in action.

The initial idea is to produce a small CPU which implements just the instructions needed to implement the algorithm. The small CPU will then be replicated across an fpga so that the tree can be queried in parallel.

Only one tree will be used: typically mapping 64 bit keys into 64 bit data. It will be useful to add additional data at the front of the keys such as data length, data position, process id, userid, time stamp etc. As the keys are sorted in the tree, trees with similar prefixes will tend to collect together so we can compress out the common prefix of the keys in each node to make better use of memory. Strings longer than 64 bits can be processed in pieces with each piece prefixed by the string id and the position in the string. Incoming strings can be made unique by assigning a unique 64 bit number to each prefix of the string so that a second such string can be easily recognized. Once such a long string has been represented by unique number it can be located in one descent through the tree; although a single traversal of the tree will no longer yield such strings in alphabetic order.

All communications with the chip will be done via USB . Incoming read requests can be done in parallel as long as there are processors left to assign work to. An update will have to wait for all existing finds to finish while stalling all trailing actions until the update is complete.

Associative look ups are the sine qua non of all Turing complete programming languages. This arrangement should produce very fast associative look ups - much faster than can be performed by any generic system reliant on external software. Usage of power and integrated circuit surface area should be reduced by having a minimal CPU to perform the look ups. Being able to deliver such look ups faster than can be done with conventional software solutions might prove profitable in much the same way as graphics chips and other chips used at scale.

Memory is addressed via named areas which act as flexible arrays with the usual indexing, push, pop, index, iteration, resizing and scan operations. Each procedure has its own stack frame implemented as a stack frame area, parameter area and return results area. Each area can grow as much as is needed to hold data. Additional user memory areas can be allocated and freed as necessary. Communication with other systems can be achieved by reading and writing to arrays with predetermined names.

Well known locations are represented by character == non numeric area ids. Stack frames, parameter and return areas are represented by negative area ids.

References can represent constants via a scalar with zero levels of dereferencing; direct addresses by scalars with one level of dereferencing, and indirect addresses by scalars with two levels of dereferencing. A reference consisting of an area, an offset within an area and an area name are represented as an array reference with three entries. A reference to a location in the current stack frame is represented as a single scalar with the appropriate levels of dereferencing. The area name is used to confirm that the area being processed is the one that should be being processed.

If you would like to be involved with this interesting project, please raise an issue saying so!

Emulator:

The emulator converts a Perl representation of the assembly source code to executable instructions and then executes these instructions.

Documentation

Code

Tests

N-Way-Tree

An implementation of N-Way-Trees in Zero assembler.

Documentation

Code

Tests

For documentation see: CPAN