Author: M.L.Gassanenko
(c) M.L.Gassanenko, 1996-98
History:
ver. 1 - published in Forth Dimensions, v.XVII, n.5, Jan-Feb 1996
ver. 2.1 - converted to HTML

Scattering a Colon Definition

M.L. Gassanenko
mlg@forth.org

The Problem

The technique presented here is used in a prefix-notation Forth assembler that performs initialization actions (resetting variables, etc.) each time before it processes a new instruction. The assembler has many switches that are set according to the defaults and the instruction operands, and that then determine what to do. Some switches are "functional": executing such a switch executes a Forth word, and setting or resetting the switch determines which word is executed. (The switches are implemented as multi-cfa words.) Thanks to these numerous switches, the definitions of instruction groups are as readable as instruction formats, but the initialization code grows. Had the initialization code been written as a separate definition, it would occupy two densely typed block screens; sequential files would not be a solution, because one dense screen of a sequential file is no better than two blocks.

So, the problem is that initialization actions belong to two different modules at the same time: to the module they initialize and to the general initialization module. We want to distribute these actions so that they will be located in the modules they initialize, but used as a single definition.

The Solution

The solution scatters the code of the initialization definition over the screens where the things to be initialized are used.
The words ... , ..: , and ;.. are used as follows:
: INIT ... <some initialization actions> ;
<something>
..: INIT <more initialization actions> ;..
<something>
..: INIT <more initialization actions> ;..
<and so on>
The generated code looks like this:

[Figure: how the generated code looks]

Implementation


The implementation code below is non-standard, but very portable.

With the definitions:

\ fetch/store a reference that e.g. follows a BRANCH
: REF@ ( orig -- dest ) DUP @ + ;          \ the branch addresses
: REF! ( dest orig -- ) TUCK - SWAP ! ;     \      are relative

\ add size of a compiled token
: TOKEN+ ( addr -- addr' ) CELL+ ;

We might define:

: >MARK    ( -- orig ) HERE 0 , ;
: >RESOLVE ( orig -- ) HERE SWAP REF! ;
: <MARK    ( -- dest ) HERE ;
: <RESOLVE ( dest -- ) HERE CELL ALLOT REF! ;
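
The relative references can be tried out interactively. The lines below are a minimal sketch, assuming a standard Forth that provides TUCK; SLOT and TARGET are hypothetical names used only for this illustration, and the definitions are repeated so that the fragment loads on its own (a system that already has >MARK and >RESOLVE may report a redefinition).

```forth
\ Sketch: the relative references above, exercised in a standard Forth
: REF@ ( orig -- dest ) DUP @ + ;
: REF! ( dest orig -- ) TUCK - SWAP ! ;
: >MARK    ( -- orig ) HERE 0 , ;
: >RESOLVE ( orig -- ) HERE SWAP REF! ;

CREATE SLOT 0 ,                 \ hypothetical cell that will hold a reference
CREATE TARGET 0 ,               \ hypothetical destination
TARGET SLOT REF!                \ SLOT now holds the offset TARGET - SLOT
SLOT REF@ TARGET = .            \ prints -1 (true): the reference round-trips

>MARK                           \ reserve a cell for a forward reference
111 , 222 ,                     \ lay down two cells of filler
DUP >RESOLVE                    \ point the reserved cell at HERE
REF@ HERE = .                   \ prints -1 (true)
```

Because the stored value is an offset from the cell that holds it, the reference remains valid if the code is moved as a whole.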

And now we can define:

: ... COMPILE BRANCH >MARK >RESOLVE ; IMMEDIATE
: ..: ' >BODY TOKEN+ DUP REF@ SWAP >RESOLVE !CSP 400 ] ;
: ;.. 400 ?PAIRS ?CSP COMPILE BRANCH <RESOLVE [COMPILE] [ ; IMMEDIATE
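
On a system that exposes neither BRANCH nor the layout of compiled code, a similar usage pattern can be approximated in standard Forth. The sketch below is not the branch-patching technique of this article: it swaps in a linked list of execution tokens, which costs an indirect call per fragment. The names SCATTERED, +FRAGMENT, RUN-CHAIN, OPSIZE and SEG-PREFIX are hypothetical and chosen only for illustration.

```forth
\ Portable approximation, NOT the branch-patching code above:
\ the body of a scattered word holds the head of a linked list of
\ nodes [ link | xt ]; executing the word runs every fragment.

: RUN-CHAIN ( head -- )
   BEGIN ?DUP WHILE               \ stop at the 0 link
      DUP >R  CELL+ @ EXECUTE     \ run this fragment
      R> @                        \ follow the link
   REPEAT ;

: SCATTERED ( "name" -- )   CREATE 0 ,  DOES> @ RUN-CHAIN ;

: +FRAGMENT ( xt "name" -- )      \ prepend xt to the chain of name
   ' >BODY  HERE >R               \ R: address of the new node
   DUP @ ,                        \ link field = old head
   SWAP ,                         \ xt field
   R> SWAP ! ;                    \ head = new node

\ Usage: each module adds its own fragment next to the thing it resets
VARIABLE OPSIZE   VARIABLE SEG-PREFIX      \ illustrative switches

SCATTERED INIT
:NONAME ( -- ) 0 OPSIZE ! ;      +FRAGMENT INIT
\ ... code that uses OPSIZE ...
:NONAME ( -- ) 0 SEG-PREFIX ! ;  +FRAGMENT INIT
```

In this sketch the fragments run newest first, because each one is prepended to the list.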

In F-PC there may be some problems with long jumps and long addresses. Note that a new branching word is defined:

: BRANCHL 2R> REF@L 2>R ;

F-PC with its double-cell addresses isn't well-suited for return address manipulations and code generation tricks. The F-PC code is given below:

anew scatter.seq

: REF@L ( orig-seg orig-off --- dest-seg dest-off )
           2DUP   2+ @L XSEG @ +   -ROT   @L ;
: REF!L ( dest-seg dest-off orig-seg orig-off --- )
           2DUP 2>R   !L    XSEG @ - 2R> 2+ !L ;
: TOKEN+ 2+ ;

: >MARKL    ( -- Dorig ) XHERE 0 0 X, X, ;
: >RESOLVEL ( Dorig -- ) XHERE 2SWAP REF!L ;
: <MARKL    ( -- Ddest ) XHERE ;
: <RESOLVEL ( Ddest -- ) XHERE 0 0 X, X, REF!L ;

: BRANCHL 2R> REF@L 2>R ;
: >TCODE ( cfa -- seg off ) >BODY @ XSEG @ + 0 ;

: ?PAIRS XOR ABORT" NON-PAIRED WORD" ;

: ... COMPILE BRANCHL >MARKL >RESOLVEL ; IMMEDIATE
: ..: ' >TCODE TOKEN+ 2DUP REF@L 2SWAP >RESOLVEL !CSP 400 ] ;
: ;.. 400 ?PAIRS ?CSP COMPILE BRANCHL <RESOLVEL [COMPILE] [ ; IMMEDIATE

Why a Special Construct

The evident benefit of this tool is that the programmer does not have to modify the initialization definition when adding a new mechanism to a growing program. Deleting a mechanism also becomes painless: if you do not load a block, its initialization does not get compiled.

In F-PC this problem is usually solved by means of DEFER variables. We think that a special construct is better because it:

1) is laconic;

2) is more readable: the purpose can be understood at first glance;

3) uses no auxiliary names (which have no meaning in themselves).
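
For comparison, here is one way a DEFER-based chain might be written. It is a sketch, assuming a system with the Forth-2012 words DEFER, IS and ACTION-OF; the idiom actually used in F-PC programs may differ. All names (NO-INIT, OLD-INIT-1, INIT-1, OPSIZE, SEG-PREFIX and so on) are hypothetical.

```forth
\ Hypothetical comparison: the same chain built with a deferred word
VARIABLE OPSIZE   VARIABLE SEG-PREFIX      \ illustrative switches

: NO-INIT ( -- ) ;                         \ empty initial action
DEFER INIT   ' NO-INIT IS INIT

\ module 1
ACTION-OF INIT CONSTANT OLD-INIT-1         \ auxiliary name
: INIT-1 ( -- ) OLD-INIT-1 EXECUTE  0 OPSIZE ! ;
' INIT-1 IS INIT

\ module 2
ACTION-OF INIT CONSTANT OLD-INIT-2         \ another auxiliary name
: INIT-2 ( -- ) OLD-INIT-2 EXECUTE  0 SEG-PREFIX ! ;
' INIT-2 IS INIT
```

Each module here needs three lines and two names that serve no other purpose, which is the overhead that points 1) and 3) refer to.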

Conclusion


The technique presented here enables the programmer to distribute fragments of code that should execute as one definition across the modules they logically belong to.

Listing 1 - for the traditional architecture
Listing 2 - for F-PC