Virtual Virtual Machine

2. Using `ccg`

This section introduces the platform-independent parts of the assembly language syntax accepted by ccg, and lists the directives that it recognises. It begins by describing the overall layout of a program file that uses ccg. This first subsection necessarily uses a few features before they are explained in detail. Detailed explanations of assembly language syntax and preprocessor directives are given in the subsequent subsections.

2.1 Program organisation

Two elements are required in any program that uses the ccg preprocessor. These are:

an initial processor type declaration;

followed by any number of runtime code generation sections consisting of the following:

a directive indicating the start of the runtime code section;
a directive controlling the address at which code is generated;
dynamic assembler statements written using a syntax natural for the assembly language of the selected processor type;
a directive indicating the end of the runtime code section.

First comes the directive that selects the processor type (in this case the Intel Pentium):


#cpu pentium

This directive has two effects:

it determines the assembler syntax for opcodes and operands that is accepted by the preprocessor; and
it inserts into the output an #include directive specific to the selected target processor (to include the header file that implements the appropriate runtime assembler).

Since it causes the inclusion of a header file, it must appear at the outermost (global) scope of the program source, outside any function definition.

Next come the runtime code generation sections. These must occur inside a function definition, since they are converted into executable C statements that perform code generation. A section begins with one of two directives:

#[

begins a runtime code section that requires no resolution of forward references (and which can therefore be generated in a single pass); the other

#{

begins a section which is generated in two passes, in order to correctly resolve forward references.

The ``begin'' directive is followed by a pseudo-op specifying the address at which code will be generated. Assuming that we have allocated a suitable region of memory and stored the address in the variable codeBuf, we would write:


.org    codeBuf

We can now write assembly language statements. Each statement is written using the natural syntax for the processor selected previously by the #cpu directive. Using the Pentium as an example, we might write two statements that implement a trivial function that returns an integer constant:


movl    $42, %eax
ret

(Note that the assembly language statements can refer freely to C program elements, such as variables or constants that were defined with #define.)

The end of the code section is delimited by a directive matching the one that began the code section; i.e.:

]#

for single-pass sections, or

}#

for two-pass sections.

Note that the begin/end directives are semantically equivalent to C braces, and therefore constitute a compound statement:


if (someCondition) #[
  ... generate consequent code
]# else #[
  ... generate alternate code
]#

We can now call the dynamically generated code, for example by casting the buffer address into an ``int function of void'' and then calling the result as a function:


int result= ((int(*)(void))codeBuf)();  /* execute the generated code */
printf("%d\n", result);
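The cast in the call above is dense, and a typedef makes the same idea easier to read. The following is a minimal sketch in plain C. The names int_fn, forty_two, code_address and call_code are hypothetical, and an ordinary C function stands in for the generated code so that the sketch does not depend on the target processor.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*int_fn)(void);    /* pointer to ``int function of void'' */

/* Stand-in for dynamically generated code. */
static int forty_two(void) { return 42; }

/* Return the stand-in's address as a data pointer, which is how the
   address of a code buffer would normally be held. */
void *code_address(void) { return (void *)forty_two; }

/* Convert the buffer address into a function pointer, then call it. */
int call_code(void *code)
{
  int_fn fn;
  assert(code != NULL);
  fn = (int_fn)code;            /* the cast is applied to the address... */
  return fn();                  /* ...and the call to the converted pointer */
}
```

Converting between data and function pointers is not guaranteed by ISO C, although it works on the platforms discussed in this document.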

The complete program would look like this:


#include <stdio.h>

#cpu pentium

int main()
{
  insn codeBuf[1024];
  int result;
  #[
        .org    codeBuf
        movl    $42, %eax
        ret
  ]#
  result= ((int(*)(void))codeBuf)();
  printf("%d\n", result);
  return 0;
}

Space is reserved for the code by declaring an array of insns. This type is defined by the #cpu directive to be appropriate for the target processor (currently unsigned int for 32-bit RISC processors and unsigned char for the Pentium).

Note that in the above example the dynamically generated code is placed in ``automatic'' storage. There are probably not very many other examples of C programs that execute functions whose code is in the program's stack!
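Many current operating systems mark the stack (and ordinary data) as non-executable, in which case code placed in an automatic array cannot be called. One possible alternative, sketched here under the assumption of a POSIX system that provides mmap and mprotect, is to obtain the buffer from the operating system and give its address to .org. The helper names alloc_code_buffer, make_executable and free_code_buffer are hypothetical and are not part of ccg.

```c
#define _DEFAULT_SOURCE         /* for MAP_ANONYMOUS with strict compilers */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Obtain a writable region in which code can be generated.
   Returns NULL on failure. */
void *alloc_code_buffer(size_t size)
{
  void *p;
  assert(size > 0);
  p = mmap(NULL, size, PROT_READ | PROT_WRITE,
           MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  return (p == MAP_FAILED) ? NULL : p;
}

/* After generation: make the region executable (and read-only).
   Returns 0 on success, -1 on failure, as mprotect does. */
int make_executable(void *buf, size_t size)
{
  return mprotect(buf, size, PROT_READ | PROT_EXEC);
}

/* Release the region. */
void free_code_buffer(void *buf, size_t size)
{
  munmap(buf, size);
}
```

The region is writable while code is generated and executable afterwards, never both at once, which some systems require.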

2.2 Assembly language statements

Dynamic assembly language statements have similar (although simplified) conventions to regular static assemblers. Each statement consists of up to four elements:

name: (optional)
defines a label with the given name. The label is a C variable (which must already have been declared, either explicitly in C or using the .label pseudo-op). (The corresponding code in the output file just stores the current assembly address into the variable name.)
opcode (optional)
this is transformed into a call to a runtime assembler macro that performs the assembly of the instruction named by opcode.
operands... (optional)
a comma-separated list of operands which are transformed into the arguments passed to the runtime assembler macro. (Note that on most architectures the types of the operands affect the name of the runtime assembler macro to permit ``overloading'' of a single opcode with several instructions taking different numbers or types of operands.)

Note that a dot (``.'') appearing anywhere in an operand is converted into the address of the opcode associated with the operand.
# comment... (optional)
everything between the ``#'' and the end of the line is converted into a C-style comment in the output.
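To illustrate the ``overloading'' mentioned under operands, the following plain C sketch builds a macro name from an opcode and a string of operand kinds. The naming scheme (upper-case opcode followed by one letter per operand, `i' for an immediate and `r' for a register) and the function mangle_macro_name are assumptions made purely for illustration; the names actually produced by ccg for a given target may differ.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Build e.g. "MOVLir" from opcode "movl" and operand kinds "ir".
   Returns 0 on success, or -1 if out is too small (in which case the
   contents of out are unspecified). */
int mangle_macro_name(const char *opcode, const char *kinds,
                      char *out, size_t outSize)
{
  size_t n = 0;
  assert(opcode != NULL && kinds != NULL && out != NULL);
  if (outSize == 0) return -1;
  for (; *opcode; ++opcode) {           /* opcode, in upper case */
    if (n + 1 >= outSize) return -1;
    out[n++] = (char)toupper((unsigned char)*opcode);
  }
  for (; *kinds; ++kinds) {             /* one letter per operand */
    if (n + 1 >= outSize) return -1;
    out[n++] = *kinds;
  }
  out[n] = '\0';
  return 0;
}
```

Under this scheme an instruction with an immediate source and a register destination, and the same opcode with two register operands, would select two different macros.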

Preprocessor directives are similar in appearance to the directives recognised by cpp, the regular C preprocessor:

#directive [argument]
where the ``#'' must appear in the first column for the directive to be recognised.

The directives are described in the next section.

Instead of an instruction (an opcode plus zero or more operands), an assembly language statement can take the form of a ``pseudo-op'':

.pseudoOp [argument]

Most of these are common to all target platforms, although some targets may provide additional platform-dependent pseudo-ops. The pseudo-ops are described in the next section.

The preprocessor checks all opcodes and operands for correct syntax as it processes the source file. It also checks that the operand types are legal for the opcode. If the preprocessor does not generate any error messages then the resulting .c[c] file should compile without any warnings or errors related to the runtime assembler. (If it does then you've found a bug in the preprocessor and/or runtime assembler macro definitions.)

The preprocessor is also careful to preserve the correspondence between input lines and output lines. Line numbers in error messages from cpp or the C compiler can therefore be related directly to the original ccg input file.

The following sections describe the preprocessor in detail. We begin by explaining the required elements of a program that uses runtime code generation. After that we will describe all of the preprocessor directives and common pseudo-ops that are available, and finish by giving a simple but complete example using many of the preprocessor's features.

2.3 Preprocessor directives and assembler pseudo-ops

Several directives can only appear before the #cpu directive (the ``preamble''), and some can appear anywhere. They are presented in these two categories.

Preamble directives

If present, the following directives must appear before the #cpu directive:

#localpc
indicates that the program will supply its own definitions of asm_pc and asm_pass. (These variables are normally declared by the target header file. The #localpc directive is primarily useful when there are multiple program files that are logically related and contain dynamic code sections: all but one of these files should include the #localpc directive to avoid multiple definitions. It is also useful when the function containing the dynamic code section wants to define asm_pc and/or asm_pass as register variables to improve performance.)

Zero or more of the above are followed by the required platform selection directive:

#cpu platform
selects platform as the target. Only one #cpu directive can appear in a given program file. It must appear after any #localpc directive.

General directives

The following directives can appear anywhere:

#quiet
disables warning messages. (Normally a warning is issued whenever a label is defined inside a single-pass (#[ ... ]#) code block.)
#comment commentChar
changes the current comment character. The default comment character is a hash (`#').
#escape escapeChar
changes the current escape character. The default escape character is an exclamation point (`!', sometimes also called ``bang'' or ``pling'').
#[
begins a single-pass dynamic code section. Forward references are not correctly resolved in these sections.
#{
begins a two-pass dynamic code section. Forward references are correctly resolved in these sections.
]#
ends a single-pass dynamic code section.
}#
ends a two-pass dynamic code section.
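The difference between single-pass and two-pass sections can be sketched in plain C. The sketch assumes that a label is an ordinary variable holding an address (as described for label definitions) and that a two-pass section simply runs the same generating statements twice; the Emitter type and the functions emit_byte, emit_jmp_short and assemble_skip are hypothetical and are not the code that ccg emits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
  uint8_t *pc;    /* current assembly address (the role played by asm_pc) */
  int      pass;  /* current pass number (the role played by asm_pass) */
} Emitter;

static void emit_byte(Emitter *e, uint8_t b) { *e->pc++ = b; }

/* Pentium short jump: opcode 0xEB, then an 8-bit displacement measured
   from the end of the two-byte instruction. */
static void emit_jmp_short(Emitter *e, const uint8_t *target)
{
  const uint8_t *next = e->pc + 2;
  emit_byte(e, 0xEB);
  emit_byte(e, (uint8_t)(target - next));
}

/* Assemble "jmp skip; nop; nop; skip: ret" into buf using the given
   number of passes; returns the number of bytes generated. */
size_t assemble_skip(uint8_t *buf, int passes)
{
  Emitter e;
  uint8_t *skip = buf;            /* label: address not yet known */
  assert(passes == 1 || passes == 2);
  for (e.pass = 1; e.pass <= passes; ++e.pass) {
    e.pc = buf;                   /* like ".org buf" */
    emit_jmp_short(&e, skip);     /* forward reference to skip */
    emit_byte(&e, 0x90);          /* nop */
    emit_byte(&e, 0x90);          /* nop */
    skip = e.pc;                  /* like "skip:" (store current address) */
    emit_byte(&e, 0xC3);          /* ret */
  }
  return (size_t)(e.pc - buf);
}
```

With one pass the jump is assembled before skip has been assigned, so its displacement byte is computed from the placeholder value (0xFE here, a jump to itself). With two passes the second pass sees the address stored during the first and the displacement is the correct 0x02.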

Pseudo-ops

The following pseudo-ops can only appear within a dynamic code section:

.org address
specifies the address at which the next dynamically generated instruction will be stored. At least one .org must be executed before any dynamic code is generated.
.label name...
declares one or more temporary program labels that can be used within the current dynamic code section. Note that labels declared in this manner must obey the placement restrictions of the host language. (In C they must appear before any assembly language statements, pseudo-ops or escaped C statements within dynamic code sections. In C++ there are no such restrictions, since declarations can appear anywhere.)
! anything
an ``escaped'' C statement. anything is included ``verbatim'' in the output program. This is useful for embedding arbitrary C statements within dynamic code blocks. Note that the `!' must appear in the first column for it to be recognised, and that it takes precedence over the comment character (in the case where the comment and escape characters have been redefined to the same thing).

2.4 Complete example

We will illustrate the preprocessor and runtime assembler using a simple (but complete) example for the Pentium. The program is shown first, followed by a line-by-line explanation.


 1  #cpu     pentium
 2  #comment !
 3  #escape  @
 4
 5  #include <stdio.h>
 6
 7  insn codeBuffer[1024];                  /* buffer for generated code */
 8
 9  int main()
10  {
11    typedef void (*pvfi)(int);            /* Ptr to Void Func of Int */
12    pvfi myFunction= (pvfi)codeBuffer;    /* the generated function */
13    insn *start, *end;                    /* a couple of labels */
14    /* generate some code... */
15    #{
16  @       printf("assembling: pass %d\n", asm_pass);
17          .org    myFunction              ! generate code at this address
18          ! prologue
19  start:  pushl   %ebp
20          movl    %esp, %ebp
21          ! body
22          pushl   8(%ebp)                 ! incoming argument
23          pushl   $(int)"generated %d bytes\n"
24          call    printf
25          ! epilogue
26          leave
27          ret
28  end:
29    }#
30    /* call the generated code, passing its size as argument */
31    myFunction(end - start);
32    return 0;
33  }

The program first selects the Pentium as the target processor (line 1). It then changes the comment and escape characters to `!' (in the Sparc style) and `@', respectively (lines 2 and 3). Line 13 declares two variables which will be assigned interesting addresses during code generation. The dynamic code section begins on line 15. Line 16 is an escaped C statement that prints the current assembly pass, followed by the pseudo-op to set the initial program address on line 17. The two ``external'' label variables are assigned using the usual label syntax on lines 19 and 28. The dynamic code section is closed on line 29.

(Note that the format string passed as an argument to the printf function on line 23 is a C expression. All arguments (including register numbers) can be given as arbitrary expressions, and can refer freely to the values of C variables in effect at the time the code is generated.)

Running the above program generates the following output:


assembling: pass 1
assembling: pass 2
generated 18 bytes

The first two lines of output are generated during the two passes through the dynamic code section during runtime assembly. The last line of output is generated by the generated code itself (from the call to printf on line 24 of the program).

2.5 Computing with register numbers

The runtime assemblers often encode registers with a number not obviously related to the ``index'' of the register. For example, the registers %g0, %o0, %l0 and %i0 on the Sparc all have different constant values that are not obvious to the client program. The situation is more complex on the Pentium where the registers %al, %ah, %ax and %eax all refer to the same physical register but have four different encodings to allow the assembler to choose the appropriate opcode width and operands.

Providing the program with access to the register encodings for a particular platform could be done by defining macros in the runtime assembler, but this is ugly. Instead, ccg provides a ``mini'' assembler section, delimited with #( and )#, which can contain a single register operand (and nothing else!). The value of this expression is the encoding used by the assembler for the given register.

The following examples are intentionally a little obscure in places, to illustrate the possibilities. On the Sparc:


static const int stackPointer  = #( %sp )#;
static const int framePointer  = #( %fp )#;

void genReturnReg(int index)
{
  if ( #(%r(index))# != #(%i0)# ) #[
        mov     %r(index), %i0
  ]#
  #[
        ret
        restore
  ]#
}

void genReturnArg(int index)
{
  genReturnReg(#(%i(index))#);
}

And on the Pentium:


int addImmediate(Datum *data)
{
  int destReg= 0;
  switch (data->size) {
    case 1:      destReg= #( %cl  )#; break;
    case 2:      destReg= #( %cx  )#; break;
    case 4:      destReg= #( %ecx )#; break;
    default: abort();
  }
  generateAddInto(data->value, destReg);
  return destReg;
}

2.6 Synchronising the instruction and data caches

As with any dynamic code generation system, the data and instruction caches must be flushed before executing the generated code. (This is not necessary in the simple examples in this document because all Pentium machines appear to have unified caches that do not require explicit synchronisation.)

It is the client program's responsibility to flush the caches. All of the runtime assemblers provide the macro


iflush(insn *firstAddress, insn *lastAddress)

to perform this flushing. (On the Pentium this is a no-op. On the Sparc and PowerPC, failure to correctly flush the caches will result in an illegal instruction trap.) Typical use would be:


void aCodeGenerator(void)
#[
        .org    codeAddress
        ... statements ...
!       iflush(codeAddress, asm_pc);
]#

(asm_pc contains the address of the byte after the last instruction that was generated.)
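For client code that manages generated code outside the runtime assembler, a similar effect can be sketched with the __builtin___clear_cache builtin provided by GCC and Clang. This is an assumption about the compiler in use and is not how iflush is implemented; the wrapper sync_code_range is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* Synchronise the instruction and data caches over [first, last) and
   return the number of bytes covered.  Assumes GCC or Clang; on
   processors that need no explicit flush the builtin generates no code. */
size_t sync_code_range(unsigned char *first, unsigned char *last)
{
  assert(first <= last);
  __builtin___clear_cache((char *)first, (char *)last);
  return (size_t)(last - first);
}
```

As with iflush, the range runs from the first generated instruction to the byte after the last one, and the call belongs after generation and before the first call into the generated code.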

2.7 Using `ccg` with `make`

The following implicit Makefile rules will sensibly control the regeneration of .c and .cc files from the corresponding ccg input files (with .cg and .ccg extensions, respectively).


# path to ccg preprocessor
CCG:=           ccg/ccg

.SUFFIXES:      .cg .ccg

%.c:            %.cg
                $(CCG) -o $@ $<

%.cc:           %.ccg
                $(CCG) -o $@ $<

# include the following if you want to keep intermediate .c[c]
# files (e.g. for source file listing during debugging)
.PRECIOUS:      %.c %.cc

The distribution includes the file ccg.mk which contains the above definitions and rules. To use it simply make a link to the ccg distribution directory in your source directory and then add


include ccg/ccg.mk

to your existing Makefile.

2.8 Useful runtime assembler macros

Several of the macros used internally by the runtime assemblers are probably useful to the client program. They include:

_uiP(W,UI): is a boolean predicate returning ``true'' if the unsigned integer UI can be fully represented in a field of W bits.
_siP(W,SI): is a boolean predicate returning ``true'' if the signed integer SI can be fully represented in a field of W bits.
_MASK(W): returns an unsigned long in which the bottom W bits are set, and all other bits cleared.
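The behaviour described for these macros can be sketched as ordinary C functions. These are hypothetical stand-ins written from the descriptions above (low_mask for _MASK, fits_unsigned for _uiP, fits_signed for _siP), not the definitions used by the runtime assemblers, and they assume field widths between 1 and 32 bits.

```c
#include <assert.h>

/* An unsigned long with the bottom w bits set and all others clear. */
unsigned long low_mask(unsigned w)
{
  assert(w >= 1 && w <= 32);
  return (w == 32) ? 0xFFFFFFFFUL : ((1UL << w) - 1UL);
}

/* True if ui can be fully represented in an unsigned field of w bits. */
int fits_unsigned(unsigned w, unsigned long ui)
{
  return (ui & ~low_mask(w)) == 0;
}

/* True if si can be fully represented in a two's complement field of
   w bits, i.e. -2^(w-1) <= si <= 2^(w-1) - 1. */
int fits_signed(unsigned w, long si)
{
  long hi = (long)(low_mask(w) >> 1);   /* 2^(w-1) - 1 */
  long lo = -hi - 1;                    /* -2^(w-1)    */
  return si >= lo && si <= hi;
}
```

A client might use such a predicate to decide whether a constant fits an immediate field (for example the 13-bit signed immediate of the Sparc) or must instead be loaded into a register first.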