Virtual Virtual Machine

3. Platform-specific notes

This section defines the syntax accepted by the preprocessor for each of the supported platforms, and briefly mentions any significant differences between the ``standard'' assembly language conventions for the platform and those supported by the preprocessor and runtime assemblers.

The following subsections assume familiarity with the assembly language of each processor. No attempt is made to explain obscure (but standard) assembly language features.

3.1 PowerPC

The preprocessor and assemblers follow the conventions given in:

PowerPC Microprocessor Family: The Programming Environments For 32-Bit Microprocessors, Motorola, 1997.

The destination operand is always on the left, immediate operands always on the right, and registers are always prefixed with `r' to distinguish them from immediate values. The immediate opcodes have an optional suffix `i':


add     r3, r4, r5      # load r3 with the sum r4 + r5
addi    r3, r4, 5       # load r3 with the sum r4 + 5
add     r3, r4, 5       # synonym for "addi r3, r4, 5"

Memory addresses are indicated by a base register in parentheses preceded by an immediate offset, or two registers with the `x' form of the instruction:


lwz     r3, 4(r5)       # load r3 from the word after r5
lwzx    r3, r4, r5      # load r3 from the word at r4+r5
lwz     r3, r4(r5)      # synonym for "lwzx r3, r4, r5"

Opcodes that side-effect the condition codes have a dot (`.') suffix:


add     r3, r4, r5      # load r3 with the sum of r4 + r5
add.    r3, r4, r5      # load r3 with the sum of r4 + r5 and set cr0 accordingly
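
As an illustration of what the assembler must distinguish here, the following is a minimal sketch in C of the encodings behind `add', `add.' and `addi', using the field layout from the Motorola manual. The function names ppc_add and ppc_addi are hypothetical and are not part of ccg; the actual runtime assembler macros may be organised quite differently.

```c
#include <stdint.h>

/* Register form: add rD, rA, rB (primary opcode 31, extended opcode 266).
   rc is 1 for the dotted form "add.", which also updates cr0. */
uint32_t ppc_add(unsigned rD, unsigned rA, unsigned rB, unsigned rc)
{
    return (31u << 26) | (rD << 21) | (rA << 16) | (rB << 11)
         | (266u << 1) | (rc & 1u);
}

/* Immediate form: addi rD, rA, simm (primary opcode 14).
   The signed 16-bit immediate occupies the low half of the word. */
uint32_t ppc_addi(unsigned rD, unsigned rA, int simm)
{
    return (14u << 26) | (rD << 21) | (rA << 16) | ((uint32_t)simm & 0xFFFFu);
}
```

The dot suffix changes only the lowest bit of the instruction word, whereas the immediate form uses a different primary opcode altogether. This is why the assembler can treat "add r3, r4, 5" as a synonym for "addi r3, r4, 5": the kind of the final operand is enough to choose between the two encodings.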

Simplified mnemonics

Many of the simplified branch mnemonics, and all of the simplified mnemonics for non-branch instructions, are recognised.

Limitations

The branch and compare instructions support optional condition code register and bit operands. There is no assembler support for the specification of these as symbolic constants. They are however treated as immediate operands, and so it is trivial to define constants for these registers and bit positions when needed. For example:


#cpu ppc

#define cr0     0
#define cr1     1
#define cr2     2
#define cr3     3

#define lt      0
#define gt      1
#define eq      2
#define so      3

void aCodeGenerator(void)
#[
        cmpl    cr2, 0, r3, r4
        bt      cr2*4+eq, equalLabel
        bt      cr2*4+lt, lessLabel
]#

(This corresponds precisely to the syntax defined by Motorola.) If the condition code register is not specified in compare and branch instructions then it is implicitly cr0.
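
The arithmetic in an operand such as cr2*4+eq can be checked with a small sketch in plain C. The names CR_LT, CR_GT, CR_EQ, CR_SO and cr_bit are hypothetical, chosen only for this illustration; they mirror the #define constants shown above.

```c
/* Bit positions within one 4-bit condition register field,
   matching the constants defined in the example above. */
enum { CR_LT = 0, CR_GT = 1, CR_EQ = 2, CR_SO = 3 };

/* Index of a condition bit within the 32-bit condition register:
   each field crN occupies four consecutive bits. */
int cr_bit(int field, int bit)
{
    return field * 4 + bit;
}
```

With these definitions cr_bit(2, CR_EQ) is 10, which is the immediate value that the branch instruction receives for "cr2*4+eq".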

3.2 Sparc

Opcode and operand syntax are as defined in:

SPARC International, The SPARC Architecture Manual, Version 8, Prentice-Hall, 1992.

Registers are prefixed with `%r'. They can also be specified by their alternative mnemonic forms:

%g0 .. %g7

correspond to registers %r0 through %r7

%o0 .. %o7

correspond to registers %r8 through %r15

%l0 .. %l7

correspond to registers %r16 through %r23

%i0 .. %i7

correspond to registers %r24 through %r31

%sp

corresponds to register %o6 (or %r14)

%fp

corresponds to register %i6 (or %r30)
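
The correspondence between the alternative mnemonics and the %r numbers can be summarised by the following sketch in C. The function sparc_reg is a hypothetical helper written for this illustration; it is not part of ccg.

```c
/* Map a register class ('g', 'o', 'l' or 'i') and an index 0..7
   to the corresponding %r number; returns -1 for invalid input. */
int sparc_reg(char cls, int n)
{
    if (n < 0 || n > 7)
        return -1;
    switch (cls) {
    case 'g': return n;        /* %g0..%g7 are %r0..%r7   */
    case 'o': return 8 + n;    /* %o0..%o7 are %r8..%r15  */
    case 'l': return 16 + n;   /* %l0..%l7 are %r16..%r23 */
    case 'i': return 24 + n;   /* %i0..%i7 are %r24..%r31 */
    default:  return -1;
    }
}
```

For example, sparc_reg('o', 6) gives 14 (the stack pointer, %sp) and sparc_reg('i', 6) gives 30 (the frame pointer, %fp).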

The two operators for extracting the low and high portions of immediates are supported:

%lo(imm32)

is the low 10 bits of imm32

%hi(imm32)

is the high 22 bits of imm32, shifted right by 10 bits without sign extension (suitable for use as the immediate operand for a sethi instruction)
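
A minimal sketch in C of the two operators, following the definitions just given. The function names sparc_hi, sparc_lo and sparc_join are hypothetical and serve only to show that the two parts recombine to the original value.

```c
#include <stdint.h>

/* %hi(imm32): the high 22 bits, shifted right by 10 without sign extension. */
uint32_t sparc_hi(uint32_t imm32) { return imm32 >> 10; }

/* %lo(imm32): the low 10 bits. */
uint32_t sparc_lo(uint32_t imm32) { return imm32 & 0x3FFu; }

/* What a sethi followed by an or reconstructs in the destination register. */
uint32_t sparc_join(uint32_t hi, uint32_t lo) { return (hi << 10) | lo; }
```

For the value 0x12345678 the high part is 0x48D15 and the low part is 0x278; joining them yields 0x12345678 again.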

All of the ``synthetic'' instructions listed in SPARC International documentation are supported (including the set instruction which expands to one or two instructions depending on the size of its immediate argument).
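
The choice made by the set instruction can be sketched as follows. This follows the expansion rules given in the SPARC Architecture Manual; whether the ccg assembler applies its tests in exactly this order is an assumption, and sparc_set_length is a hypothetical name used only for this illustration.

```c
#include <stdint.h>

/* Number of instructions needed to load a 32-bit value with "set". */
int sparc_set_length(int32_t value)
{
    if (value >= -4096 && value <= 4095)
        return 1;   /* fits in a signed 13-bit immediate: or %g0, value, rd */
    if (((uint32_t)value & 0x3FFu) == 0)
        return 1;   /* low 10 bits are zero: sethi %hi(value), rd */
    return 2;       /* sethi %hi(value), rd ; or rd, %lo(value), rd */
}
```

A code generator that needs to know the size of the generated code in advance (for example to compute branch offsets) must allow for both cases.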

Memory operands

The suggested Sparc syntax does not place square brackets around memory locations when the location itself is the operand (and not the contents of the location); e.g. for jump instructions:


jmpl    %o7+8, %g0

Handling this in a clean manner was too painful, and such instructions must therefore use the bracketed memory notation:


jmpl    [%o7+8], %g0

Computed register numbers

Computed registers require parentheses around the register number, for example:


add     %i(argIn), %i(argIn+1), %o(argOut)
ld      [%r(baseReg)+offset], %r(destReg)
ld      [%r(baseReg)+%r(indexReg)], %r(destReg)
jmpl    [%r(isLeafProc ? 15 : 31)+8], %g0       # choose retl or ret

Comment character

SPARC International suggest `!' as the comment character. In ccg it remains the usual `#' although it can be changed trivially by including


#comment !

near the start of the source file, or by specifying ``-c !'' on the ccg command line.

3.3 Pentium

The opcode names follow the specification given in:

Intel Architecture Software Developer's Manual Volume 2: Instruction Set Reference, Intel Corporation, 1997.

Source and destination operand positions

There is serious disagreement between Intel and the Free Software Foundation (the GNU people) over the position of the source/destination operands. Intel syntax specifies the first operand as the destination (data moves from right to left), for example:


movl    %eax, $42       # load %eax with 42

On the other hand the GNU assemblers and disassemblers all put the destination in the final operand (data moves from left to right):


movl    $42, %eax       # move 42 into %eax

We choose to follow the FSF/GNU convention since it is more logical, intuitive, aesthetically pleasing, and possibly even more ethically sound (considering the identity of Intel's most [in]famous client).

Computed register indices

As in the Sparc assembler, ``computed'' register indices must be placed in parentheses after the `%' prefix. For example:


void gen_load(int value, int destReg)
{
  #[
        movl    $value, %(destReg)
  ]#
}

...
  gen_load(42, #( %eax )#);
...

Additional pseudo-ops

The Pentium target provides an additional pseudo-op ``.align N'', where N is an alignment (expressed in bytes) between 1 and 8.
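
The effect of an alignment directive of this kind can be described by the number of padding bytes it must insert. The following sketch in C shows the usual calculation; it is an illustration only, and align_padding is a hypothetical name, not the ccg implementation.

```c
#include <stdint.h>

/* Number of bytes to skip so that pc becomes a multiple of n (n >= 1). */
unsigned align_padding(uintptr_t pc, unsigned n)
{
    return (unsigned)((n - pc % n) % n);
}
```

For example, at address 13 an alignment of 4 requires 3 bytes of padding, while an address that is already aligned requires none.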

Assembler complexity

Compared with RISC processors, the Pentium has a very complex instruction set. The runtime assembler must perform a lot of dynamic analysis to determine the optimal instruction to generate for a given combination of opcode and operands. It is therefore rather complex (consisting of approximately 640 macro definitions, many of which contain hairy conditional expressions to select the correct opcode prefix byte(s), sign-extension and operand width bits, and so on). A given Pentium opcode can generate anything from 1 to 13 bytes of code, depending on the nature of the operands that are passed to it.
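
To give a flavour of the analysis involved, here is a minimal sketch in C of one such decision: choosing between the short and long encodings of "addl $imm, %reg". The opcode bytes are those given in the Intel manual (83 /0 ib and 81 /0 id). The function emit_addl_imm is hypothetical and much simpler than the real ccg macros, which must also handle memory operands, operand widths and prefixes.

```c
#include <stddef.h>
#include <stdint.h>

/* Emit "addl $imm, %reg" into buf, where reg is a 32-bit register number
   (0 = %eax, 1 = %ecx, ... 7 = %edi). Returns the number of bytes written. */
size_t emit_addl_imm(uint8_t *buf, int32_t imm, unsigned reg)
{
    /* ModRM byte: mod = 11 (register direct), opcode extension /0 = ADD. */
    uint8_t modrm = (uint8_t)(0xC0u | (reg & 7u));

    if (imm >= -128 && imm <= 127) {
        buf[0] = 0x83;              /* imm8 form, sign-extended to 32 bits */
        buf[1] = modrm;
        buf[2] = (uint8_t)imm;
        return 3;
    }
    buf[0] = 0x81;                  /* imm32 form */
    buf[1] = modrm;
    for (int i = 0; i < 4; i++)     /* little-endian immediate */
        buf[2 + i] = (uint8_t)((uint32_t)imm >> (8 * i));
    return 6;
}
```

Even this single case yields either 3 or 6 bytes depending on the value of an operand that is known only at runtime. The real assembler makes many such choices for every statement.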

Because of this complexity a Pentium dynamic assembler statement ultimately generates many hundreds of characters of C code (after the compiler has passed the .c file through cpp) containing deep conditional expressions. The compiler has to work quite hard to optimise this mess. (The insatiably curious might like to run ``gcc -E'' on a .c file produced by ccg to see the extent of the problem.) For example, with gcc the compile-time overhead is about 50-100 kilobytes of virtual memory for each dynamic assembler statement. Under these conditions it is not difficult to cause the compiler to run out of memory.

The solution is to split the dynamic code sections over several different C functions, limiting each function to no more than a hundred or so assembler statements. (The compiler discards some intermediate structures after processing each function.) For very complex code generators the functions containing dynamic code sections can be placed in independent source files to further reduce the memory requirements.

Note that this problem does not exist with the runtime assemblers for PowerPC and Sparc, which both have significantly simpler instruction formats requiring significantly less runtime analysis.