Eigenstate : Myrddin on Plan 9

Porting Myrddin to Plan 9

Porting Myrddin to new systems is not especially difficult. It is especially easy when porting to a Posixy system that uses the GNU toolchain, because all of the parts are in place. However, the amount of system-specific code that Myrddin depends on is small, and porting to more exotic systems is still easy.

There were a number of parts to the Plan 9 port. First, the backend needed to be taught how to generate Plan 9 assembly and object files. Second, the startup code needed to be written. At that point, a program that does no input or output will run.

From there, the libraries needed to be ported. They were initially designed with a clean separation between system specific code and shared code, and porting to Plan 9 only helped increase the quality of this separation.

The Differences

Plan 9 is not Unix. It isn't even Posix. It bears a family resemblance, but it does many things very differently. To produce a compiler that is natively usable takes some work. To make that compiler produce binaries that behave natively is a bit more work on top of that.

The largest differences that need to be handled:

Looking at this list, it turns out that the work isn't so hard. The biggest problems with porting involve dealing with the non-posix system calls. The rest have more to do with being a good citizen than with actual technical challenges.

The Compiler

While it's strictly possible to do the port with a cross compiler, it feels a great deal less clunky to do the work natively on the target system. Plan 9 comes with APE, the ANSI/POSIX Environment. It also comes with Posix make, which doesn't deal with the GNU makefiles that I ship with. A port of GNU Make is available. However, the right way was to ship with mkfiles instead.

APE is neither especially comprehensive nor solidly implemented, and I ran into a few issues with missing C99 headers such as stdint.h, as well as a minor bug with printf implementing %lld format strings incorrectly. However, the 9front community picked up my fixes extremely quickly.

As mentioned earlier, APE's version of Make is extremely basic, and supported approximately none of the libraries I had put together. Using GNU make is possible, but it isn't available by default. And while the APE compiler does follow the Posix specification, it's different enough that simply reusing the existing Makefiles would be uncomfortable.

The answer is to use the native build system, mk. Because Plan 9 comes with templates for building software, it takes very little effort to write mkfiles that cover the use cases. They end up looking something like:

</$objtype/mkfile
TARG=6m
OFILES=\
    blob.$O\
    ben.$O\
    ...

LIB=../util/libutil.a ../parse/libparse.a ../mi/libmi.a
HFILES=asm.h ../parse/parse.h ../mi/mi.h \
    .../config.h insns.def regs.def
BIN=/$objtype/bin

</sys/src/cmd/mkone

uninstall:V:
    rm -f /$objtype/bin/$TARG

With that out of the way, getting the code running was straightforward. Plan 9 yacc accepted the grammar Myrddin uses out of the box. The C compilers warned on a few things that GCC allowed through. I fixed the warnings, and had a compiler that would generate Linux binaries on Plan 9.

The Assembly Backend

The first order of business when porting a compiler is making sure it can generate binaries for the target platform. Plan 9's toolchain is rather different from the toolchains found on most other platforms. The C ABI is different too, but because I don't link against C, I decided that I didn't care. I use the same SysV-ish ABI on both systems.

Because Myrddin doesn't care about the ABI, instructions are instructions. Porting the assembly generation was largely formatting the same data differently. Another format string was added to insns.def. A formatter was also added for blobs to write out the rather harebrained data format that Plan 9 uses.

For example, if I were compiling the following code:

const fn = {
    -> 42
}

The assembly output on a system that uses AT&T syntax would look like:

.text
fn:
    pushq %rbp
    movq %rsp,%rbp
    movl $42,%eax
.Lret0:
.L0:
    movq %rbp,%rsp
    popq %rbp
    ret

However, when compiling for the Plan 9 assemblers, it would look like:

TEXT fn<>+0(SB),$0
    PUSHQ BP
    MOVQ SP,BP
    MOVL $42,AX
.Lret0:
.L0:
    MOVQ BP,SP
    POPQ BP
    RET

The assembly format is best documented at http://9p.io/sys/doc/asm.html. It also bears a very strong family resemblance to the Go assembler, described at https://golang.org/doc/asm.
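To give a feel for the data format that the blob formatter has to produce, here is a minimal C sketch that writes a byte array as Plan 9 DATA and GLOBL directives, one byte per directive. The function name writeblob, its signature, and the one-byte-per-line layout are assumptions made for this illustration. They are not taken from genp9.c, which may group bytes differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: format `n` bytes as Plan 9 assembler data
 * directives for the symbol `sym`. Each byte becomes
 *     DATA sym+off(SB)/1,$value
 * and a closing GLOBL line declares the symbol and its total size.
 * Returns 0 on success, -1 if `out` (of size `sz`) is too small.
 */
int writeblob(char *out, size_t sz, const char *sym,
              const unsigned char *bytes, size_t n)
{
    size_t len = 0;
    int w;

    assert(out != NULL && sym != NULL && (bytes != NULL || n == 0));
    for (size_t i = 0; i < n; i++) {
        /* offset i, width 1, value printed in decimal */
        w = snprintf(out + len, sz - len, "DATA %s+%zu(SB)/1,$%u\n",
                     sym, i, (unsigned)bytes[i]);
        if (w < 0 || (size_t)w >= sz - len)
            return -1;
        len += (size_t)w;
    }
    w = snprintf(out + len, sz - len, "GLOBL %s(SB),$%zu\n", sym, n);
    if (w < 0 || (size_t)w >= sz - len)
        return -1;
    return 0;
}
```

Compared with a single .ascii or .byte directive in AT&T syntax, every piece of data carries its own symbol, offset, and width, which is why a dedicated formatter is needed rather than a small tweak to the existing one.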

Instruction names are ALLCAPS. They mostly share the same operand ordering as the AT&T syntax, with some exceptions. For example, it took some time to realize that the CMP instruction did not match the argument order of the SUB instruction. As a result, the format strings in insns.def grew indexes after the %, allowing me to specify "CMP%T %2R,%1X".
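The idea behind indexed format strings can be sketched in C as follows. This is a simplified illustration, not the code in the compiler: the function name expand, the treatment of %T as a plain suffix string, and the rule that an unindexed operand takes the slot after the previous one are all assumptions. The class letter after the index (R, X) is consumed but not checked here.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical expander for templates such as "CMP%T %2R,%1X".
 *   %T          inserts `suffix` (for example "Q" for 64-bit operands)
 *   %<n><class> inserts operand n (1-based); <class> is one letter
 *   %<class>    inserts the operand after the previously used one
 * Returns 0 on success, -1 on a malformed template, a bad index,
 * or an output buffer that is too small.
 */
int expand(char *out, size_t outsz, const char *fmt, const char *suffix,
           const char *const *args, size_t nargs)
{
    size_t len = 0;
    size_t next = 0; /* operand used when no index is given */

    assert(out != NULL && fmt != NULL && suffix != NULL);
    if (outsz == 0)
        return -1;
    out[0] = '\0';
    for (const char *p = fmt; *p; p++) {
        char lit[2] = {*p, '\0'};
        const char *piece = lit;

        if (*p == '%') {
            p++;
            if (*p == 'T') {
                piece = suffix;
            } else {
                size_t idx = next;
                if (isdigit((unsigned char)*p)) {
                    /* "%0" wraps to a huge index and is rejected below */
                    idx = (size_t)(*p - '0') - 1;
                    p++;
                }
                if (*p == '\0' || idx >= nargs)
                    return -1;
                piece = args[idx]; /* *p is the class letter */
                next = idx + 1;
            }
        }
        size_t n = strlen(piece);
        if (len + n + 1 > outsz)
            return -1;
        memcpy(out + len, piece, n + 1);
        len += n;
    }
    return 0;
}
```

With the operands AX and BX and the suffix "Q", the template "SUB%T %R,%X" expands to "SUBQ AX,BX", while "CMP%T %2R,%1X" expands to "CMPQ BX,AX". The swap lives entirely in the table entry, so the code that selects instructions does not need to know about it.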

After this work, the generation of assembly code was split between gen.c, which initializes the output state and drives the assembly generation; gengas.c, which generates code for systems using assemblers compatible with AT&T syntax; and genp9.c, which generates Plan 9 syntax.
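One way to picture this split is a driver that picks a syntax-specific backend from a table. The C sketch below is only an illustration under assumed names: the Backend struct, the "gas" and "plan9" keys, and genfunchdr are hypothetical and do not reflect the actual interfaces of gen.c, gengas.c, or genp9.c. The two header formats are the ones shown in the example output earlier.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* One entry per assembly syntax (hypothetical interface). */
typedef struct {
    const char *syntax;
    int (*funchdr)(char *out, size_t sz, const char *sym);
} Backend;

/* AT&T style: a section directive followed by a label. */
static int gas_funchdr(char *out, size_t sz, const char *sym)
{
    return snprintf(out, sz, ".text\n%s:\n", sym);
}

/* Plan 9 style: a TEXT directive naming the symbol and frame size. */
static int p9_funchdr(char *out, size_t sz, const char *sym)
{
    return snprintf(out, sz, "TEXT %s<>+0(SB),$0\n", sym);
}

static const Backend backends[] = {
    {"gas", gas_funchdr},
    {"plan9", p9_funchdr},
};

/*
 * Driver: look up the backend for `syntax` and let it format the
 * function header. Returns 0 on success, -1 if the syntax is unknown
 * or the output was truncated.
 */
int genfunchdr(const char *syntax, char *out, size_t sz, const char *sym)
{
    assert(syntax != NULL && out != NULL && sym != NULL);
    for (size_t i = 0; i < sizeof backends / sizeof backends[0]; i++) {
        if (strcmp(backends[i].syntax, syntax) == 0) {
            int n = backends[i].funchdr(out, sz, sym);
            return (n < 0 || (size_t)n >= sz) ? -1 : 0;
        }
    }
    return -1;
}
```

The driver never formats assembly itself. Adding another syntax would mean adding one table entry and one file of formatting functions.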

Eventually, this will go away when Myrddin moves to its own cross compiling toolchain, but that's the state of the world right now.

The Startup Code

At this point, the compiler could generate code in the format that's needed. The assembler accepted the .s files that the Myrddin compiler generated, and converted them to object files. Unlike other systems, the object files have the architecture number as the extension (in the case of amd64, obj.6), to allow cross-platform builds. However, if you try to link them into a binary, the linker will complain about the entry point (main) not being defined.

So, we need to define it. The first thing the code needs to do is stash away the information that we will need later in the system run.

First off,

TEXT    _main(SB), 1, $(2*8+NPRIVATES*8)
    MOVQ    AX, sys$tosptr(SB)
    LEAQ    16(SP), AX
    MOVQ    AX, _privates(SB)
    MOVL    $NPRIVATES, _nprivates(SB)

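This code stores AX (%rax) into the variable named sys$tosptr. It then takes the address 16 bytes above the stack pointer, which is where the NPRIVATES slots reserved in the stack frame begin, and stores it in _privates, and it stores the slot count in _nprivates.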

The System Calls

The Libraries

Debugging


The Code

References