OCaml for efficient virtual machines
Subject: OCaml for efficient virtual machines
Newsgroups: gmane.comp.lang.ocaml.beginners
Date: 2008-11-01 15:21:40 GMT
A friend and I are currently evaluating OCaml as the implementation
language for a virtual machine. The interpreted language is yet
undecided, but we expect it to be some intermediate bytecode for a
simple language such as Scheme or Self.
Currently we have a few issues and would greatly appreciate any
pointers or comments on them.
* Dynamically Compiled Code.
We need to dynamically compile code in the running VM. Our current
attack on this is to use a native ia32 emitter in OCaml and pass
the emitted code to a C function that can then start
executing it. We have looked at arrays and buffers as byte containers,
but the fact that OCaml tags non-pointers in these containers
makes this method cumbersome. We would need to remove the code tags in
C before executing it. Also, if we do not undo the removal of tags we
would hinder the garbage collector from collecting some unused
objects. Now if OCaml has some collection that can contain raw bytes
and is garbage collection aware we could use that. Something like
Self's byte array. We have not been able to find a reference to such a
container in OCaml.
* Threaded Code.
We would like our bytecode interpreter to be as fast as possible in
order to reduce the amount of compiling that needs to be done. A
common technique is to use a ``threaded code'' interpreter. Here every
operation is given by a label and the current operation can jump
directly to the following operation. This can greatly reduce the
amount of miss-predictions in the branch prediction unit.
We can implement this with a (named or unnamed) function for each
opcode and in each of them the invocation of the next opcode is in
tail-position.
A simple interpreter:
let op0 r pc ops = r;;
let op1 r pc ops = ops.(pc) (r+1) (pc+1) ops;;
let op2 r pc ops = ops.(pc) (r-1) (pc+1) ops;;
let interp ops = ops.(0) 0 1 ops;;
We have omitted the appropriate function to create the initial opcode
array and the code requires the -rectypes in order to compile. This compiles
to near optimal code and all parameters reside in registers. However,
the tail calls are indirect jumps that must first go through
`caml_apply3'.
Assembly output for op0 and op1:
op0:
ret
op1:
movl -2(%ecx, %ebx, 2), %edx
addl $2, %ebx
addl $2, %eax
jmp caml_apply3
# movl 8(%edx), %esi
# jmp *%esi
Here we have noted in the comments what we would have liked to see.
Why OCaml does not do this is still not clear to us. In the code for
caml_apply3 there is a check on some property of the invoked
function. It seems that under some situations the tail call can result
in a call instead of a jump. This puzzles us a bit as the fact that
the application is in tail position of the caller should be enough, no?
We would really like to understand why `caml_apply3' works this way.
If we perform our optimization by hand we beat the C version of the
same interpreter compiled with GCC (with labels-as-values extension).
The most efficient interpreter we have been able to produce directly
with OCaml is the defunctionalized version of the above.
Unfortunately, that again causes all operations to jump to the same
`apply' method dispatching over an ADT of operations. This is however
still faster than the double dispatch incurred by `caml_apply3'.
Thank you for taking the time to read this post. We look forward to
any remarks.
Kind regards, Ian and Thomas.
Assembly of caml_apply3:
caml_apply3:
subl $8, %esp
movl 4(%edx), %esi
cmpl $7, %esi # what are we checking?
jne .L447
movl 8(%edx), %esi
addl $8, %esp
jmp *%esi # we always end here
.align 16
.L447:
movl %ecx, 4(%esp)
movl %ebx, 0(%esp)
movl (%edx), %ecx
movl %edx, %ebx
call *%ecx
.L449:
movl %eax, %ebx
movl (%ebx), %ecx
movl 0(%esp), %eax
call *%ecx
.L450:
movl %eax, %ebx
movl (%ebx), %ecx
movl 4(%esp), %eax
addl $8, %esp
jmp *%ecx
Change settings via the Web (Yahoo! ID required)
Change settings via email: Switch delivery to Daily Digest | Switch format to Traditional
Visit Your Group | Yahoo! Groups Terms of Use | Unsubscribe
__,_._,___
RSS Feed