Wednesday, November 15, 2006

Dynamically optimizing dynamic dispatch

Robert O'Callahan just explained to me the basic optimization approach for dynamically typed languages that I've been hearing about lately. It's the approach used in the Strongtalk VM, recently mentioned by Avi Bryant and Gilad Bracha. At least, I think that's what they were talking about, but I didn't understand it until Robert explained it to me.

Performing a dynamic dispatch compiles into a branch deciding which class's method to invoke:
if (class == C) {
C::foo();
}
else if (class == B) {
B::foo();
}
else ...
With some runtime profiling, you can figure out which branch is most likely to occur and dynamically rewrite the branch so that the common case comes first in the branch.
if (class == B) {
B::foo();
}
else if (class == C) {
C::foo();
}
else ...
This then feeds into the hardware branch prediction.

With branch prediction, the hardware will keep some small history to help it predict which way to take. But if the particular branch in question is not in the history (say, because it's part of a loop with a large body whose branch history is bigger than the hardware branch history table), a conditional branch is an either-or; the branch predictor will still pick one. So as long as you structure the conditional so that the branch predictor will always guess the common path, even absent any branch history, it will still predict correctly in the common case.

This is something that you can't do with a vtable, because branch history tables are smaller for computed jumps (since each entry is a pointer rather than a single branch-taken/branch-not-taken bit), and because the hardware predictor can't guess right when it doesn't have a history.

You can then do even better by actually inlining the method body directly into the branching code, so it looks like
if (class == B) {
// B::foo method body
}
else ...
and then even better, by merging several inlined methods in sequence. With some simple dataflow analysis on locals (between mutations), you can eliminate redundant checks. That is, code that looks like:
if (x.class == B) {
// B::foo method body
}
else ...
if (x.class == B) {
// B::bar method body
}
else ...
gets optimized to:
if (x.class == B) {
// B::foo method body
// B::bar method body
}
else ...

1 comment:

Karsten Wagner said...

I think the idea was mentioned first in:

"Efficient Dynamic Dispatch without Virtual Function Tables. The SmallEiffel Compiler." Olivier ZENDRA, Dominique COLNET, Suzanne COLLIN.
12th Annual ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA'97),
Volume 32, Issue 10 - Atlanta, GA, USA, October 1997, pages 125-141.

The paper is downloadable from the SmallEiffel homepage.