PowerPC Kernel Binary Interface
page maintained by Joshua LeVasseur (jtl@ira.uka.de)
The Problem
The kernel is compiled with gcc, and gcc supports a few standards for
inter-function calling conventions. Considering the goals of a microkernel
environment, L4Ka::Pistachio uses a modified SVR4 embedded
ABI. The kernel deviates from the embedded SVR4 ABI in how it divides
registers between callee-saved and caller-saved, a division that is
adjusted via gcc command-line switches.
One issue remains in the SVR4 ABI, however, and it severely impacts the
performance of the kernel's main code paths: the kernel abstracts its data
types as bitfields and classes, and the ABI passes such objects through
memory rather than in registers.
The Solution
A patch to gcc has been developed internally, but isn't yet ready for
release.
Details
Rather than rewrite a description of the problem, I quote my post to
the freebsd-ppc list (refer to the list archives for the entire thread):
From: Joshua LeVasseur
Date: Sat Aug 17, 2002 10:29:44 PM Europe/Berlin
To: freebsd-ppc@freebsd.org
Subject: freebsd-ppc: gcc's SysV ABI and parameter passing
While analyzing the code generated by gcc, I noticed that gcc implements
quite a literal interpretation of the System V ABI for parameter
passing.
From the spec:
"A struct, union, or long double, any of which shall be treated as a
pointer to the object, or to a copy of the object where necessary to
enforce call-by-value semantics. Only if the caller can ascertain that
the object is "constant" can it pass a pointer to the object itself."
Some example code, which declares a union for representing a 32-bit
bit-field (common for kernel code).
-----------------------
class simple_t {
public:
    union {
        unsigned raw;
        struct {
            unsigned yoda : 16;
            unsigned vader : 16;
        } x;
    };
};

int add( simple_t a, simple_t b )
{
    return a.raw + b.raw;
}

int main( void )
{
    simple_t a, b;
    a.raw = 1;
    b.raw = 2;
    return add( a, b );
}
----------------------
gcc, using the SysV ABI, will generate the following code:
add:
    lwz r0,0(r3)
    lwz r3,0(r4)
    add r3,r0,r3
    blr
main:
    li r0,1
    li r9,2
    stw r0,8(r1)
    stw r9,12(r1)
    addi r3,r1,8
    addi r4,r1,12
    bl add
Notice how gcc writes the values to the stack before calling add(), and
then loads them off the stack in the add() function. Rather than
passing them as 32-bit parameters.
Now inspect the code generated by an alternative ABI (I use a modified
eabi to generate tight code). This code is also generated by Apple's
MachO ABI, and the AIX ABI (although I lack access to an AIX box).
add:
    add r3,r3,r4
    blr
main:
    li r3,1
    li r4,2
    b add
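
As a rough source-level picture of the two listings in the post, the sketch below writes out by hand what each convention amounts to. It is an illustration only, not code from the kernel or from the gcc patch. The names fields_t, add_by_address, add_by_word, and check_equivalence are invented for this sketch, and the bitfield struct is given a name here so that the example stays within standard C++.

```cpp
#include <cassert>

// Layout-equivalent to the simple_t in the post: one 32-bit word, viewed
// either raw or as two 16-bit fields. fields_t is a name chosen for this
// sketch; the post uses an unnamed struct.
struct fields_t {
    unsigned yoda  : 16;
    unsigned vader : 16;
};

class simple_t {
public:
    union {
        unsigned raw;
        fields_t x;
    };
};

// The wrapper adds no storage beyond the raw word.
static_assert(sizeof(simple_t) == sizeof(unsigned),
              "simple_t should occupy one word");

// Hypothetical stand-in for the SysV convention: each object lives in
// memory and the callee receives its address, so it must load the value.
int add_by_address(const simple_t *a, const simple_t *b)
{
    return a->raw + b->raw;
}

// Hypothetical stand-in for the alternative ABIs: each object travels as
// a plain word, which PowerPC passes in r3 and r4.
int add_by_word(unsigned a, unsigned b)
{
    return a + b;
}

// Both forms compute the same result; only the parameter passing differs.
void check_equivalence()
{
    simple_t a, b;
    a.raw = 1;
    b.raw = 2;
    assert(add_by_address(&a, &b) == add_by_word(a.raw, b.raw));
}
```

Calling check_equivalence() exercises both forms with the values from the post. The first form needs the stores and loads seen in the SysV listing, while the second needs neither.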