The advantages in modularity and power of microkernel-based operating systems
such as Mach 3.0 are well known. The existing performance problems of these
systems, however, are significant. Much of the performance degradation is due
to the cost of maintaining separate protection domains, traversing software
layers, and using a semantically rich inter-process communication mechanism.
An approach that optimizes the common case is to permit merging of protection
domains in performance-critical applications, while maintaining protection
boundaries for debugging or in situations that demand robustness. In our
system, client calls to the server are effectively bound either to a simple
system call interface, or to a full RPC mechanism, depending on the server's
location. The optimization reduces argument copies, as well as work done in
the control path to handle complex and infrequently encountered message types.
In this paper we present a general method for merging protection domains in Mach 3.0 and the
results of applying it to the Mach microkernel and the OSF/1 single server. We
describe the necessary modifications to the kernel, the single server, and the
RPC stub generator. Semantic equivalence, backwards compatibility, and common
source and binary code are preserved. We report performance on micro- and
macro-benchmarks: RPC performance improves by a factor of three, Unix system
calls to the server improve by between 20% and a factor of two, and large
benchmarks improve by 4-13%. A breakdown of the times on the
RPC path is also presented.
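
The location-dependent binding described above can be illustrated with a small sketch. This is not the paper's implementation or any Mach interface: every name here (`bind_add`, `rpc_path_add`, `trap_path_add`, `struct rpc_msg`, `arg_copies`) is hypothetical, and the "RPC" is simulated by copying a message between two local buffers. It only shows the idea that a client stub can be bound once to either a short direct path or a full message path, with both paths returning the same result.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical: where the server lives relative to the kernel. */
enum server_location { SERVER_COLLOCATED, SERVER_SEPARATE };

enum { OP_ADD = 1 };

/* Hypothetical message layout used only by the full RPC path. */
struct rpc_msg {
    int op;
    int arg_a;
    int arg_b;
    int reply;
};

/* Illustration only: counts how many times arguments are copied. */
static int arg_copies;

/* The server-side work, shared by both paths. */
static int server_add(int a, int b)
{
    return a + b;
}

/* Full path: marshal arguments into a message, copy it to the
 * "server" buffer, dispatch on the operation, copy the reply back. */
static int rpc_path_add(int a, int b)
{
    struct rpc_msg client_msg = { OP_ADD, a, b, 0 };
    struct rpc_msg server_msg;

    memcpy(&server_msg, &client_msg, sizeof server_msg);
    arg_copies++;

    assert(server_msg.op == OP_ADD);
    server_msg.reply = server_add(server_msg.arg_a, server_msg.arg_b);

    memcpy(&client_msg, &server_msg, sizeof client_msg);
    arg_copies++;

    return client_msg.reply;
}

/* Short path: the server shares the protection domain, so the stub
 * calls it directly without building or copying a message. */
static int trap_path_add(int a, int b)
{
    return server_add(a, b);
}

typedef int (*add_fn)(int, int);

/* Bind the client stub once, based on the server's location. */
static add_fn bind_add(enum server_location loc)
{
    return loc == SERVER_COLLOCATED ? trap_path_add : rpc_path_add;
}
```

A caller would obtain a stub with `bind_add(SERVER_COLLOCATED)` or `bind_add(SERVER_SEPARATE)` and invoke it the same way in either case. In this sketch the short path performs no message copies while the full path performs two, which mirrors, in a very simplified way, the reduction in argument copies that the abstract attributes to the optimization.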