[!CrackMonkey!] The Microkernel Scam

Miles Nordin carton at Ivy.NET
Sat May 25 08:49:31 PDT 2002


>>>>> "dm" == Don Marti <dmarti at zgp.org> writes:

    dm> Doesn't it have a microkernel or something to make it all
    dm> microkernel-studly and shit?

great.  okay, here we go.

Microkernels are mostly discredited now because they have performance
problems, and the benefits that they originally promised are a
fantasy.

The microkernel idea is that the kernel should be several processes
isolated from each other with memory protection, while monolithic
kernels circumscribe and implement the kernel as ``the part of the
system which would not benefit from memory protection.'' 

Microkernel advocates come to Unix from Windows 3.1 and think memory
protection is an abstract, unquestionable ``good.''  It's sort of like
``more security,'' as if security were a 1-dimensional concept.

There are three common reasons for memory protection:

 1. to help debug programs under development with less performance
    cost than instrumentation.  (instrumentation is what Java or
    Purify uses) The memory protection hopefully makes the program
    crash nearer to the bug than without it, while instrumentation
    is supposed to make the program crash right at the bug.

 2. to minimize the inconvenience of program crashes.

 3. to keep security promises even when programs crash.

``Because MS-DOS doesn't have it and MS-DOS sucks'' is not a reason
for memory protection.
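
To make the difference in reason (1) concrete, here is a toy sketch in Python.  Nothing in it comes from a real kernel: ToyMemory, PageFault, unchecked_store and checked_store are made-up names, and the 4096-byte page is an assumption.  The point is only that page-granular protection notices a wild store when it leaves the mapped pages, while per-access instrumentation notices it at the faulty index.

```python
PAGE_SIZE = 4096  # assumed page size for the toy model


class PageFault(Exception):
    """Raised by the toy MMU when an access hits an unmapped page."""


class ToyMemory:
    """Flat byte memory in which only whole pages can be protected."""

    def __init__(self, mapped_pages):
        self.mapped_pages = set(mapped_pages)
        self.cells = {}

    def _check(self, addr):
        # Memory protection knows about pages, not about objects.
        if addr // PAGE_SIZE not in self.mapped_pages:
            raise PageFault(f"access to unmapped address {addr:#x}")

    def store(self, addr, value):
        self._check(addr)
        self.cells[addr] = value

    def load(self, addr):
        self._check(addr)
        return self.cells.get(addr, 0)


def unchecked_store(mem, base, index, value):
    """No bounds check: only the (toy) MMU can stop a bad store."""
    mem.store(base + index, value)


def checked_store(mem, base, length, index, value):
    """Instrumentation: every access is checked against the object's bounds."""
    if not 0 <= index < length:
        raise IndexError(f"index {index} outside buffer of length {length}")
    mem.store(base + index, value)


if __name__ == "__main__":
    mem = ToyMemory(mapped_pages={0})
    mem.store(16, 42)               # a neighbour right after a 16-byte buffer
    unchecked_store(mem, 0, 16, 0)  # off by one: same page, so no fault
    print("neighbour is now", mem.load(16))
    try:
        checked_store(mem, 0, 16, 16, 0)
    except IndexError as exc:
        print("instrumentation caught it:", exc)
```

In the toy run the off-by-one store silently clobbers the neighbour, and the program would only ``crash'' later, once a store strays onto an unmapped page.  The checked version stops at the bug itself, at the price of a test on every access.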

Reason (1) is somewhat legit for microkernels.  On QNX you can debug
device drivers and programs with the same debugger, and this makes QNX
drivers easier to write.  QNX programmers are neat because drivers are
so easy for them to write that they don't seem to have the same idea
of what a ``driver'' is that we do---they think everything that does
any abstraction of hardware is a driver.  I think the good debugging
tools for device drivers are why QNX is still a commercially viable
microkernel.  Their bogon claims about ``stability'' become suspicious
once you actually start working---it's all about ease of development
and debugging.

But (2) is silly.  A real microkernel in the field will not recover
itself when the SCSI driver process or the filesystem process crashes.
Granted, if there's a developer at the helm who can give it a shove
with some special debugging tool, it might, but that advantage is
really more like (1) than (2).

Since microkernel processes cooperate to implement security promises,
the promises are not necessarily kept when one of the processes
crashes.  Therefore (3) is also silly.

All these factors together mean that memory protection is not very
useful inside the kernel, except maybe for kernel developers.

The performance problem is that modern CPUs optimize for the
monolithic kernel.  The monolithic kernel is mapped into every user
process's virtual memory space, but its page table entries are marked
so that the kernel pages are only accessible when the supervisor bit
is set.  When
a process makes a system call, the supervisor bit is set and unset
when the call enters and returns, so the kernel pages are lit up and
walled off by flipping a single bit.  Since the virtual memory map
doesn't change during the system call, the processor can retain all
the map fragments that it has cached in its TLB.
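
A toy model of that arrangement, as a sketch only.  ToyCPU and its fields are invented for illustration and do not mirror any real CPU.  One page table holds both user and kernel pages, each entry carries a supervisor-only flag, and a system call does nothing to the map or the TLB except flip the mode bit.

```python
class ToyCPU:
    """One address space, one supervisor bit, one translation cache."""

    def __init__(self, page_table):
        # page_table: virtual page -> (physical frame, supervisor_only)
        self.page_table = dict(page_table)
        self.tlb = {}            # cached translations
        self.tlb_misses = 0
        self.supervisor = False  # the single mode bit

    def translate(self, vpage):
        if vpage not in self.tlb:
            # TLB miss: consult the (slow) page table and cache the entry.
            self.tlb_misses += 1
            self.tlb[vpage] = self.page_table[vpage]
        frame, supervisor_only = self.tlb[vpage]
        if supervisor_only and not self.supervisor:
            raise PermissionError(f"user access to kernel page {vpage:#x}")
        return frame

    def syscall(self, handler):
        # Entering the kernel flips one bit; the map and the TLB are untouched.
        self.supervisor = True
        try:
            return handler(self)
        finally:
            self.supervisor = False


if __name__ == "__main__":
    USER_PAGE, KERNEL_PAGE = 0x1, 0x300
    cpu = ToyCPU({USER_PAGE: (10, False), KERNEL_PAGE: (2, True)})
    cpu.translate(USER_PAGE)                         # first touch: a miss
    cpu.syscall(lambda c: c.translate(KERNEL_PAGE))  # kernel page: a miss
    cpu.translate(USER_PAGE)                         # still cached after the call
    print("TLB misses:", cpu.tlb_misses)
```

The user page's translation survives the system call, which is the property the paragraph above relies on.  Touching the kernel page outside of syscall() raises PermissionError in this model.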

With a microkernel, almost everything that used to be a system call
becomes passing a message to another process.  This means that
flipping a supervisor bit is not enough to implement the memory
protection, since you have 1 user process + n kernel processes, but
only two states to a single bit.  Instead, the microkernel must switch
the virtual memory map at least twice for every system call
equivalent---once from the user process to the system process, and
once again from the system process back to the user process.  This
requires more overhead than flipping a supervisor bit.  There's more
overhead just to juggle the maps, and then there are also two TLB
flushes.
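
A rough way to feel the difference on a Unix box is sketched below.  It is not a microkernel benchmark.  It assumes a POSIX system with fork(), it uses a pair of pipes as a stand-in for message passing to a server process, and the function names are made up.  Python's own overhead inflates both numbers, and on a multiprocessor the two processes may sit on different CPUs, so read the output as an indication at most.

```python
import os
import time


def time_syscalls(n):
    """Average seconds per trivial system call (getppid) over n calls."""
    start = time.perf_counter()
    for _ in range(n):
        os.getppid()
    return (time.perf_counter() - start) / n


def time_round_trips(n):
    """Average seconds per request/reply exchange with a child process."""
    req_r, req_w = os.pipe()   # client -> server
    rep_r, rep_w = os.pipe()   # server -> client
    pid = os.fork()
    if pid == 0:
        # Child: a trivial "server" that echoes one byte per request.
        os.close(req_w)
        os.close(rep_r)
        try:
            while True:
                msg = os.read(req_r, 1)
                if not msg:        # client closed its end: shut down
                    break
                os.write(rep_w, msg)
        finally:
            os._exit(0)            # never fall back into the caller's code
    # Parent: the "client".
    os.close(req_r)
    os.close(rep_w)
    start = time.perf_counter()
    for _ in range(n):
        os.write(req_w, b"x")      # send the request
        os.read(rep_r, 1)          # block until the server replies
    elapsed = time.perf_counter() - start
    os.close(req_w)
    os.close(rep_r)
    os.waitpid(pid, 0)
    return elapsed / n


if __name__ == "__main__":
    n = 20000
    print(f"plain system call : {time_syscalls(n) * 1e6:8.2f} us")
    print(f"process round trip: {time_round_trips(n) * 1e6:8.2f} us")
```

Each round trip makes the client block and the server run, then the reverse, which is the same shape as the two map switches described above.  The plain getppid() call never leaves the calling process's address space.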

Another microkernel problem is with the ``zero copy'' concept.  The
idea here is that you should copy around memory as little as possible.
If an application wants to read a file into memory, ideally you would
like the application to mmap(..) the file, and have the disk
controller's DMA engine write the file's contents directly into the
same physical memory that the application will read.  Obviously it
takes some cleverness to do this, but memory protection is one of the
main obstacles.  You'll see comments all over the kernel about how
something has to be ``copied out to userspace.''  Microkernels make
the problem of copying much worse, because there are more memory
protection barriers to copy across, and because data has to be copied
into and out of the formatted messages that they pass around.  I guess
you could introduce shared memory as another way for microkernel
processes to communicate, and somehow bring back zero-copy through a
formal memory-protected framework, but I don't think this is happening
so far.
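
From the application's side, the two styles look roughly like this in Python.  This is a sketch under assumptions: the helper names are invented, and whether the disk controller really DMAs straight into the mapped pages is the kernel's business, which this code cannot show.  What it does show is that read() hands back a fresh user-space copy, while mmap lets the process look at the file's pages through its own mapping without that extra copy.

```python
import mmap
import os
import tempfile


def sum_with_read(path):
    """read() path: file data is copied out into a new user-space buffer."""
    with open(path, "rb") as f:
        data = f.read()
    return sum(data)


def sum_with_mmap(path):
    """mmap path: the process reads the file's pages through a mapping."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file; an empty file raises ValueError.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            # memoryview exposes the mapping without building a bytes copy.
            with memoryview(m) as view:
                return sum(view)


def make_sample(payload):
    """Write payload to a temporary file and return its path."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(payload)
    return path


if __name__ == "__main__":
    path = make_sample(bytes(range(256)) * 64)
    try:
        print(sum_with_read(path), sum_with_mmap(path))
    finally:
        os.remove(path)
```

Both functions compute the same byte sum; they differ only in how many times the data is copied on the way to the application.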

NetBSD's UVM is Chuck Cranor's rewrite of virtual memory under the
``zero copy'' philosophy.  He claims a 20% speed improvement in some
situations.  I don't know what that means, but you could make a
histogram of it.  Some of his speed improvement no doubt comes from
cleaner code, but the bulk of his Ph.D. thesis is that he's saving
processor cycles by doing fewer bulk copies.

VxWorks has bragged about zero-copy the longest, with its TCP stack.
They probably care mostly about reduced memory footprint, but it
should also be faster than a traditional TCP stack.  VxWorks has NO
memory protection.

BeOS implements the TCP stack in a user process, microkernel-style, as
does QNX.  Both have notoriously slow TCP stacks.

There are some other things about microkernels that I don't
understand, like how does a microkernel handle file access with
mmap(..)?  How do microkernels avoid swap deadlocks?  How do QNX
drivers service interrupts---do they really switch to a new virtual
memory map while interrupts are still blocked?

    dm> "True, linux is monolithic, and I agree that microkernels are
    dm> nicer." -- Linus Torvalds

This is because Linus does not understand threads.  Linux's POSIX
threads are still implemented as ``foolish threads,'' aping the
architecture employed by Windows NT, while everyone else is moving on
to the better performance of scheduler activations or Masuda-Inohara
``unstable threads.''  Rather than budgeting CPU features like context
switching and virtual memory carefully for maximal returns, he's
trying to hoard every feechur that sounds cool while shunning every
feechur that he doesn't understand.

Before I sign off, I should point out that Mach and QNX have different
ideas about what is micro enough to go into the microkernel.  In QNX,
only message passing, context switching, and a few process scheduling
hooks go into the microkernel.  Drivers for the disk, the console, the
network card, all the hardware devices, are ordinary processes that
show up next to the user's programs in 'sin' or 'ps', and that obey
'kill'.  If you want, you can kill them and crash the system.

Mach puts anything that accesses hardware into the microkernel.  So,
under Mach, XFree86 STILL shouldn't be a user process.  The ``single
server'' abuses of microkernels like mkLinux meant that Linux made a
system call (not message passing) into Mach whenever it needed to
access any Apple hardware.  I think that's why Apple supported
mkLinux---they could hoard all the drivers for their proprietary
hardware, all the work _they_ were funding, inside Mach where it was
under a more favorable license.  Monolithic Linux/ppc thus had to redo
all their work.  I don't know that they actually reused any of the
mkLinux code in Darwin.  It's an interesting possibility, but I think
the sets of hardware supported by mkLinux and Darwin are disjoint.
Anyway, in QNX, device drivers do not go into the microkernel like
that.

-- 
commerce could only take place under the umbrella of a temporary
framework erected by powerful individuals and their gangs.  Contracts
were personal and the gift economy took its most sadistic form ("He
made me an offer I couldn't refuse").  -- Keith Hart



