[SGVLUG] Microsoft Reports OSS Unix Beats Windows XP

Dustin laurence at alice.caltech.edu
Sun Nov 13 18:25:56 PST 2005


On Sun, 13 Nov 2005, Chris Smith wrote:

> On 11/12/05, Dustin <laurence at alice.caltech.edu> wrote:

> > The OS kernel strikes me as the worst place ever for GC.  I'm not too
> > worried about that becoming a production system.
> 
> Actually, I'd argue quite the opposite. The Linux kernel is rife with
> reference counted data structures, which is essentially a poor man's
> GC. If nothing else using GC would improve the ability for kernel
> operations to run concurrently, probably help reduce fragmentation of
> kernel space, and might even allow for faster amortized times for
> memory management.

Well, OK, I shouldn't make that such a blanket statement.  I was getting 
at one particular thing, not so much your points.  But they're interesting 
points, so let's talk about them.

GC is a topic of some interest to me, since I was raised by anti-GC
fundamentalists and later encountered pro-GC fundamentalists.  It's hard 
to find useful data, but I have some and it doesn't reassure me greatly.

The last sentence is key.  Yes, you can get higher throughput with "real"  
GC (actually, I think ref counting is as "real" as any other method,
unless reaping cycles is fundamental to your definition of GC reality),
and *maybe* on a webserver (or anywhere else where transactions per second
is the figure of merit, not responsiveness) it might make sense.

But...the memory overhead of GC is startling, and downplayed by GC bigots
(just like the time and space overhead of malloc is downplayed by anti-GC
bigots). Every time I've seen the calculation that says GC is faster than
manual allocation it is done in the infinite memory limit.  Further, I am
not aware of any "real"  collector that doesn't stomp all over the heap,
and such calculations also don't take into account either cache misses or
paging, so the extra memory that is required to start reaping the benefits
of that calculation is larger than one might think.  All told, it seems to
take a factor of *several* to get that performance.  Under memory
pressure, the swapping gets worse faster.

Is it really true that the vast majority of servers are so memory rich as 
to make that trade-off a no-brainer?  I'd guess not.

Is it a good idea to make the *kernel*, of all things, a memory hog and/or
more vulnerable to memory pressure?  In userland you have a lot more 
choices about what performance trade-offs work for your application, but 
you're *going* to be running a kernel no matter what, so you have less 
flexibility (an argument that, logically, would point toward a GC/no-GC 
compile-time option, an interesting idea).

I make the same argument about Java on the server, BTW--I'd have thought 
that the vulnerability to memory pressure wouldn't be a good thing for a 
server.  But it's possible that I don't know what the real bottlenecks 
are.

For my own purposes, what I care about most in kernel performance is
responsiveness, and last I checked "real-time" and "garbage collection"  
didn't belong in the same sentence, and even making it tolerable requires
a great deal of sophistication.  Ref. counting, the whipping boy of the
pro-GC fundamentalists, is mostly inherently incremental (the major
exception being naive algorithms that will try to reap an entire list at
once when you release the head).  I am most often willing to trade total
throughput for predictable response, but that is of course purely a matter
of the jobs you want to do.  But I surely wouldn't like a kernel with
*mandatory* GC.

> ...Sure, GC has its downsides in kernel space as
> well, but there is reason to believe that if done right, it could be a
> good thing.

I could see providing a compile-time option to adjust the type of
collector, but I fear with such a choice you would get neither optimal
responsiveness nor optimal throughput.  I'd be very interested in a study 
to test it, but it would be a *ton* of work.

What I was really getting at was in response to the article and its 
typical fear of manual allocation.  The argument I was hinting at was the 
same one Linus made against microkernel architecture.  It's easier to just 
do the hard work to make sure the kernel is leak-free than to do the same 
for all of userland, and if you buy Linus' argument that the kernel is 
precisely the place where you should least make performance compromises, 
then you can only justify moving to GC on the throughput argument.  As I 
was arguing, I don't think that argument is so bulletproof.

GC is a big win for certain allocation patterns and not so much for 
others.  Is the kernel really full of places where manual allocation is an 
impossible burden?  (That's a serious question--I don't know about the 
typical patterns in the kernel.  In fact, all of the questions I posed are 
serious questions.)

Dustin
