If you do a Google search for "George Hoffman," the whole first page of hits is for some "Captain George Hoffman" from New Jersey who not only is no relation to me but is also dead. If you just search for "Hoffman," I'm not even in the first hundred hits (I didn't look any farther). If you search for "geh," there are a bunch of hits for the George Eastman House, followed by a lot of stuff that must be total garbage because *it* isn't even in *Engish*.
In fact, the only Google search I know of for me in which I actually appear on the first page of hits is for "George Earle Hoffman" (my full, unfortunate name). This turns up two references to two separately maintained lists of GNUStep volunteers. I'm listed as working on a GNUStep test suite, which must be the result of an e-mail question I sent to the project maintainer in 1994. Let this be a warning to you all: Open Source is catching. It could happen to you!
What's my point? I recently decided that there simply isn't enough with my name on it. In the grand scheme of things, I've done nothing with my life. My existence is meaningless. Compared to the likes of Stephen Hawking, Ani DiFranco, or Regis Philbin, I am as an ant to the mighty cockroach. Now, this is a realization that everyone (except maybe The Regis) comes to at one point or another in their lives. Some people deal with it gracefully, and some people grab automatic weapons and head for the nearest clock tower.
How did I deal with this personal crisis? Well, I'd like to say that I rose to the challenge and adopted a Cause: steadfastly dedicated my life to preserving the rain forests, or feeding starving children, or freeing Kevin Mitnick, or organizing sit-ins or walk-outs or walkathons, or saving baby seals, or clubbing baby seals, or being outraged at people who have genitals or who use them in a certain way, or one of those other things that somehow always results in the blocking off of perfectly good roads which happen to lead where I want to go. But what I actually did was move to Millbrae, buy a red sports car, and start dating a girl who is far too young for me.
I could have sworn there was a point to all that. Oh, yes! I also wrote some code.
You see, you may think that what Be engineers are good at is writing an operating system. This is an imprecise statement. You may think that we are good at doing things the "right" way. This, while arguably true, is also not quite right. What Be engineers are best at is taking some problem which is laughably easy to solve and making it complicated enough to interest us. It's not enough, for example, for us to be able to play a movie. We have to be able to play 30 of them, in parallel, on eight CPUs, while surfing the Web and playing Quake II. (You may have heard this referred to in our marketing literature as "sexy"... we engineers know that's just marketing speak for "butt-ass complicated.") Who buys an eight-processor machine and then watches 30 movies on it all at the same time? Beats me. They told us they could sell it, so we made it.
You may have heard that recently they told us they can sell it even better if it's smaller and can browse the Web really, *really* well. Web browsers are complicated beasts, but not nearly complicated enough for us. Our browser has to be totally ripped. So a number of us have been working on the Next Generation BeOS Browser, based on the latest Opera Browser technology, and oh man, it doesn't get much more "sexy" than this.
One thing the astute engineer will notice about the Web is that it is fairly large. There is thus a lot of data that can be attained by examining it. Furthermore, this data comes in small packets: things like GIF images, HTML pages, style sheets, Flash movies, etc. For the traditional monolithic browser, this kind of data is not too hard to manage, but for a highly multithreaded browser like the one we're creating, there arises a question of locking granularity. If I'm accessing a given resource, say, a GIF image, in one thread of a multithreaded application, how do I control access to that resource? I need a way to guarantee that there will be no destructive interference between two threads that desire access to a resource.
I could use semaphores, of course. Let's say I decide to give each resource a semaphore, and use it to mutex access to the resource. There are two problems with this approach: one, if access to the resource happens often, I may take a sizable performance hit, because I need to enter the kernel twice for every pass through the critical section. Second, I use up a semaphore, a kernel object, for each and every resource. Now, I could switch to the benaphore, a locking primitive invented by our own Benoit Schillings, which has been discussed in this newsletter many times. Benaphores use an extra atomic variable to eliminate the need for entering the kernel in low-contention situations.
This does much to help solve the first problem, especially as contention will be low for this type of fine-grain locking, but the second issue is still just as bad. Solving this second problem is of great concern when dealing with the kind of small devices to which we're expanding our platform support, as RAM is a valuable and scarce resource on these devices. Each semaphore takes up 33 bytes at a minimum, plus whatever overallocation malloc makes for a structure of this size, plus four more bytes in the kernel for a pointer to the semaphore in the global semaphore list and another eight in user space to store the semaphore ID and the benaphore atomic variable. This adds up to around (depending on the behavior of malloc) 52 bytes for every benaphore. There's also a hard limit, set by the kernel, on the number of semaphores that can be allocated.
This is a locking primitive we're talking about here. Not a terribly exciting piece of code. But let's make it as complicated as possible: let's say I have a bunch of resources—oh, say, twenty thousand items -- and I need to be able to lock each and every one of them separately. Memory is precious. Contention is low. What's the ideal solution? In this situation, semaphores (or benaphores) aren't even an option without some changes to the kernel. I cleverly chose the figure twenty thousand for a reason: the BeOS kernel allows a system-wide maximum of 16384 semaphores in any configuration. Even if we could use benaphores, by our estimation the benaphores alone would consume almost a megabyte of RAM!
We could use some kind of many-to-one scheme where one semaphore is used to lock many resources. But then we may be locking things we don't need, and we may introduce dependencies between resources that we'd rather avoid. What we'd like is an ultra-lightweight locking primitive. It needs to take up the minimum amount of memory, perform well in the anticipated situation (i.e., low contention), and, above all, it needs to include my name somewhere in it.
Enter the gehnaphore! The gehnaphore is a locking primitive that uses
only a single 32-bit atomic variable. For synchronization, it uses the
fairly obscure send_data()
/receive_data()
API. These calls provide a
one-message-deep message queue which is associated with each thread. Some
people look at these calls and see a largely deprecated API which was
subsumed by the generalized port API. I, on the other hand, see an
opportunity to flex my monstrous ego.
Gehnaphores use the compare-and-swap operation to perform atomic updates to the 32-bit variable. This operation can easily and efficiently be implemented on multiprocessor PowerPC or Intel machines, and actually has its own instruction on the CISCy Pentium. The code I've included only has the Intel implementation, but the PowerPC version is equally trivial.
There are two trade offs to using gehnaphores. First, you probably should
not depend on being able to play nice with other code that uses
send_data()
/receive_data()
; this should not be a big concern in most
code. Second, and more importantly, there will be a performance hit under
contention: if there are N threads waiting for a gehnaphore when it is
released by the holding thread, there will be N context switches (or N-1
more than you might expect) before the next thread in line can access the
critical section. This is because threads in the queue must "chain" the
lock release to the first thread in line for the lock, which is actually
the *last* thread in the chain. Their performance is thus virtually
identical to benaphores for a one or two thread situation, but gets worse
and worse as the amount of contention goes up. This is a significant
enough reason to not use gehnaphores for general purpose locking that
might be subjected to high contention, but is acceptable under the
constraints we've been given.
Gehnaphores use only four bytes per lock, and require kernel calls only when they cause blocking. They thus have the performance advantages of benaphores (aside from the caveat listed above) but use one thirteenth of the memory. They also do not require a kernel call to create and destroy a semaphore (the 32-bit value need only be initialized correctly), or take up an entry in the kernel semaphore list. They are probably ideal for most ultra-fine-grain locking situations. Basically, they totally kick the benaphore's ass. Heh, heh, heh...
I've also written a multiple-reader/one-writer gehnaphore which uses a whopping sixteen bytes. For very fine-grained locking, something like this is probably overkill, but there might be situations where it would make sense to use them. I'll leave it as one of my famous "exercises for the reader" to implement this (you can put the solution in the same directory as all the code you wrote to extend QuickPaint™).
The code follows. You'll need to pull out the assembly snippet into a .S
assembly file and link with it in order to get the compare_and_swap32
function. Remember: changing the name of the class is a violation of the
licensing agreement. Yeah, I know you were planning on it.
#include <OS.h> #include <SupportDefs.h> /* # # int32 compare_and_swap32(int32 *location, int32 oldValue, int32 newValue); # .align 8 .globl compare_and_swap32 compare_and_swap32: pushl %ebx # Save these pushl %edi movl 12(%esp), %edi # Get location movl 16(%esp), %eax # Get old value movl 20(%esp), %ebx # Get new value lock cmpxchgl %ebx, (%edi) sete %cl # get success xorl %eax, %eax movb %cl, %al popl %edi popl %ebx ret */ classGehnaphore
{ private: int32m_lockValue
; public:Gehnaphore
() {m_lockValue
= -1; }; voidLock()
; voidUnlock()
; }; inline bool cmpxchg32(int32 *atom
, int32 *value
, int32newValue
) { int32success
= compare_and_swap32(atom
, *value
,newValue
); if (!success
) *value
= *atom
; returnsuccess
; }; void Gehnaphore::Lock() { int32chainedThread
=-1,myThread
=find_thread(NULL
),msg
=-1; boolsuccess
=false
; while (!success
) { while (!success
&& (chainedThread
== -1))success
= cmpxchg32(&m_lockValue
,&chainedThread
,0); while (!success
&& (chainedThread
!= -1))success
= cmpxchg32(&m_lockValue
,&chainedThread
,myThread
); }; do { if (chainedThread
!= -1)msg
= receive_data(NULL
,NULL
,0); if ((chainedThread
> 0) && (chainedThread
!=msg
)) send_data(chainedThread
,msg
,NULL
,0); } while ((chainedThread
> 0) && (chainedThread
!=msg
)); } void Gehnaphore::Unlock() { int32chainedThread
=0,myThread
=find_thread(NULL
); boolsuccess
;success
= cmpxchg32(&m_lockValue
,&chainedThread
,-1); if (!success
&& (chainedThread
==myThread
))success
= cmpxchg32(&m_lockValue
,&chainedThread
,-1); if (!success
) send_data(chainedThread
,myThread
,NULL
,0); }
Part 2 of this series is unfortunately not ready yet. We will publish no code before its time. Please stay tuned to this channel.
I do have one correction to last week's article. The recommended
device-naming scheme begins numbering at "1"; i.e., the CPiA driver
should publish /dev/video/usb/CPiA/1
for the first camera,
/dev/video/usb/CPiA/2
for the second, and so on.
While you're waiting for Part 2, I recommend reading and rereading the article at <http://www.b500.com/bepage/driver.html>. In my humble opinion, it's still the best general treatment of writing BeOS drivers there is. And I was paid nothing for that endorsement.
Also, if you have Maui, you can install the CPiA driver from Part 1 and
make it do a few tricks. To install the CPiA driver, put it in
/boot/home/config/add-ons/kernel/drivers/bin
, then make a link to it
from /boot/home/config/add-ons/kernel/drivers/dev/video/usb/
. In other
words, when you're done you should have something similar to
$ dir /boot/home/config/add-ons/kernel/drivers/dev/video/usb/ total 0 lrwxrwxrwx 1 tt users 0 Mar 22 04:51 CPiA -> /boot/home/config/add-ons/kernel/drivers/bin/CPiA*
P.S.: for Maui, I've added a rule named driverinstall to the
makefile-engine located in /boot/develop/etc
. If you're using the
makefile-engine, and you set DRIVER_PATH
in your makefile to the desired
path of your driver relative to /dev
,
you can type make driverinstall to
properly install your driver. For example, check out the CPiA makefile.
You'll see the line
DRIVER_PATH = video/usb
Once you've installed CPiA, try using BeOS's nifty rescan command on it when you have serial debugging turned on. Invoke it from the command line like so:
$ rescan CPiA
rescan causes the kernel to reload a driver. In so doing it calls several
of the driver's exported functions. With CPiA's copious serial debug
output, you should get a good idea of when and how these and other CPiA
functions are called. rescan takes the name (not the path) of the driver
binary as its parameter. You can also run the test harness (test.cpp
) to
see some of CPiA's hook functions being called.
If you don't have Maui, you can try rescanning whichever drivers you find
in /boot/beos/system/add-ons/kernel/drivers/bin
, but most shipping
drivers are not nearly as verbose as CPiA. Some are downright speechless.
Hot tip: did you know that you can look at serial debugging output even if you don't have anything hooked up to the serial port? The command tail shows you the end of a file, and if you use the -f option it continues showing lines as they are added to the file. If you turned on serial debugging at boot by hitting Delete or F1, you can type
tail -f /var/log/syslog
into a terminal to watch serial debug output as it happens. Cool!
Warm tip: check out the kernel settings file in
/boot/home/config/settings/kernel/drivers/sample/
. If you make it an
active settings file by moving it up to the drivers directory, you can
control the behavior of syslog and serial debug output without having to
remember which keys to hold down when.
As you can imagine, I get a good deal of e-mail. Real e-mail-- what's left after a CTRL-T on offers for descramblers; various life-enhancing regimens, substances, or surgery; get-rich schemes; spying or counterspying techniques. When we screw up, I hear about it, and I'm grateful. Strongly worded feedback gives us a chance to correct a specific situation, or to discover a more systemic problem. Silent spite can kill a business, so I view complaints as opportunities.
Other mail is congratulatory. Since we know we still have much to do and much to prove, encouragement along the way is appreciated. Such mail sometimes comes from unexpected places: recently we got mail from the Greek army. We also get suggestions, ranging from porting our OS to various processors to writing drivers for a hot card to helping revive the Amiga. These, too, are much appreciated. We don't agree with all of them; some we strongly disagree with, because we must focus our resources according to certain priorities.
In any event, we welcome the input for several reasons. For one, we are, of necessity, myopic, and having an outside perspective is healthy. Furthermore, how many good ideas does one really need per year? One or two, maybe. I know it's fashionable to complain about e-mail volume, but personally, I prefer to filter it myself and retain the portion that brings feedback and ideas.
With this in mind, I'll share part of a message I got from Steve Klingsporn on his Top Ten Reasons for liking BeOS (at some point, I'll get to his unabbreviated list). Steve worked at Be many moons ago and has kept a strong, informed interest in our work. Steve had this to say about the introduction of BeOS 5...
"If I worked for Be, I would do the following:
Introduce BeOS 5 with the free version amidst as much fanfare as possible.
Spend a little money on web and print and maybe radio advertising—plenty of volunteers available to help -- to get the word out.
Put out press releases touting the power in having the same operating system on recent PCs and on new types of devices/IAs.
Follow up strong on the BeOS 5 free release with the release of BeOS Pro and BONE/OpenGL/Java/etc., two to four months thereafter.
Continue pursuit of BeIA adoptees. Move BeIA technologies to BeOS where appropriate and have common/new APIs.
Tie the desktop and device versions together in a really data-compatible way, maybe with a unified object/messaging format."
This is very much our philosophy for introducing BeOS 5. Details may vary here and there, but basically Steve is right. We're proud of our product and we want BeOS 5 to reach as many current and potential new users as possible. In doing this we want to broadly advertise our technology, including its strengths for media-rich Internet Appliances. As one of our recent press releases claims, we have more than 100,000 preregistrations for the free, downloadable version of BeOS 5. We look forward to a good launch. And lots more e-mail...