Lately, it's been fashionable to write about the driver model and API in our newsletter. Being very hip, I couldn't possibly avoid the subject.
We've had several articles on the basic rules a driver writer must follow, rules that apply to all kernel code and that define the kernel programming model. Still, too often I run into code that breaks them, and not only third-party code; I've found questionable code in-house as well. Perhaps the blame lies with my broken English, but I'll try again to explain the basic rules. This time I'll tell you why you have to respect them and what happens if you don't. I hope that will help you memorize them. If not, at least the details of how the kernel works should make the curious developer happy.
In the rest of the article, I'll use the terms "interrupt handler" and "spinlock section." By interrupt handler, I mean a routine that serves either an IO interrupt (installed using install_io_interrupt_handler()) or a timer interrupt (scheduled using add_timer()). By spinlock section, I mean a critical section protected by a spinlock.
If you have questions or comments about this article, don't hesitate to e-mail me: cyril@be.com.
It is illegal to cause a preemption while in an interrupt handler or spinlock section. A preemption can occur, for example, when release_sem() is called without the B_DO_NOT_RESCHEDULE flag. Instead, the handler should return B_INVOKE_SCHEDULER if preemption is desired after the interrupt has been processed, say because a semaphore was released. Preemption can also occur when interrupts are enabled. This explains why it's necessary to disable interrupts before acquiring a spinlock, and restore them after releasing it.
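To make this concrete, here's a minimal sketch of a well-behaved IO interrupt handler. The device specifics and the semaphore are made up for the example; the points that matter are the B_DO_NOT_RESCHEDULE flag and the return code:

    #include <KernelExport.h>

    // A semaphore (created at driver init) that a worker thread waits on.
    static sem_id gIoDoneSem;

    static int32
    my_io_interrupt_handler(void *data)
    {
        // Acknowledge the device here (device-specific, omitted).

        // Release the semaphore, but tell the kernel not to
        // reschedule from inside the handler...
        release_sem_etc(gIoDoneSem, 1, B_DO_NOT_RESCHEDULE);

        // ...and ask for the scheduler to be invoked once the
        // interrupt has been fully processed.
        return B_INVOKE_SCHEDULER;
    }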
Depending on the exact context, various *bad things* can happen if you break this rule.
With interrupt handlers:
-- IO interrupt handlers: with level-triggered interrupts, the CPU can take new interrupts of the same level only after the interrupt handler returns. In other words, interrupts of the same level are masked for as long as the handler runs. If the handler is preempted, there is no guarantee when it will resume execution. The interrupted thread may be very low priority, for example, in which case it can wait tenths of a second before it runs again. Blocking interrupts for that long is *very bad*.
-- Timer handlers: if a timer handler is preempted, the other timer handlers that were scheduled to run at the same time will run only after the preempted thread resumes execution, which can be tenths of a second later, as we've seen. That would also have terrible effects on interrupt latency, which is currently below a couple of hundred microseconds.
With spinlock sections:
-- Preemption in the middle of a critical section can lead to deadlocks. The classic priority inversion problem, for example, leads to a system blockage when using spinlocks instead of semaphores. Imagine thread A, priority 10, grabs a spinlock. It is preempted in the critical section. Thread B, real time priority 100, attempts to grab the spinlock and loops while waiting for it. Because B is real time and does not block, the scheduler will never run A again. Hence the deadlock.
Just as interrupt handlers and spinlock sections can't be preempted, neither can they block. Blocking can occur directly, by calling acquire_sem(). It can also occur indirectly, as the result of invoking a blocking kernel call, like malloc() or read_port(), or even by touching unlocked memory, which causes the thread to enter VM. The list of calls that can potentially block is long, and includes some you wouldn't suspect. For example, delete_sem() blocks because it calls free() to free the semaphore structure. remove_io_interrupt_handler() blocks because it waits for any executing instance of the removed handler to complete. And so on. Therefore, as I've written in previous articles, don't invoke any function that is not explicitly allowed. You can find a list of permissible functions in my article "Be Engineering Insights: Attention Driver Writers!" If you feel you really need to make a call that's not on the list, the best thing to do is e-mail me about it.
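The usual way to live within this rule is to split the work: the handler does the minimum and signals a worker thread, and the thread, running in normal context, makes the blocking calls. A sketch, with made-up names:

    #include <stdlib.h>
    #include <KernelExport.h>

    static sem_id gIoDoneSem;

    // Interrupt handler: no blocking calls, no preemption.
    static int32
    my_io_interrupt_handler(void *data)
    {
        release_sem_etc(gIoDoneSem, 1, B_DO_NOT_RESCHEDULE);
        return B_INVOKE_SCHEDULER;
    }

    // Worker thread: normal thread context, where blocking calls
    // such as acquire_sem(), malloc(), or delete_sem() are legal.
    static int32
    my_worker_thread(void *data)
    {
        while (acquire_sem(gIoDoneSem) == B_NO_ERROR) {
            void *buffer = malloc(4096);  // blocking, but fine here
            // ... process the completed IO ...
            free(buffer);
        }
        return 0;
    }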
Here are some insights about what can happen if you don't respect this rule:
With interrupt handlers: the scheduler is designed so that there must always be one null thread ready to run per CPU, in case all other threads are blocked. If an interrupt handler that blocks happened to interrupt a null thread, that thread won't be ready to run any more, and the kernel will most likely die a horrible death.
Also, blocking in interrupt handlers would break the semantics of thread priority. For example, the highest-priority real-time thread is guaranteed to run until it blocks. This would no longer be true if it could be interrupted and blocked by some interrupt handler.
With spinlock sections: Similar to preemption, blocking also causes a deadlock in the classic priority inversion problem. This is the same scenario that was discussed above.
Another scenario that leads to a deadlock, but involves VM, is this: thread A touches unlocked memory in a spinlock section. It enters VM, which needs to page in the frame. VM calls read() on the disk driver, which programs the IO and waits for the disk interrupt. Then thread B runs and tries to grab the spinlock, with interrupts masked, of course. The processor will never be able to service the disk interrupt. We have a clear deadlock.
A variation of this scenario possibly applies to a misbehaved IO interrupt handler as well. For example, if the interrupt handler serves a device on PCI (level-sensitive interrupts) and has the same interrupt priority as the disk device (PCI also), then the disk interrupt cannot be propagated to the CPU until the interrupt handler returns, which won't happen until the page fault is processed...
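To spell out the anti-pattern in code (a deliberately wrong sketch, with hypothetical names):

    #include <KernelExport.h>

    static spinlock gDriverLock = 0;
    static int32 gDeviceState;

    static void
    broken_update(int32 *userBuffer)
    {
        cpu_status st = disable_interrupts();
        acquire_spinlock(&gDriverLock);

        // WRONG: userBuffer may not be locked in memory. Touching it
        // can fault, enter VM, and end up blocked on disk IO, while we
        // hold a spinlock with interrupts masked. Deadlock.
        gDeviceState = *userBuffer;

        release_spinlock(&gDriverLock);
        restore_interrupts(st);
    }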
Spinlocks must be initialized to 0 before they are used. Every call to acquire_spinlock() must be paired with a call to release_spinlock() on the same spinlock. This defines a critical section. If you nest spinlocks, i.e., define critical sections within critical sections, it is essential that the spinlocks be released in the opposite order from that in which they were acquired.
The reason for this is that the kernel keeps track of which spinlocks are held and which are being waited on. The logic that does the tracking expects spinlocks to be used in a mutex manner or, to put it differently, expects spinlocks to protect critical sections. This implies that a spinlock must be initialized to 0, that acquiring and releasing it are exactly paired, and that nesting is done in an ordered manner.
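For instance, a correctly nested pair of critical sections looks like this (the lock names are hypothetical):

    #include <KernelExport.h>

    static spinlock gLockA = 0;  // spinlocks must start out as 0
    static spinlock gLockB = 0;

    static void
    update_shared_state(void)
    {
        cpu_status st = disable_interrupts();
        acquire_spinlock(&gLockA);
        acquire_spinlock(&gLockB);

        // ... code protected by both locks ...

        release_spinlock(&gLockB);  // released in the opposite order...
        release_spinlock(&gLockA);  // ...from that of acquisition
        restore_interrupts(st);
    }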
You may wonder why the kernel keeps track of spinlocks. On a multiple-CPU system, deadlocks can occur under some circumstances involving inter-CPU interrupts. The kernel has to do the tracking in order to detect and break those deadlocks.
Imagine a dual-CPU system. CPU 0 acquires spinlock S. CPU 1 tries to acquire S, but keeps looping because S is already held. Then CPU 0 sends an inter-CPU interrupt to CPU 1. This means that CPU 0 signals CPU 1 and waits for the interrupt to be taken. But interrupts are masked on CPU 1 because we are in a critical section. The system is deadlocked. Other, more complex scenarios, involving inter-CPU interrupts, more spinlocks, and/or more CPUs, also lead to deadlocks.
Those deadlocks are dealt with in acquire_spinlock(). When a cycle is detected in the dependency graph, acquire_spinlock() "flushes" the inter-CPU interrupt by executing the service routine. This releases the CPU that is waiting for the interrupt to be processed, and in doing so breaks the deadlock.
I have embarked on a bold experiment. I have forsworn all soda (I already abjure coffee), I eat no snacks while programming, and I consume small, balanced meals. The idea is that soon enough I will be fit and trim: a svelte coding Adonis. Of course, my productivity dropped to almost nil for the first few days of my new diet, but I am slowly learning to coax my brain into something resembling functionality without the use of artificial stimulants or buckets of microwave popcorn. It is a liberating experience—I highly recommend it. Unless you have any deadlines coming up soon...
The deadline for this newsletter article crept up on me and pounced like a Stanford senior at Full Moon On the Quad. None of the scraps of sample code I've been working on were ready, and 4.5 was not a heavy feature release, so I had no new features to blab about. This being the case, I was unprepared to discharge my obligation to you, my loyal legions of fanatically devoted readers, with an article of the quality you have come to expect (i.e., good).
My friends, I felt guilt. I felt shame. I very nearly felt a deep sense of despair. I was confronted with the realization that I would have to Let You Down with some goofy article about C++ coding style, of the type Pavel usually subjects you to.
Then I realized, with dawning consciousness, that you ungrateful scrubs never read my newsletter articles anyway! Not a week goes by that I don't read about or hear some developer complaining about a missing feature which I implemented (and documented in a newsletter article!) many months ago. Book 'em! Ignorance of the API is no excuse!
Yeah, yeah, I know it's not in the Be Book yet, punk. Are you here to complain or are you here to code? Read the newsletter. Read the headers. Have some friggin' balls! Slam it on the table, open yer wussie electrocuted-C-mode text editor, and get to work!
So anyway, in my righteous wrath, I decided to use this article to answer some frequently asked questions about the app server and related low-flying phenomena. Feature requests. Bug reports. I'm keeping the communication lines open. I want us to have a trusting, caring relationship.
I hope this answers some of your common questions. If you have further questions or concerns, please feel free to contact me (geh@be.com).
Recently, the DTS oracle has been peppered with questions about messaging. One question that popped up was: "OK, all this sending of messages to one target is well and good, but what if I need to send a message to multiple targets?" This resulted in the rallying cry that I adopted as the title of this article—inspired by the BeBox slogan from a few years ago.
There are several ways to solve this problem, depending on what your requirements are. One way is to keep a list of BMessengers around, one for each target, and when you want to send a message, simply tell each messenger to send the message for you. Of course, it would be nice to encapsulate this functionality in a class, so I have:
<ftp://ftp.be.com/pub/samples/application_kit/MultiInvoker.zip>
This little piece of code implements a class that looks much like a BInvoker, and like BInvoker, it works well as a mix-in class. Its interface differs slightly from BInvoker's, however, because instead of a single target, it maintains a list of targets. Appropriately, when you tell it to Invoke(), it simply runs through the list of targets and sends the message to each one.
Note that you can either add a target using the standard BHandler/BLooper pair of parameters, or you can create a BMessenger dynamically and hand it off to the MultiInvoker (the MultiInvoker deletes the messenger when it's done). The great advantage of the latter approach is that, if you have a specially derived BMessenger (e.g., one that sends messages to a target across a network), you can simply toss one of those puppies to a MultiInvoker and it will Just Work (tm), an extra piece of flexibility that BInvoker doesn't provide.
This approach works well for any Observer patterns in your code (the terminology comes from "Design Patterns," by Gamma, Helm, Johnson, and Vlissides, one of my all-time favorite books about programming). For example, the sample code archive for this week contains a simple test app with several observers. The application creates an Observable object that derives from MultiInvoker. When the time comes for the observable to broadcast an update, it simply calls Invoke() with a message that describes its state. The messaging mechanism inherent in MultiInvoker makes this approach particularly well suited to multithreaded applications.
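Schematically, and assuming the interface sketched above, an observable might broadcast its state like this (the class and message names here are mine, not the archive's):

    #include <Message.h>

    class Thermometer : public MultiInvoker {
    public:
        void SetValue(float degrees)
        {
            fDegrees = degrees;

            // Describe the new state and broadcast it to every
            // registered observer.
            BMessage update('updt');
            update.AddFloat("degrees", fDegrees);
            Invoke(&update);
        }

    private:
        float fDegrees;
    };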
One limitation of this approach, however, is that the sender has to know who the targets are. Depending on your situation, you may want flexibility on the receiver's end about who receives the message; for example, if you're sending a message to a Mediator object, you may want the mediator to dispatch the message to the appropriate targets. That limitation can easily be overcome as well, by creatively using BLoopers or BMessageFilters, but that's a topic for another time...
One of the main tasks of Be's Developer Technical Support Team is to provide sample code to help you (our developers) ship your applications. Our sample code library has grown substantially, but most of the code is in the form of simple applications or utilities that demonstrate interesting features of our APIs. These apps are generally limited in scope; they detail how to perform queries, print, use menus, and so on.
Unfortunately, such code doesn't adequately demonstrate how to write real world applications that deal with many of these areas combined. Also, parts of our APIs aren't in sample code at all, because they're more useful as the complexity of applications increases.
The purpose of this new column, "Bit by Bit," is to construct real apps over the course of many articles. Each article will introduce a BeOS programming concept, and will gradually advance the complexity of the application. This iterative process should allow us to cover each subject simply but thoroughly. It will help you understand not only the concepts presented, but also how they fit together in the overall structure of the application. The idea is to build an application that does everything the right way.
Next time I'll start our first application: Rephrase, a text-processing application based on BTextView. Rephrase won't be a threat to other word processing apps, which generally implement their own text engines, but look out, StyledEdit!
The first installment will focus on the fundamentals of the BeOS programming model: the application, the window, and the view.
See you next week!