When I first came to work at Be I was surprised by the complete lack of kernel debugging facilities. At the time there was nothing but a printf that went out to the fourth serial port on a BeBox. Compared to the luxurious debugging environment in my prior job, the crudity of the Be tools came as a bit of a shock.
After the shock wore off I realized two things. First, you can't debug deadlocks very easily with printf(). Second, why else do you go work for an OS company? So you can fix things that bug you instead of just complaining about them! I set to work during the Christmas holidays of 1995/96 and pretty soon had a crude "debugger" that let me poke around the system, examine data structures, and conduct other nefarious debugging activities.
This article isn't about a trip down debugging nostalgia lane, however. It's about how to debug things that run in kernel mode—so let's get down to it and see how it's done on the BeOS.
A word of caution: Those of you who are accustomed to SoftICE or other sophisticated Windows debugging tools may be in for as much of a shock as I was when I started at Be. Our tools are still fairly primitive by most standards—but they serve us reliably and produce good data to help in debugging. While I believe that tools can help you debug, in the end there's no substitute for just sitting down and figuring out a problem.
First off, where does the debugger output go? On an Intel Architecture (x86) PC our serial output goes to COM1 at 19,200 baud, 8 data bits, no parity, 1 stop bit, and no flow control at all (this is important). On the BeBox we use the fourth serial port (same settings) and on the Mac we use the modem port (again, same settings).
To interact with the debugger you need a null modem cable connected to another computer. The BeOS kernel debugger sets the serial port to 19,200 baud, 8 data bits, no parity, and 1 stop bit (commonly referred to as 8-N-1). There is also no flow control, so it's important to disable flow control in whatever software you're using on the remote machine. In the BeOS SerialConnect application you can do this by choosing the Settings->Flow Control->None menu item. Once you configure the remote machine and its terminal program you need to enable the debugging output on BeOS. On a PC or BeBox you do this by holding down the F1 key when you boot; on the Mac hold down the Delete key. When our nifty boot logo appears, press either F1 or Delete until debug output appears on the serial port. If you see only garbled characters on the remote side you probably have the wrong serial settings (again it's 19,200 baud, 8-N-1, and no flow control).
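If the remote machine happens to be a POSIX system and you'd rather script the line setup than click through a terminal program, the settings above map onto termios roughly as follows. This is only a sketch under that assumption: the function name configure_debug_line is made up for illustration, and opening the serial device, calling tcgetattr() beforehand, and calling tcsetattr() afterwards are left to the caller.

```cpp
#include <termios.h>

// Fill in a termios structure for the kernel debugger's line settings:
// 19,200 baud, 8 data bits, no parity, 1 stop bit, no flow control.
// (configure_debug_line is a hypothetical helper name, not a Be API.)
inline void configure_debug_line(struct termios &tio)
{
    // 8 data bits, no parity, one stop bit (8-N-1).
    tio.c_cflag &= ~(CSIZE | PARENB | CSTOPB);
    tio.c_cflag |= CS8 | CLOCAL | CREAD;

#ifdef CRTSCTS
    // No hardware (RTS/CTS) flow control, where the platform defines it.
    tio.c_cflag &= ~CRTSCTS;
#endif

    // No software (XON/XOFF) flow control either.
    tio.c_iflag &= ~(IXON | IXOFF | IXANY);

    // 19,200 baud in both directions.
    cfsetispeed(&tio, B19200);
    cfsetospeed(&tio, B19200);
}
```

The function only touches the structure, so it can be checked without any serial hardware attached.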
Once the serial connection is set up correctly, there are several ways to enter the kernel debugger. On a PC or BeBox you can press Alt+PrintScreen to "drop in." On the Macintosh press Command+Power key. The other (much less pleasant) way to enter the kernel debugger is if the kernel takes a fault it can't handle. This can happen if code running in the kernel touches nonexistent/unmapped memory, divides by zero, if an NMI (non-maskable-interrupt) occurs, or some other fault that can't be handled happens. Finally, you can enter the debugger if a driver explicitly calls kernel_debugger() to force entry.
However you get there, when the kernel debugger takes control, the first CPU in the system is effectively frozen (the mouse stops, no threads are scheduled, and interrupts are disabled). In an MP environment the other processors may continue executing (assuming they don't block on some resource held by the CPU already in the kernel debugger). The only case when all CPUs will drop into the kernel debugger at the same time is when an NMI exception occurs (it's easy to make an ISA card with a button that can generate NMI exceptions).
In any case, when the kernel debugger takes control it prints the following message on the serial port:
Welcome to Kernel Debugging Land:
Running on cpu 0, iframe @ 0xfc023418
kdebug>
From there you can start entering commands. The kernel debugger has a number of built-in commands that you can always depend on. Some of these provide basic functionality, for example:
ps - show which threads are running
sc - print a stack trace for a specific thread
dis - disassemble code
regs - dump the current registers
dm - dump memory
There is also a help command which lists all the commands the debugger knows. You should read through this list. Many of these commands are esoteric and are useful only if you're debugging the kernel. Some informational commands such as "sem", "port", and "thread" can be of general use, however. The kernel debugger also accepts partial matches for command names, so you can enter shorthand and it will pick the first command that matches.
To exit the kernel debugger, enter the "continue" command or the abbreviation, "c". When you exit the system returns to normal if it can (assuming a kernel fault wasn't the reason you dropped in). You can drop in and continue from the debugger as often as you want.
Of course, when you're in the kernel debugger interrupts are disabled and the system is frozen—so devices that require interrupts to be serviced in a certain amount of time may become unhappy.
Note: If you use the NMI button on a BeBox the NMI exception is not precise and the kernel may not be able to continue executing after you exit the kernel debugger. In practice this doesn't happen often.
The kernel debugger's command prompt has some convenient features. First, there's a full expression evaluator available so you can enter a C-style expression and find the result. Expression handling supports all C arithmetic and boolean operators, so if you're looking at code in one window and debugging in another you can enter values and evaluate expressions such as:
(12345 << 13) | 0x1201
This is handy when you're debugging OS code and you need to piece together physical addresses, etc. The results are also printed in hex, decimal, and ASCII (if appropriate), which makes this a convenient way to convert between number bases on the fly.
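To make the sample expression concrete, here is the same arithmetic in ordinary C++, which uses the same shift and OR operators. The helper names are invented for this sketch, and the formatting only imitates the idea of showing several bases at once; it is not the debugger's actual output layout.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// The sample expression from the text, evaluated with ordinary C operators.
constexpr std::uint32_t sample_expression()
{
    return (UINT32_C(12345) << 13) | UINT32_C(0x1201);
}

// Render a value in hex and decimal, in the spirit of the debugger's output.
// (format_result is a hypothetical helper; the real layout may differ.)
inline std::string format_result(std::uint32_t value)
{
    char buf[64];
    std::snprintf(buf, sizeof buf, "0x%08lx  %lu",
                  static_cast<unsigned long>(value),
                  static_cast<unsigned long>(value));
    return buf;
}
```

Here 12345 << 13 is 0x06072000, and OR-ing in 0x1201 gives 0x06073201, or 101134849 in decimal.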
You can also use up to 128 variables to hold common values. For example, if you're debugging a driver and your driver prints the base address of your device (let's say it's at 0x10908000) then you might enter the expression
board=0x10908000
so you can easily refer to the board without having to remember the address. After entering the above expression you could dump some registers on the board by entering the command "dm board".
The other nice feature of the Be kernel debugging environment is that code you load into the kernel (i.e., a driver, module, or bus manager) can add commands to the debugging environment. Adding debugger commands can significantly boost your ability to debug a driver or module. By adding commands to dump out the state of structures and/or the hardware you're developing for, you can quickly find out why a driver is misbehaving, analyze deadlocks, and examine the state of programs using your driver. When I developed BFS I also wrote a set of debugger commands to enable me to diagnose problems and snoop on the state of things. You can see some of the commands I added if you drop into the kernel debugger and look for the file system related commands (bfs, binode, dstream, sdata, and btree). Using those commands along with the debugger's built-in commands I was able to debug a 20,000+ line file system very easily, which would not have been possible without the debugger commands.
Adding a debugger command is simply a matter of calling
int add_debugger_command(char *name, int (*func_ptr)(int argc, char **argv), char *help_text);
An example would be
add_debugger_command("foobar", do_foobar, "foobar [num entries]");
which would add the command "foobar" to the debugger. When someone enters the command "foobar", the function do_foobar() is called.
The arguments to your command are regular argc/argv style (just like main()). This makes it easy to have debugger commands with options and switches. Debugger commands also have access to the function parse_expression() to evaluate numeric arguments (this is the same parser used to evaluate expressions at the debugger command line).
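As a rough illustration of the argc/argv convention, here is what a do_foobar() handler might look like. So that the sketch builds outside the kernel, kprintf() and parse_expression() are replaced by simple stand-ins (their exact prototypes here are my assumption; the real ones are in KernelExport.h), and the table being dumped is hypothetical driver state.

```cpp
#include <cstdarg>
#include <cstdio>
#include <cstdlib>
#include <string>

// ---- Stand-ins so the sketch builds outside the kernel (assumptions) ----
static std::string g_serial_log;          // captures what would go to the serial port

static void kprintf(const char *fmt, ...)
{
    char buf[256];
    va_list ap;
    va_start(ap, fmt);
    std::vsnprintf(buf, sizeof buf, fmt, ap);
    va_end(ap);
    g_serial_log += buf;
}

// The real parse_expression() runs the debugger's expression parser; this
// stub only understands plain numbers such as "16" or "0x10".
static unsigned long parse_expression(const char *str)
{
    return std::strtoul(str, nullptr, 0);
}

// ---- Hypothetical driver state and the command itself ----
static const unsigned long g_entries[] = { 0x10, 0x20, 0x30, 0x40 };
static const unsigned long kEntryCount = 4;

// "foobar [num entries]": print the first N table entries (default: all).
static int do_foobar(int argc, char **argv)
{
    unsigned long count = kEntryCount;
    if (argc > 1)
        count = parse_expression(argv[1]);   // argv[0] is the command name
    if (count > kEntryCount)
        count = kEntryCount;                 // never read past the table

    for (unsigned long i = 0; i < count; i++)
        kprintf("entry %lu: 0x%lx\n", i, g_entries[i]);
    return 0;                                // nothing special on return
}
```

Typing "foobar 0x2" at the prompt would then arrive as argc == 2 with argv[1] == "0x2".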
There are a few constraints when you're writing a debugger command. First, your command runs with interrupts disabled. This means that it can't call typical system services normally available to a driver. The constraints are these:
To print output you must use kprintf().
You cannot call any semaphore routines, allocate memory, or call other typical driver routines.
You can call strcmp(), strlen(), memcpy(), memcmp(). Other C library functions such as sscanf() are definitely off limits.
Once your command has control it can do anything it wants, within the rules outlined above. This offers many possibilities. The return value of your debugger command can also affect how the debugger behaves. If you return zero, nothing special happens. If you return the value B_KDEBUG_CONT, the debugger will remember the command and re-execute it if the user presses return at the next prompt. For example, the "dm" command returns B_KDEBUG_CONT, so if you start dumping memory you can continue dumping successive chunks of it just by pressing return. This is useful if you're stepping through a linked list or dumping out a lengthy data structure one piece at a time.
If your command returns the value B_KDEBUG_QUIT, the debugger will exit and continue normal execution of the system. This is useful if you want to have a command that checks the state of some data structures, and if they're ok, automatically returns control back to the system when it's done.
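The two special return values can be sketched with a pair of hypothetical commands: one that walks a linked list one node per invocation, and one that checks the list and leaves the debugger if all is well. The numeric values given to B_KDEBUG_CONT and B_KDEBUG_QUIT below are placeholders for this sketch (a real driver takes them from KernelExport.h), and the request list and command names are invented.

```cpp
#include <cstddef>

// Placeholder values; a real driver takes these from KernelExport.h.
enum { B_KDEBUG_CONT = 2, B_KDEBUG_QUIT = 3 };

// Hypothetical driver data: a short singly linked list of requests.
struct request { int id; request *next; };
static request g_req3 = { 3, nullptr };
static request g_req2 = { 2, &g_req3 };
static request g_req1 = { 1, &g_req2 };

static request *g_cursor = nullptr;   // where the next invocation resumes
static int g_last_shown = 0;          // a real command would kprintf() this

// "nextreq": show one list node per invocation. Returning B_KDEBUG_CONT asks
// the debugger to run the command again when the user just presses return.
static int do_nextreq(int, char **)
{
    if (g_cursor == nullptr)
        g_cursor = &g_req1;           // start (or restart) at the list head
    g_last_shown = g_cursor->id;
    g_cursor = g_cursor->next;
    return g_cursor ? B_KDEBUG_CONT : 0;   // stop repeating at the list end
}

// "chkreq": leave the debugger automatically when the list looks sane.
static int do_chkreq(int, char **)
{
    for (request *r = &g_req1; r != nullptr; r = r->next)
        if (r->id <= 0)
            return 0;                 // stay in the debugger to investigate
    return B_KDEBUG_QUIT;             // all good: resume the system
}
```

Keeping the cursor in a static variable is one simple way to resume where the previous invocation stopped; it relies on nothing but the command being called again.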
One final note about adding debugger commands: When your driver or module is unloaded you should be a good citizen and call the function remove_debugger_command() to ensure that there is no reference to your driver's code still in the debugger's data structures. You can find the prototype for this function in:
/boot/develop/headers/be/drivers/KernelExport.h
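The pairing of registration and removal might look like the sketch below. The command table and both functions are stand-ins written for this example, and the remove_debugger_command() signature shown (name plus function pointer) is my assumption, so check the header for the real prototype. The init/uninit function names are likewise hypothetical placeholders for your driver's load and unload hooks.

```cpp
#include <cstring>

typedef int (*debugger_command_hook)(int argc, char **argv);

// ---- Stand-ins for the kernel's command table (assumptions) ----
struct command_entry { const char *name; debugger_command_hook func; };
static command_entry g_commands[8];
static int g_command_count = 0;

static int add_debugger_command(const char *name, debugger_command_hook func,
                                const char *help)
{
    (void)help;                                   // the stub ignores help text
    if (g_command_count == 8)
        return -1;
    g_commands[g_command_count++] = { name, func };
    return 0;
}

static int remove_debugger_command(const char *name, debugger_command_hook func)
{
    for (int i = 0; i < g_command_count; i++) {
        if (g_commands[i].func == func
            && std::strcmp(g_commands[i].name, name) == 0) {
            g_commands[i] = g_commands[--g_command_count];   // fill the hole
            return 0;
        }
    }
    return -1;
}

// ---- Hypothetical driver ----
static int do_mydev(int, char **) { return 0; }

// Called when the driver is loaded: register the command.
static int mydev_init()
{
    return add_debugger_command("mydev", do_mydev, "mydev [unit]");
}

// Called when the driver is unloaded: remove it again, so the debugger
// holds no pointer into code that is about to disappear.
static void mydev_uninit()
{
    remove_debugger_command("mydev", do_mydev);
}
```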
Now let's shift gears and discuss how you use the debugger to debug problems. One common task is to try to diagnose a deadlock. Generally, the system is hung and you're called in to see what's wrong. Assuming a normal semaphore-based deadlock, you can just press the kernel debugger hot key (Alt+PrintScreen on a PC or BeBox and Command+Power key on a Mac). Once in the debugger, the first useful thing to do is to enter the "ps" command. This gives you a list of threads that are running. In the rightmost column is the name of the semaphore that a thread is blocked on (if it is blocked on a semaphore). The semaphore name is a clear indicator of what a thread is blocked on (especially if you named the semaphore!). After picking out the threads blocked on semaphores related to your driver or module, you can use the "sc" command to get a stack crawl. This is usually enough to cause a slap of the forehead and a loud exclamation of "DOH!" If not, a little time studying the code path that led to the situation is often sufficient to induce the forehead slapping response. Sometimes deadlocks involve more than just two threads, in which case some head scratching may occur. Still, the "ps" command and "sc" are powerful ways to debug deadlocks.
One additional debugger feature aids in debugging a driver. The function
int load_driver_symbols(const char *driver_name);
is a convenient way to have the kernel load the symbol information for your driver so that when you get a stack crawl it will print the symbol names of your functions (if they are not static). This, combined with the "dis" command to disassemble memory, is a handy way to pinpoint where you are in your code.
However, even with symbol information loaded it can be tedious to determine the exact line of C code that corresponds to a particular assembly instruction. Matching a faulting instruction with a specific line of high-level code is one area of our debugging environment that I would like to see improved.
Another common error is a driver taking a fault because it touches an unmapped piece of memory. If the kernel takes a fault it can't handle and it is running in kernel mode, this is considered a catastrophic error and the system drops into the kernel debugger automatically to prevent any further damage. When the kernel debugger takes control in this situation the interrupt frame, or "ifr", is very important, because it contains information about the address of the faulting instruction and the registers in question. Typing "ifr" dumps out the state of the current interrupt frame. The instruction pointer that faulted is in the interrupt frame (it's the eip register on x86 and the pc register on PPC). To find out where the instruction pointer is, use the look-up function, "lkup". Type "lkup <instruction-pointer-address>" and if the kernel debugger knows where that address is, it will tell you the name of the symbol corresponding to that address.
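Conceptually, mapping an instruction pointer to a symbol means finding the symbol that starts at or just below that address. The sketch below shows that idea on a made-up table; it is not the debugger's implementation, and the symbol names and addresses are invented.

```cpp
#include <cstddef>
#include <cstdint>

struct symbol { std::uintptr_t addr; const char *name; };

// Hypothetical symbol table, sorted by ascending start address.
static const symbol g_symbols[] = {
    { 0x1000, "foo_open"  },
    { 0x1080, "foo_read"  },
    { 0x1200, "foo_write" },
};

// Return the symbol whose start address is the closest one at or below ip,
// or nullptr if ip lies before the first symbol. When a symbol is found and
// offset is non-null, *offset receives the distance from the symbol start.
static const symbol *lookup_symbol(std::uintptr_t ip, std::uintptr_t *offset)
{
    const symbol *best = nullptr;
    for (const symbol &s : g_symbols) {
        if (s.addr > ip)
            break;                    // table is sorted: no later match
        best = &s;
    }
    if (best != nullptr && offset != nullptr)
        *offset = ip - best->addr;
    return best;
}
```

A faulting address of 0x10a4 would resolve to foo_read plus 0x24, which is the kind of "symbol plus offset" answer that tells you where to start disassembling.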
If you do have symbols, the "sc" (sc == Stack Crawl) command will tell you the calling sequence that got you to where your driver crashed. Following that, the "dis" command disassembles around the faulting instruction and is helpful for locating where the code crashed (calls to other functions show up in the disassembly and help map to where in the source code the fault occurred).
If there is no symbol associated with an address, the "addr" command may be useful. Type "addr 0xYOURADDRESS" and you'll get some information about the address, including the area name that it lives in. Often if a random driver just crashes we use this command to pinpoint the guilty party and then track the problem down on a second machine by disassembling the driver with "objdump -d".
Good debugging skills take time to acquire but when you learn how to use our debugger you can make quick work of typical bugs using the few simple but powerful tools described above. I hope this brief introduction to our kernel debugger will help driver, bus manager, and module writers to understand the basics better, so they can debug their kernel mode code more efficiently.
Happy bug hunting!
The kernel team has done a good job recently of describing various aspects of kernel programming. But although it's been implied, no one has come right out and said that you can use the kernel's niftiest module mechanism as a way to extend its logical feature set. In other words, bus managers are fine, but what you really want is...an Atomizer!
Eh? Ok, so maybe *you* don't want an atomizer, but somebody else might. Just so that we're all on the same page, I'm using "atomizer" in the same sense that the X Windows folks do. An "atom" is a token that represents some other (usually larger) thing. In our case, it's a UTF-8 string. Atoms are unique, in that given some number of copies of the string, there will be exactly one atom that represents them all. A trivial example: atomizing "foo" might return 1, atomizing "bar" might return 2, but atomizing "foo" again will always return what it returned for the first atomization. In this case, 1.
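The core idea fits in a few lines. The class below is a toy, single-threaded, user-space illustration of atomization, not the API from atomizer.h; the class and method names are invented, and using small integers starting at 1 as atoms simply mirrors the example above.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// A toy atomizer: each distinct string gets one small integer token, and
// the same string always maps back to the same token.
class ToyAtomizer {
public:
    // Return the atom for s, creating one on first sight. Atoms start at 1.
    int Atomize(const std::string &s)
    {
        auto it = fAtoms.find(s);
        if (it != fAtoms.end())
            return it->second;        // seen before: same atom as last time
        fStrings.push_back(s);
        int atom = static_cast<int>(fStrings.size());
        fAtoms.emplace(s, atom);
        return atom;
    }

    // Map an atom back to its string; returns "" for an unknown atom.
    std::string StringFor(int atom) const
    {
        if (atom < 1 || atom > static_cast<int>(fStrings.size()))
            return std::string();
        return fStrings[atom - 1];
    }

private:
    std::unordered_map<std::string, int> fAtoms;   // string -> atom
    std::vector<std::string> fStrings;             // atom - 1 -> string
};
```

Once two parties hold atoms, comparing them is a single integer comparison instead of a string compare.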
Now, you could create such a service in a shared library, but to be useful, atoms should be shared between applications. You could create a server to cough up atoms on demand, but then they wouldn't be available to kernel drivers and other modules. In my mind this makes an atomizer a decent candidate for a kernel level system service.
There are good reasons to make a new system service like this a module instead of linking it into the kernel:
Modules can be dynamically loaded and unloaded, as needs require.
If a client of a module can't find the module it needs, it will still load, possibly looking for an alternative way to get the features it needs or modifying its behavior to account for the absent module.
In the archive for this article <ftp://ftp.be.com/pub/samples/drivers/atomizer.zip>, you'll find the source code for a module, a driver, and a simple test program.
The module is where most of the interesting stuff is, in that it implements all the gritty details of atomization. It supports multiple atomizers (in case you don't want to share), and various functions for getting info in and out of the atomizer. The API documentation is in the header atomizer.h.
The driver is mostly an empty shell whose purpose is to provide an API for user space programs, since user space apps can't talk to modules directly. The user/driver API is in atomizer_driver.h. The driver code shows how the module API should be used.
The test application trivially exercises the driver, and thus the module. The only feature of general interest in using the driver is the fact that the first four bytes of each structure passed to ioctl() must match the driver's notion of a magic number. This helps ensure that some program isn't feeding garbage data to the driver. It's not foolproof, but it doesn't cost much to check.
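A magic-number check of this kind could be sketched as follows. Everything named here is hypothetical: the constant, the structure layout, and the return codes are invented for illustration, and the real definitions are in atomizer_driver.h.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical values for this sketch only.
const std::uint32_t kExampleMagic = 0x41544f4d;
enum { EXAMPLE_OK = 0, EXAMPLE_BAD_VALUE = -1 };

// Every ioctl() argument structure starts with the same 4-byte field.
struct example_request {
    std::uint32_t magic;      // must equal kExampleMagic
    char          string[32]; // the string to atomize
    std::int32_t  atom;       // filled in by the driver
};

// Check the first four bytes of whatever the caller handed us before
// trusting the rest of the buffer.
static int example_control(void *data, std::size_t length)
{
    std::uint32_t magic;
    if (data == nullptr || length < sizeof magic)
        return EXAMPLE_BAD_VALUE;
    std::memcpy(&magic, data, sizeof magic);   // no alignment assumptions
    if (magic != kExampleMagic)
        return EXAMPLE_BAD_VALUE;
    return EXAMPLE_OK;        // a real driver would now dispatch on the opcode
}
```

Putting the field first means one check covers every request structure, whatever follows it.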
If you're one of those folks who browse the header directories for each new release, you may have found preliminary versions of atomizer.h and atomizer_driver.h in the R4.5 headers hierarchy. Please ignore them, as there is no atomizer module or driver shipped with R4.5, and hence they're not officially part of the release. If the module turns out to be useful, we'll ship it with a future release.
As a side note, all work and no play makes Trey insane. That's why I'm not writing about graphics drivers in this article. If you have an interest in writing graphics drivers for the BeOS, check out the R4GraphicDriverKit in the /optional
Last week I introduced you to the BeOS messaging system, and described how messages are sent in explicit, vivid, and sometimes violent detail. This week's article continues in the same vein, showing what happens on the other end, where the message is received.
If you haven't read last week's article yet, I recommend that you do it now, or you may find the learning curve on this week's article a bit steep:
Developers' Workshop: The Magic of Messages Part 1: The Sending
When we left off, our message was sitting as a disassembled stream of bits in some BLooper's port. Now, it's time for the BLooper to do something with this message.
The looper's main loop performs this cycle of steps ad infinitum (or until it's shut down):
Read data from a port. The data that it's receiving is buffers of flattened BMessage data. BLoopers actually only read one message from the port at a time, whereas BWindows read all the messages in a port at once and handle them in one big batch. There's a slight difference in performance between the two classes because of this.
Construct BMessages from the data. This is done by creating a new BMessage, and calling Unflatten() on it. The BMessage that results is almost an exact copy of the BMessage that was passed to SendMessage()/PostMessage() on the other end (the difference being that it has some additional information identifying its target and where replies should be sent). The new BMessage is owned by the looper.
Stick the BMessages into a message queue. The message queue is an object of type BMessageQueue; it serves as a temporary holding pen for BMessages. BMessages thrown into the queue stay there until the BLooper gets around to dispatching them. You can inspect (and change!) the contents of the queue by calling BLooper::MessageQueue().
Remember that sneaky shortcut for sending messages that I mentioned last week? Well, once you know that you can get at this message queue, it's theoretically possible to bypass the messaging system and simply drop messages straight into the BLooper's message queue, like this:
BMessage *msg = new BMessage(B_QUIT_REQUESTED);
window->MessageQueue()->AddMessage(msg);
By doing this, you bypass most of the messaging system and the associated overhead from flattening and copying data. Moreover (though this is not always what you want), you ensure that this message is handled before any messages that arrive via the looper's port, because the looper dispatches all messages in its BMessageQueue before going back to retrieve more messages from the port.
This shortcut is actually how PostMessage() used to behave. We changed the behavior of PostMessage() when we realized that this technique affected the order in which messages were handled, sometimes in surprising ways. So, unless you have a good reason to use this optimization, we recommend avoiding it. Using PostMessage() and SendMessage() is safer, and you're more likely to get the results that you expect.
Dispatch all BMessages in the message queue. This is where the most interesting stuff in the messaging system happens. In the next section, we'll examine what the dispatch pipeline looks like for BMessages.
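The cycle just described can be modeled in a few lines of plain C++. This is a toy model, not BLooper's implementation: the port is a queue of strings standing in for flattened buffers, "unflattening" is a simple copy, and all type and member names are invented. It does show why a message dropped directly into the queue is handled ahead of messages still waiting in the port.

```cpp
#include <deque>
#include <string>
#include <vector>

// Toy stand-in for an unflattened message.
struct QueuedMessage { std::string what; };

struct ToyLooper {
    std::deque<std::string>   port;     // flattened data waiting in the port
    std::deque<QueuedMessage> queue;    // plays the role of the BMessageQueue
    std::vector<std::string>  handled;  // order in which messages were dispatched

    // One pass of the cycle. Returns false when there was nothing to do.
    bool RunOnce()
    {
        if (port.empty() && queue.empty())
            return false;

        if (!port.empty()) {
            // Read one buffer from the port and "unflatten" it.
            QueuedMessage msg{ port.front() };
            port.pop_front();
            // Stick the message into the queue, behind anything already there.
            queue.push_back(msg);
        }

        // Dispatch everything in the queue before going back to the port.
        while (!queue.empty()) {
            handled.push_back(queue.front().what);
            queue.pop_front();
        }
        return true;
    }
};
```

With "A" and "B" in the port and "Q" placed straight into the queue, the dispatch order comes out as Q, A, B.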
Once your message has been received by the target looper, it must go through an arduous process to reach its final destination. This process determines what the final target of the message should be. At almost every point in the process, it's possible to "drop" the message, in which case the message dispatch stops, and we skip straight to Step 5. Here are the approximate steps involved, in the order they are performed:
Start with the intended handler. The intended target of the message is now contained in the BMessage itself. At this point that target is retrieved, and if it's valid, it becomes the current target of the message. If it is no longer a valid target, the message is dropped.
Resolve the preferred handler. If the message was intended for the looper's preferred handler, the message's target is set to the preferred handler. As I mentioned before, if there is no preferred handler, the message's target becomes the looper itself.
Determine the target of drag 'n' drop messages. If the message happens to be a drag 'n' drop message delivered to a window, the window will determine which view lies beneath the dropped object. This view becomes the new target of the message.
Resolve any specifiers in the message. Specifiers are special fields of the message that BeOS scripting uses to identify targets. The looper calls the current target's BHandler::ResolveSpecifier(), which is a hook function that examines the specifiers and determines where the message should go. ResolveSpecifier() returns a BHandler that should be the new target of the message. The looper then calls that handler's ResolveSpecifier(), and keeps iterating like this until the target stops changing or until a NULL BHandler is returned (at which point the message is dropped).
Let all of the looper's filters take a crack at the message. Next, we let BMessageFilters handle the message. The job of these objects is to examine the message and adjust the message's target if necessary. You can use BMessageFilters to change the dispatching behavior of loopers or handlers dynamically and easily, which makes them extremely useful! Each looper has a list of BMessageFilters, called the common filter list. For each BMessageFilter in the list, the looper calls its Filter() function. Each BMessageFilter can examine the target and change it if it wants; that new target is passed along to the next BMessageFilter in the list. Any BMessageFilter can return B_SKIP_MESSAGE, at which point the message is dropped.
Run through the handlers' filters. Here, a rather complicated iterative process ensues. Each BHandler potentially has its own list of BMessageFilters, which filter messages intended for that specific target. The idea is to keep filtering the message using the target's message filters until the target stops changing or until the message is dropped. More specifically, the looper does the following:
(1) Get the current target's list of BMessageFilters.
(2) Run through each BMessageFilter in the list, allowing each one to specify a different target or drop the message.
(3) Assuming that the message makes it through the list of filters without being dropped, if the target that the last filter returns is the same as the current target, then stop. Otherwise, make that target the new current target, go back to (1), and repeat.
Dispatch the message. Finally, DispatchMessage() is called. In DispatchMessage(), the looper has a final chance to examine the message and target, and do whatever is necessary to deliver the message to the target. Generally, this is where the looper intercepts system messages and does whatever special dispatching needs to be done to them. For normal messages, the looper will call the handler's MessageReceived() function, and the handler finally gets a chance to respond to the message!
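The filtering stages above boil down to "keep applying filters until the target settles or the message is dropped." Here is a toy model of that loop. It is not the Interface Kit's code: filters are plain functions that return the new target (or nullptr, standing in for B_SKIP_MESSAGE), all names are invented, and the max_rounds guard against filters that bounce a message back and forth forever is my own addition.

```cpp
#include <functional>
#include <vector>

struct ToyHandler;

// A filter may return a different target, or nullptr to drop the message.
using ToyFilter = std::function<ToyHandler *(ToyHandler *current)>;

struct ToyHandler {
    const char *name;
    std::vector<ToyFilter> filters;   // this handler's own filter list
};

// Run one filter list; each filter sees the target chosen by the previous one.
inline ToyHandler *run_filters(const std::vector<ToyFilter> &filters,
                               ToyHandler *target)
{
    for (const ToyFilter &f : filters) {
        target = f(target);
        if (target == nullptr)
            return nullptr;           // dropped
    }
    return target;
}

// Common filters first, then the handlers' filters until the target settles.
// Returns the final target, or nullptr if the message was dropped (or the
// target never settled within max_rounds).
inline ToyHandler *resolve_target(const std::vector<ToyFilter> &common,
                                  ToyHandler *target, int max_rounds = 16)
{
    target = run_filters(common, target);
    for (int round = 0; target != nullptr && round < max_rounds; round++) {
        ToyHandler *next = run_filters(target->filters, target);
        if (next == target)
            return target;            // target stopped changing
        target = next;                // new current target: filter again
    }
    return nullptr;
}
```

A window whose filter redirects to a view settles on the view after two rounds: one that changes the target, and one that confirms the view's own filters leave it alone.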
Once the message has been handled, the looper's default behavior is simply to delete the message. This might be bad news if you wanted to keep the message around for some reason once you've received it. Luckily, there's a way for you to assume ownership of the message. When you call BLooper::DetachCurrentMessage(), you become the owner of the current message, and can do whatever you like with it; the looper will not delete it. Then, when you're done with the message, be sure to delete it, not just to avoid memory leaks, but because there may be somebody out there waiting for this message to be handled...
At any point in the process, you can send a reply to the originator of the message by using BMessage::SendReply(). That reply might go to the application, to a specified target, or might simply show up as the result of a SendMessage() call. It's important to make sure a reply gets sent, in case somebody is waiting synchronously for a reply (i.e., the SendMessage() synchronous send case).
If you do not reply to the message yourself using SendReply(), the BMessage destructor will send a reply of its own (with a what code of B_NO_REPLY). Normally, the looper takes care of deleting the message itself, so you needn't worry about whether a reply will be sent. However, if you use DetachCurrentMessage(), it becomes your responsibility to delete the message yourself when you're done with it. If you don't do this, and you don't explicitly send a reply, anybody waiting for your reply will never get one!
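The ownership rule is easier to see in a small model. The class below imitates only the behavior described here (an explicit reply, or a default one from the destructor); it is not BMessage, the names are invented, and replies are recorded in a global vector that stands in for the waiting sender.

```cpp
#include <string>
#include <vector>

// Records replies in the order the waiting sender would see them.
static std::vector<std::string> g_replies;

// Toy message: it replies at most once, and the destructor sends a default
// "no reply" answer if nobody replied explicitly.
class ReplyingMessage {
public:
    ReplyingMessage() : fReplied(false) {}
    ~ReplyingMessage()
    {
        if (!fReplied)
            g_replies.push_back("B_NO_REPLY");
    }

    ReplyingMessage(const ReplyingMessage &) = delete;
    ReplyingMessage &operator=(const ReplyingMessage &) = delete;

    // Send an explicit reply; a second attempt is refused.
    bool SendReply(const std::string &what)
    {
        if (fReplied)
            return false;
        fReplied = true;
        g_replies.push_back(what);
        return true;
    }

private:
    bool fReplied;
};
```

A heap-allocated ReplyingMessage that is never deleted plays the part of a detached message its new owner forgot about: neither an explicit reply nor the destructor's default ever reaches the sender.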
Fin!
So there you have it—the entire BeOS messaging system, distilled and presented for your coding pleasure. Now, I realize that by showing you the nuts and bolts of BeOS messaging, I may have stripped some of the magic out of it for you. So, I encourage you to correct the balance by using this system to create magic of your own. Listen, the Vegas crowds are beckoning...