When we decided to move networking into the kernel for the networking rewrite, some kernel team gangstas dragged me down a dark alley and roughed me up a little. Realizing that I was becoming an honorary member of the kernel team, they had to jump me into the gang and lay down the law.
"Rule #1," said Ficus, "is NO C++ IN THE KERNEL. When you find yourself hankering to overload operators in kernel code, ABSTAIN, DUMMY!"
It may come as a surprise to some BeOS developers that in our wonderful, buzzword-enabled, OO-powered OS, there is not a single line of C++ code in the kernel. There are many reasons for this, none of which will appear in this article.
Many experienced C programmers have been writing object oriented code in C for years. This article is geared towards C++ programmers who haven't coded much in pure C since college but are interested in doing drivers or kernel modules, and at C programmers who just hadn't thought of this before.
The moral of the story is, "There's no reason to let lack of language syntactic sugar keep you from writing good, solid object oriented code."
The best way to illustrate OO programming in C is to jump right in to code. The sample code for this article is available at <ftp://ftp.be.com/pub/samples/languages/ObjectOrientedC.zip>
Consider a base class, and a subclass that inherits from the base class and includes both public and private data:
/* from class.h in the sample code */ /* * base class definition */ typedef struct myBaseClass { char *name
; intid
; struct myBaseClass *next
; } myBaseClass; /* * inherited class definition. Note inheritance through * aggregation of the base class as the first member * of the struct. */ typedef struct myInheritedClass { myBaseClassbase
; uint32public_data
; void *private_data
; void (*public_action
)(struct myInheritedClass *); void (*private_action
)(struct myInheritedClass *); } myInheritedClass;
These are declared in a public header file, which we'll call
class.h
. For
this example, class.h
contains the class declarations and
construction/destruction function declarations;
class.c
contains the
class implementation; and main.c
contains the code which uses the classes.
Several interesting things are going on here. Classes are obviously represented in C as structs. The classes can contain either data members, or member functions.
Inheritance is accomplished by simply aggregating the base class struct as the first member of the subclass.
Public data encapsulation is accomplished via simple aggregation of the data type or a reference to it. Private data encapsulation is accomplished by hiding the actual private data by referencing it via a void*, and providing member functions to access the actual data.
Since C has no scope resolution operator to allow direct implementation of class member functions, the member functions for the class (public action and private action) would need to be declared and defined as standard C functions (presumably static to the module they are in) and assigned to the member function pointers in the class.
Note that these class declarations describe abstract classes. Concrete subclasses are accessed through the interface specified by the abstract class declaration. This gives us polymorphism, the third part of the sacred OO trinity along with inheritance and encapsulation.
Since C has no constructors and destructors, an allocator function must
be provided for each concrete subclass type. For example, if you wanted
to make three concrete subclasses from the myInheritedClass
abstract
interface called "one", "two", and "three", the following functions would
be defined:
/* * allocator and deallocator prototypes */ extern myInheritedClass *new_one
(int); extern myInheritedClass *new_two
(int); extern myInheritedClass *new_three
(int);
Each of these functions would allocate the memory for the classes,
initialize all the data members of the classes (and their base classes),
allocate the private data members for each class and assign them to the
private data pointer, and assign function pointers to the member
functions. The new*
functions take an int argument here and assign the
value to a member variable, as we'll see, much like a C++ constructor
might take an int argument to initialize a data member if desired.
The corresponding delete functions
extern voiddelete_one
(myInheritedClass *); extern voiddelete_two
(myInheritedClass *); extern voiddelete_three
(myInheritedClass *);
would deallocate the space used by the classes and their private data.
For implementation, any functions relating to the internal state of any given class will be colocated in the .c file implementing that class. Any code which is a consumer of the class will only be able to use the class through its interface as specified in a header file. This helps insure encapsulation.
The new *
and delete *
functions would (of course) be implemented in the
same file as the specific subclass is implemented. One way to write one
of the new *
functions would be
/* * definition of the private data structures used by the * "one", "two", and "three" subclasses */ struct one_private_data { intdata
; }; [...] /* * implementation of the member functions used to access the * class' data */ static voidpublic_action
(struct myInheritedClass *mc
) {printf
("class %s, instance id %d\npublic data is %ld\n",mc
->base
.name
,mc
->base
.id
,mc
->public_data
); } static voidone_private_action
(struct myInheritedClass *mc
) { struct one_private_data *pd
= (struct one_private_data *)mc
->private_data
;printf
("private data is %d\n",pd
->data
); } [...] /* * templates used to simplify creation of instances of each * of the subclasses. alternatively, once could simply assign * values to each of the struct membersin the new * functions. */ static myInheritedClassone
= { { "one", 0, 0 }, 1, 0,public_action
,one_private_action
}; [...] /* * class instance allocators and deallocators */myInheritedClass
*new_one
(intid
) {myInheritedClass
*rc
= 0;rc
= (myInheritedClass
*)malloc
(sizeof
(myInheritedClass
)); if(rc
== 0) goto error;memcpy
(rc
, &one
,sizeof
(myInheritedClass
));rc
->base
.id
=id
;rc
->private_data
=malloc
(sizeof
(struct one_private_data)); if(rc
->private_data
== 0) goto error1; ((struct one_private_data *)rc
->private_data
)->data
=rand
(); returnrc
; error1:free
(rc
); error: return 0; }
Two interesting things here. First, a static global instance of the specific subclass is created with most of the fields initialized, particularly the member functions. This is then copied to create specific instances of the class in the new function, and instance-specific data members are then initialized there.
Next, the private encapsulation through data hiding via (void *) is illustrated by the allocation of the one private data structure and its assignment to the opaque (void *) private data pointer.
To wrap it up, remember the important point that these implementations
are in a file not directly referenced by code that uses these classes;
the only interface visible to them is the declarations of the class
structures (and the class' member functions), and the
new *
and delete *
functions. The sample code illustrates this. Here is how the classes are
used, in a separate source file:
/* from main.c */ #include "class.h" staticmyInheritedClass
*make_class
(intid
); /* * this function returns a random inherited class instance */ staticmyInheritedClass
*make_class
(intid
) {myInheritedClass
*rc
= 0; switch(rand
() % 3) { case 0:rc
=new_one
(id
); break; case 1:rc
=new_two
(id
); break; case 2:rc
=new_three
(id
); break; } return rc; } intmain
(intargc
, char **argv
) {myInheritedClass
*head
= 0, *tmp
= 0; inti
; /* * make a list of 5 random subclass instances. */ for(i
=0;i
<5;i
++) {tmp
=make_class
(i
); if(tmp
!= 0) {tmp
->base
.next
= (myBaseClass
*)head
;head
=tmp
; } } /* * iterate through the list and call the member * function to print out each subclasses public * and private data. Whee, polymorphism! */ for(tmp
=head
;tmp
!= 0;tmp
= (myInheritedClass
*)tmp
->base
.next
) {printf
("{\n");tmp
->public_action
(tmp
);tmp
->private_action
(tmp
);printf
("}\n\n"); } return 0; }
There you have it, object inheritance, encapsulation, and polymorphism in
C. In this example, inheritance is used by aggregating myBaseClass
as the
first member of myInheritedClass
; encapsulation is provided by having
both public and hidden data members, with the only way to get at the
private data types being member functions that know how to access them;
and polymorphism, which is provided by the struct definitions for the
classes acting as an abstract class interface from which concrete classes
are created.
One thing I've noticed lately in other people's code is the use, or
rather the lack of use, of the volatile
keyword.
Essentially, one should declare something volatile
if its value will be
changed asynchronously by code executing in parallel to other threads
using it, especially if the value is tested inside of loops and such. The
reason for this is that the compiler has no way of knowing that a
variable may be modified by something outside the codeblock it's
currently compiling and may do optimizations that involve code motion,
such as hoisting assignments or comparisons out of loops, or locally
caching the value of a variable once instead of checking the value stored
in its real memory location on each loop iteration. Declaring the
variable volatile
lets the compiler know that it isn't free to make these
kinds of optimizations with respect to that variable.
The classic example is for variables that are updated at interrupt time, such as driver registers. But the same is true of variables that may be changed in another thread. If your execution path depends on testing the value of a variable in a loop, and the value may be changed asynchronously by another thread, then you need to declare that variable volatile.
In a multithreaded system like BeOS, the most critical issue is the synchronization of simultaneous tasks. The BeOS kernel provides one synchronization tool, the semaphore, but not all synchronization problems are best addressed this way. In particular, software designed for other operating systems where semaphores are not the primary synchronization tool can be difficult to rewrite using other primitives.
In the Unix world, for example, the basic synchronization primitive is the "condition variable," or "CV" for short. Superficially, CVs behave similarly to semaphores. When a thread waits on the CV, it blocks until some other thread signals that CV. The first thread then awakens and continues executing.
Signalling a condition variable is analogous to releasing a semaphore. The difference between them is in what happens when they're signalled (or released) when no threads are waiting. A semaphore "remembers" the signal; a thread that later tries to acquire the semaphore will succeed immediately because of the earlier signal. In contrast, condition variables don't remember whether they were signalled: if no threads are waiting when the CV is signalled, the signal is "lost," and has no effect.
In this week's article I'll present a condition variable implementation for BeOS. The interface is similar to the POSIX threads API for familiarity and ease of porting. Building a CV implementation using only semaphores turns out to be quite complex, so I'll just explain the code's behavior here and leave the detailed analysis of its workings until next week. You can download the code and an example program that uses it at this URL:
<ftp://ftp.be.com/pub/samples/portability/condvar.zip>
Unfortunately, I don't have enough space here to discuss condition variable semantics in detail. I recommend a good POSIX threads book for this. My personal favorite is David Butenhof's Programming With POSIX Threads , but the O'Reilly Pthreads book is also good. I'm also trying to avoid typing too much because of a hand injury, so for now, I'll just summarize the interface and the example program.
Condition variables are tightly associated with a locking primitive called a "mutex," so the library provides a C interface to both. Here are the APIs for both:
/* mutex operations */ status_tbe_mutex_init
(be_mutex_t*mutex
, be_mutexattr_t*attr
); status_tbe_mutex_destroy
(be_mutex_t*mutex
); status_tbe_mutex_lock
(be_mutex_t*mutex
); status_tbe_mutex_unlock
(be_mutex_t*mutex
); /* condition variable operations */ status_tbe_cond_init
(be_cond_t*condvar
, be_condattr_t*attr
); status_tbe_cond_destroy
(be_cond_t*condvar
); status_tbe_cond_wait
(be_cond_t*condvar
, be_mutex_t*mutex
); status_tbe_cond_timedwait
(be_cond_t*condvar
, be_mutex_t*mutex
, bigtime_tabsolute_timeout
); status_tbe_cond_signal
(be_cond_t*condvar
); status_tbe_cond_broadcast
(be_cond_t*condvar
);
Mutex operations are a lot like BLocker
operations; I won't discuss them
much here, except to say that, unlike a semaphore, a mutex can only be
unlocked by the same thread that locked it. This provides a stronger
locking guarantee.
be_cond_wait()
and be_cond_timedwait()
are the operations for blocking on
the condition variable. They're identical except for the timeout
semantic: be_cond_wait()
blocks forever,
and be_cond_timedwait()
blocks
until the specified time. That time is *absolute*, not relative, and in
this implementation it uses the same time base as the kernel's
system_time()
function. If you want to block until 10 seconds into the future,
you pass (system_time() + 10000000
) as the timeout value. Note that there
is a mutex argument to these functions: that mutex *must* be locked when
the wait function is called, and the mutex is guaranteed to be locked
when the function returns, even if the function returns an error.
be_cond_signal()
awakens one waiting thread,
and be_cond_broadcast()
awakens *all* waiting threads. It's not necessary for any mutex to be
locked when either of these two signalling functions is called.
It's important that your code test the condition you're interested in before blocking on the condition variable, and then test it again after awakening. For example, in the sample Alarm app, the condition is "the alarm queue is not empty." When waiting on this condition, the code looks like this:
while (alarm_list
==NULL
) {be_cond_wait
(&alarm_cond
, &alarm_mutex
); }
This before-and-after testing is necessary because in certain rare circumstances the waiting thread may awaken without the condition variable being signalled. This is called a "spurious wakeup." The POSIX spec explicitly permits such behavior because preventing it can make CV implementations far less efficient.
The Alarm sample is a direct rewrite of Example 3.3.4 in the Butenhof book cited above. It repeatedly asks the user for an alarm time, expressed as a number of seconds into the future, and a message associated with that time. It then issues the message at that time. The code is heavily commented to explain how the condition variable is being used, and illustrates both regular and timed waiting. If you're new to condition variables, I recommend playing with the sample application, reading through the code, and thinking about how the implementation would be different if based on semaphores.
That's it for this article. Next week I'll look at the CV implementation itself, and explain why its unusual approach was necessary to guarantee the correct semantics.
After the fact, everything is much clearer and (almost) everyone is much wiser. The latest episode in the Palm saga is (almost) no exception to that rule and... rather than pile on more retroactive wisdom, I'd like to focus on little fractures, on amusing ironies below the surface of the story. Who knows, the tale might contain forward wisdom for us to use, if we act decisively, before it turns retroactive.
A caveat before I proceed: I'm a director of 3Com. As a result, an even larger grain of salt should be required when considering my opinion on Palm matters. All I have is an opinion—I don't deal in Truth.
Let's remember how Palm started. They built PDA software and licensed it to Casio and Sharp—following the approved business model. For the founders, Jeff Hawkins and Donna Dubinsky, this became a frustrating experience in the midst of The PDA Part I, First Blood. Go and Eo came out, as did Microsoft's hommage to the idea, PenWindows. Then we had General Magic, the Newton, and—I was going to forget—Momenta. Let's say about half a billion dollars "invested" in "refining the concept."
In that context, Donna and Jeff had a hard time dealing with their Japanese licensees, who were too cautious, too slow, and too unwilling to invest in new versions of the hardware platform. So Palm's founders went to their Board and committed blasphemy: "Let's take control of our future. We know where we want to go, so let's build hardware that implements our vision".
The answer was a very polite "Let's find a strategic partner." Or more plainly: "These guys are crazy, let's sell the company." And, indeed, US Robotics successfully adopted Palm. The hardware was manufactured and well distributed in a retail channel that US Robotics knew how to feed. The founders, though, still felt uncomfortable, misunderstood, and somewhat disrespected by the strong US Robotics culture—a feeling that appeared to be mutual.
When 3Com acquired US Robotics, Palm wasn't very visible yet and not much was made of that part of the acquisition at the time. Donna and Jeff had great hopes for the relationship with their new adoptive parent. Sales kept growing. The Palm Pilot was now quite visible and had become market leader.
Meanwhile, CE appeared and went through several revisions. Many felt that Palm had to avoid being "Appled," that is, left isolated on a proprietary island. Palm's management resisted the change, the move to a new business model, even at a time when addiction to hardware margins hadn't set in yet—mainly by "virtue" of their absence.
In the early days, Palm sales grew rapidly, but the product wasn't making money. This encouraged some to push for a bloodless transition. Palm's founders found themselves frustrated again and lobbied for their freedom, as a spin-off. They felt misunderstood and stifled by the larger company. In 1998, a spin-off would have destroyed the favorable tax treatment of the US Robotics acquisition, coming too soon after its June 1997 closing. Donna, Jeff, and Ed Colligan left—with a Palm OS license—and started Handspring, a PDA company.
A year later, 3Com feels it can spin off Palm without unfavorable tax consequences and can realize value for its shareholders. Handspring announces a Pilot clone with an interesting approach to hardware modularity. It looks exciting—I've registered on their Web site to be notified of the product's availability.
So, what do we have? A soon-to-be-independent company, prepped for an IPO, just as the founders wanted—but without them. A strong, independent, creative Pilot clone maker, something the founders didn't want—but from them. Was the Pilot successful because Palm took control of both hardware and software, or in spite of that move? Likewise, are Apple's current fortunes exemplary or nonsensical? Is the PDA ecosystem better off because the founders left to start a muscular venture-backed (Kleiner, Benchmark et al.) clone? Or would the ecosystem be better off with the founders running a spun-off Palm, but without Handspring? In sum, is the Palm saga exemplary or totally accidental?