Be Newsletters - Volume 2: 1997

Issue 2-29, July 23, 1997

Be Engineering Insights: The Woes of Memory Allocation

By Dominic Giampaolo

If I had a dime for every time I've fixed a bug caused by memory corruption, I'd be a rich man. Memory corruption bugs come in many forms, sizes and shapes, and sometimes they even come disguised. I've seen blatant memory corruption bugs and some so subtle that it takes weeks to track them down. Fixing these kinds of bugs wears you down, makes you curse like a sailor and is known by the state of California to cause heart trouble among otherwise healthy young programmers.

Good tools enable you to track down memory corruption bugs more easily. Under the BeOS, enabling the first line of defense against memory corruption errors is as simple as typing the following line in the shell:

export MALLOC_DEBUG=1

Then run your application from that Terminal window. You can also enable this feature in the Metrowerks debugger by choosing "Launch with malloc() debugging" in the BeOS PPC panel of the debugger preferences dialog.

If your code is clean, running with MALLOC_DEBUG will do nothing. However, if you accidentally reuse freed memory, overwrite the end of a piece of allocated memory, free something twice, or scribble before the beginning of a block, the debugger will pop up with an assertion informing you of what happened. Further, when MALLOC_DEBUG is on, all newly allocated memory will contain garbage values, and when you free a piece of memory it is again filled with (different) garbage data. This level of protection catches the most common memory corruption problems, and the associated stack crawl will almost invariably lead you to the culprit code.
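
To make the idea of guard areas and fill patterns concrete, here is a minimal sketch of how such checks can work in principle. It is not the BeOS MALLOC_DEBUG code: the gd_ names, the fill values, and the 16-byte sizes are all assumptions made for illustration, and the sketch covers only overruns and underruns, not double frees or reuse of freed memory.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* All names and values below are hypothetical, chosen for this sketch. */
#define GD_HEADER   16    /* holds the block size; keeps user data 16-byte aligned */
#define GD_GUARD    16    /* guard bytes placed on each side of the user data */
#define GUARD_FILL  0xA5  /* pattern written into the guard areas */
#define NEW_FILL    0xCD  /* garbage written into newly allocated memory */
#define FREED_FILL  0xDD  /* different garbage written when memory is freed */

/* Layout of one block: [size][front guard][user data][rear guard] */
void *gd_malloc(size_t size)
{
    unsigned char *raw;

    if (size > SIZE_MAX - (GD_HEADER + 2 * GD_GUARD))
        return NULL;
    raw = malloc(GD_HEADER + 2 * GD_GUARD + size);
    if (raw == NULL)
        return NULL;
    memcpy(raw, &size, sizeof size);
    memset(raw + GD_HEADER, GUARD_FILL, GD_GUARD);
    memset(raw + GD_HEADER + GD_GUARD, NEW_FILL, size);
    memset(raw + GD_HEADER + GD_GUARD + size, GUARD_FILL, GD_GUARD);
    return raw + GD_HEADER + GD_GUARD;
}

/* Returns 0 if both guards are intact, -1 if bytes before the block
 * were overwritten, and 1 if bytes after the block were overwritten. */
int gd_check(const void *ptr)
{
    const unsigned char *user = ptr;
    const unsigned char *raw;
    size_t size, i;

    assert(ptr != NULL);
    raw = user - GD_HEADER - GD_GUARD;
    memcpy(&size, raw, sizeof size);
    for (i = 0; i < GD_GUARD; i++)
        if (raw[GD_HEADER + i] != GUARD_FILL)
            return -1;
    for (i = 0; i < GD_GUARD; i++)
        if (user[size + i] != GUARD_FILL)
            return 1;
    return 0;
}

/* Checks the guards, fills the user data with garbage, releases the
 * block, and returns the result of the guard check. */
int gd_free(void *ptr)
{
    unsigned char *user = ptr;
    unsigned char *raw;
    size_t size;
    int status;

    if (ptr == NULL)
        return 0;
    status = gd_check(ptr);
    raw = user - GD_HEADER - GD_GUARD;
    memcpy(&size, raw, sizeof size);
    memset(user, FREED_FILL, size);
    free(raw);
    return status;
}
```

A real debugging allocator would drop into the debugger when a guard is damaged; this sketch only reports the damage through a return value so the behavior is easy to observe.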

I believe that every single BeOS application should endure some amount of testing under the effects of MALLOC_DEBUG. This will prevent you from shipping bugs that are easy to fix and will make for more robust applications. If an app crashes with MALLOC_DEBUG turned on, the app needs fixing.

And let me preempt the usual objections here: If your app crashes with MALLOC_DEBUG on, it is not because of a bug in the malloc debugging code. Given that I wrote the malloc debugging code (and that it is rather simple code) I will stake my reputation on the fact that it is bug free and offer a reward if anyone finds a bug in it (reward details yet to be determined). We've used MALLOC_DEBUG a fair bit internally at Be and it has caught numerous bugs.

Some memory corruption bugs slip right through the safety net of MALLOC_DEBUG and cause problems. These bugs require more sophisticated techniques to catch. For example, suppose an app allocates 64 bytes of memory, uses it, and then frees it but accidentally keeps a pointer to the block around. If the app later allocates another 64 bytes, the allocator may hand back the very same block. If the stale pointer from the first allocation is then used as though the first allocation were still live, it will corrupt the data of the second allocation. And because the block has legitimately been allocated again, there is nothing that the MALLOC_DEBUG code can do to catch the problem.

This style of bug is particularly difficult to track down because the second memory allocation is frequently for the same type of object, and the misuse of the first pointer usually overwrites the memory with similar values. Lest you think this example contrived, this exact bug arose while I was working on the BeOS disk buffer cache management routines.

The solution that will catch this problem is sometimes known as a "purgatory list." The basic concept is that when an app calls free(), the memory is filled with garbage but free() is not really called. Instead of being free()-ed, the memory is added to a "purgatory list" where it is held until some time in the future when free() is really called. Holding the memory in "purgatory" prevents the above problem because if the first pointer were reused, the memory would still be in the purgatory list and would likely cause a crash since it would contain garbage.

Implementing a purgatory list isn't particularly difficult, but there are a few issues to be aware of. The first problem is, of course, how to get your routine called on every free(). I normally add #define macros that redefine malloc, free, realloc, calloc, and strdup to call suitably renamed routines, which you implement and which call through to the real versions of malloc, free, etc.
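
As a rough sketch of the macro approach, the fragment below defines renamed routines and then redirects the standard names to them. The dbg_ prefix and the two call counters are hypothetical names invented for this example; the wrappers here only count calls and pass straight through, which is where a purgatory list would later hook in.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical counters, present only to show the wrappers are reached. */
unsigned long dbg_malloc_calls = 0;
unsigned long dbg_free_calls = 0;

/* The wrappers are defined BEFORE the macros below, so the calls to
 * malloc(), free(), etc. inside them still reach the real versions. */
void *dbg_malloc(size_t size)
{
    dbg_malloc_calls++;
    return malloc(size);
}

void dbg_free(void *ptr)
{
    dbg_free_calls++;
    free(ptr);
}

void *dbg_realloc(void *ptr, size_t size)
{
    return realloc(ptr, size);
}

void *dbg_calloc(size_t count, size_t size)
{
    dbg_malloc_calls++;
    return calloc(count, size);
}

/* strdup allocates internally, so it must go through the wrapper too;
 * otherwise its result would later be handed to dbg_free() untracked. */
char *dbg_strdup(const char *s)
{
    size_t len;
    char *copy;

    assert(s != NULL);
    len = strlen(s) + 1;
    copy = dbg_malloc(len);
    if (copy != NULL)
        memcpy(copy, s, len);
    return copy;
}

/* From here on, ordinary calls in the rest of the code are redirected.
 * The #undef lines guard against a C library that already defines any
 * of these names as macros. */
#undef malloc
#undef free
#undef realloc
#undef calloc
#undef strdup
#define malloc(size)         dbg_malloc(size)
#define free(ptr)            dbg_free(ptr)
#define realloc(ptr, size)   dbg_realloc(ptr, size)
#define calloc(count, size)  dbg_calloc(count, size)
#define strdup(s)            dbg_strdup(s)
```

In a real project the wrapper definitions would live in one source file that does not see the macros, and the #define lines (plus prototypes) would go in the shared header. Code that reaches the allocator without including that header, such as a precompiled library, is not redirected.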

If the #define macros are put in a commonly included header file for your project then all your code will call your new routines. These new routines must manage a list of allocated memory blocks and decide when to call free() for real. The most common solution is to have a circular buffer of allocated memory blocks and to only call free() for the oldest items in the circular buffer. Of course because this list is shared between multiple threads it must be semaphore (or even better, benaphore) protected.
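
The circular buffer described above might be sketched as follows. This is an illustration under several assumptions, not a drop-in implementation: the pg_ names, the fill byte, and the magic values are invented here; a POSIX mutex stands in for the BeOS semaphore or benaphore so the sketch can be compiled elsewhere; and each block carries a small header so that the free routine knows how many bytes to fill with garbage.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* All names and values below are hypothetical, chosen for this sketch. */
#define PURGATORY_SLOTS 128          /* list size mentioned in the text */
#define FREED_FILL      0xDD         /* garbage byte written on free */
#define PG_LIVE         0x4C495645UL /* header tag for a block in use */
#define PG_DEAD         0x44454144UL /* header tag for a block in purgatory */

/* Header stored in front of every block. The extra union members only
 * force the header size to preserve worst-case alignment. */
typedef union pg_header {
    struct { size_t size; unsigned long magic; } info;
    long double align_ld;
    long long align_ll;
    void *align_p;
} pg_header;

static pg_header *pg_ring[PURGATORY_SLOTS]; /* blocks awaiting real release */
static size_t pg_next = 0;                  /* slot holding the oldest block */
static unsigned long pg_real_frees = 0;     /* how often free() really ran */
static pthread_mutex_t pg_lock = PTHREAD_MUTEX_INITIALIZER;

void *pg_malloc(size_t size)
{
    pg_header *h;

    if (size > SIZE_MAX - sizeof(pg_header))
        return NULL;
    h = malloc(sizeof(pg_header) + size);
    if (h == NULL)
        return NULL;
    h->info.size = size;
    h->info.magic = PG_LIVE;
    return h + 1;
}

void pg_free(void *ptr)
{
    pg_header *h, *oldest;

    if (ptr == NULL)
        return;
    h = (pg_header *)ptr - 1;
    /* Fires on a second free of a block that is still in purgatory. */
    assert(h->info.magic == PG_LIVE);
    h->info.magic = PG_DEAD;
    memset(ptr, FREED_FILL, h->info.size);  /* garbage for stale pointers */

    /* The ring is shared between threads, so it is updated under a lock. */
    pthread_mutex_lock(&pg_lock);
    oldest = pg_ring[pg_next];
    pg_ring[pg_next] = h;
    pg_next = (pg_next + 1) % PURGATORY_SLOTS;
    if (oldest != NULL)
        pg_real_frees++;
    pthread_mutex_unlock(&pg_lock);

    free(oldest);  /* really release the evicted block; free(NULL) is a no-op */
}
```

With this in place, a stale pointer used shortly after pg_free() sees only FREED_FILL bytes rather than another object's live data, and the block is not handed back to the real allocator until PURGATORY_SLOTS later frees have pushed it out of the ring.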

The size of the circular buffer affects how long items stay on the list, which affects how much memory your program uses as well as whether or not an item will still be in "purgatory" when the bug happens. You must weigh the added memory usage against the likelihood that the bug will not show up. If you have a purgatory list of 128 items it's fairly likely to catch most bugs. Of course knowing details of the particular instance of the bug may allow you to only track allocations within a certain size range. In the degenerate case you can of course choose to never really call free(), just to see what effect it has on your program.

Although it would be convenient, MALLOC_DEBUG does not currently implement a purgatory list. There were several reasons for this, most importantly that a purgatory list requires a fair bit of internal state that I did not feel was appropriate to add to the standard C library. Plus, the added overhead becomes even more noticeable for most programs, which may not need it. It is still possible to add, however, and if there is enough clamor, it will likely happen (although enabling it would require setting MALLOC_DEBUG to a specific value).

Barring even more sophisticated techniques which require rewriting object code to maintain information about every memory reference (e.g. Purify), these techniques can help you catch a good number of memory corruption bugs. Using MALLOC_DEBUG can help you catch most of the common memory corruption bugs that people make. Sometimes, however, you have to pull out the big guns and implement a purgatory list to catch the more difficult bugs.

News From The Front

By William Adams

When I was in high school I ran track. I was a distance man, not one of those wimpy sprinter types who can only run for 100 meters and then have to stop. Running long distance takes a lot of stamina, perseverance, concentration, and of course strength.

Running a one mile or a two mile race is a very funny thing. You kind of start out slow and in a pack with the rest of the runners. As the laps wear on, you stratify and end up in smaller packs, the winners and the non-winners. Our coach would sit at the last turn of the last lap, 200 meters from the finish, and when we came by he would shout "Kick it in!!". That was our signal to dig down into our reserves of energy and "kick" like mad for that last 100 meters to win the race...most of the time at least.

That's how I feel whilst programming as a new developers' conference approaches. You're busy creating demo code, answering questions, doing tutorials, and when you think you can't stay up until 4:00am one more time, that voice in your ear yells "Kick it in!!".

When I bought my first BeBox way back when, I had a dream of supporting some killer 3D libraries to do some nifty live 3D animation for presentations. I got the source for the 3Dfx glide library way back then and got it working in DR5/6/7.

Then I joined Be and got real busy. Along comes Geoff Woodcock who is like a puppy who knows how to program saying, "I want to do a driver, I want to do a driver!!" Well, being DTS it stands to reason that we should be able to do at least one of everything our system supports, so I kicked the disk over to him. He tore out what little hair he had left and...now we have the 3Dfx glide library running in the Preview Release!

For those of you that don't know, 3Dfx is one of those companies that manufactures 3D accelerator chips for the PC market. They are particularly interesting because they also supply chips to many arcade game consoles. They have arguably one of the best chip sets out there today for the consumer marketplace. And now this chip set is supported by the BeOS:

ftp://ftp.be.com/pub/dr9/samples/glide.zip

What can you do with it? The glide library itself is pretty raw. It does the initialization and setup of the board, and very basic drawing primitives. You are responsible for manipulating your geometries and the like, just like with OpenGL®. Speaking of which, now that we have this accelerator, can we have accelerated OpenGL®...?

Another thing that got me all hot and bothered about my BeBox was the infrared ports. I had visions of being able to control various home devices, and my various BeBox operations, from a remote control. Well, due to certain events around here, I ended up doing some software to support an IR device that attaches to the serial port:

ftp://ftp.be.com/pub/dr9/samples/cs8130.zip

This is a program that allows you to manipulate the Crystal Semiconductor CS8130 IR chip. Like the BeBox, this thing can read and play back infrared signals. You could also, by the way, use it for IrDA data transfers, but that's another story. The included software makes it pretty easy to train the device to recognize any remote, and then to subsequently wait for commands to come in and do something with them. Pretty neat stuff.

When the beloved BeBox was laid to rest, we lost MIDI, infrared, and the Geek Port. But now we've regained MIDI, infrared and maybe soon we'll get ADB/IO with the help of BeeHive (http://www.bzzzzzz.com/). Who knows, stranger things have happened, and our aforementioned puppy's hankering to do an ADB tutorial.

The Boston Be developers conference is just around the corner, and more importantly the great BeOS Masters' Awards computer giveaway is looming large. So in the words of Steve Boaz, KICK IT IN!!!

Newsletter Article Revision

By Jean-Louis Gassée

In an earlier version of this newsletter I mentioned our pending financing, primarily to keep all of you informed about our activities. A few of you misconstrued my message and kindly offered to invest. Our financing is being conducted strictly in accordance with federal and state securities laws and is available only to accredited investors carefully selected by our financial advisors. We are neither soliciting nor able to accommodate investment offers from any of you. Thank you very much for your support.