Issue 2-28, July 16, 1997

Be Engineering Insights: Shared Libraries And Add-Ons

By Cyril Meurillon

When you write an application, you depend on the API that's provided by the OS. But how and where does your application find the code that corresponds to a particular function call? There are two vehicles for collecting "common" code that different applications can use: a) shared libraries and b) add-ons. This article looks at the features of and differences between shared libraries and add-ons.

Shared libraries are binaries that contain code and data that can be used by any application. For example, in /boot/beos/system/lib, you'll find the "system libraries," i.e., the libraries provided by Be and used by most applications. They contain the code for the standard kits, such as the Interface Kit, the Storage Kit, the Application Kit, and so on.

Developers can also create their own shared libraries. This can prove particularly useful if you're developing a line of applications that share some amount of code. The common code is put into a shared library that the "client" applications can link against. Some developers may even find a market in developing shared libraries that they can sell to other developers as API extensions.

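Structurally, add-ons are identical to shared libraries. The difference between libraries and add-ons is how they're used: an application links against a shared library when it's built, and the loader brings the library in automatically when the application is launched; an add-on is looked for and loaded explicitly by the application while it's running.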

Looking for and using add-ons is very useful if you want to allow "extensions" to your program.

Many BeOS applications rely on add-ons: The Tracker, for example, allows its add-ons to operate on the set of selected files through a defined API. The add-on only needs to have a function process_refs(), which is invoked on the selection passed as a BMessage. This architecture makes the Tracker customizable. Rraster uses the same technology to identify and parse different picture formats—the format recognition isn't in Rraster itself, but in the add-ons that it loads. Adding support for another picture format simply consists of writing an add-on that can decipher that format and complies with Rraster. Another use of add-ons is in the kernel with loadable drivers and file-systems.

A Shared Library Example

Launch the IDE and create a project squarer.proj that contains squarer.c as its only source file. Change the project settings the following way:

  • set the project type to "Shared Library"

  • set the file name to "squarer" (under "PPC project")

  • remove __start as the "Main" entry point (under "Linker")

  • set "use #pragma" (under "PEF")

You can then make the shared library.

squarer.c: (the shared library)

extern "C" int squarer(int);

#pragma export on
int  squarer(int x)
{
  return x*x;
}
#pragma export off

The #pragma export directives tell the linker which symbols to export from the shared library. Without them, the function squarer() would be invisible to the executable. Also, squarer() is declared as extern "C" to avoid C++ name mangling.

Now create another project for the application that's going to call squarer(). This one takes the default project settings. Add main.c and the shared library squarer to the project, and make your application.

main.c: (the executable that links against the shared library)

#include <stdio.h>

extern "C" int squarer(int);

int main(int argc, char **argv)
{
  int    n;

  n = squarer(5);
  printf("> squarer(5) = %d\n", n);
}

In the <app_dir> (i.e., the directory that the application lives in), create a directory named lib and copy "squarer" into it. The kernel loader automatically looks in <app_dir>/lib for shared libraries (in addition to looking in the system-defined library directories).

Launch the application from the shell, and you'll see the expected result:

$ squarer
> squarer(5) = 25

"Very good," you think, "but je suis francais—I want to use her as an add-on!"

Add-On Example

Shared libraries and add-ons are built exactly the same way (remember, they're structurally identical). So we don't have to do anything different to build "squarer". "squarer" does not change, but the way the function squarer() is invoked does.

Here's the new version of main.c:

#include <stdio.h>
#include <image.h>

int main(int argc, char **argv)
{
  int n;
  int (*squarer)(int);
  image_id aoid;

  aoid = load_add_on("squarer");
  if (aoid < 0) {
    printf("problems loading the add-on\n");
    return 1;
  }
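  /* get_image_symbol() stores the address through a void **, hence the cast */
  if (get_image_symbol(aoid, "squarer",
        B_SYMBOL_TYPE_TEXT, (void **)&squarer)) {
    printf("problems finding symbol 'squarer'\n");
    unload_add_on(aoid);
    return 1;
  }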

  n = (*squarer)(5);
  printf("squarer(5) = %d\n", n);

  unload_add_on(aoid);
}

The code is pretty much self-explanatory. get_image_symbol() takes B_SYMBOL_TYPE_TEXT as a parameter to indicate the symbol is a function.

Copy squarer into a directory add-ons that you create in the app's directory—that's where load_add_on() will look for the add-on. Run the application, and you'll get the same result. But the important difference is that you can now load any add-on, and, as long as it respects our convention (i.e., if it defines a function int squarer(int)), you can use its services.

Library and Add-on Locations?

Where should you put your shared libraries and add-ons? When asked to load a shared library or an add-on, the system uses an environment variable in its search: LIBRARY_PATH (for libraries) or ADDON_PATH (for add-ons). Each is a list of paths using a colon (":") as a separator.

LIBRARY_PATH is set by default to:

%A/lib:/boot/home/config/lib:/boot/beos/system/lib

Here %A means the app directory. You see now why we put the shared library "squarer" in a directory named lib/.
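To make the search order concrete, here is a minimal sketch of how a colon-separated list such as LIBRARY_PATH could be expanded into the directories to try, in order. The function name expand_search_path() and its parameters are hypothetical, and this is only an illustration of the rule described above, not the loader's actual code.

```cpp
#include <string>
#include <vector>

// Illustration only: split a colon-separated search path and replace
// "%A" with the application's directory. Not the loader's real code.
std::vector<std::string>
expand_search_path(const std::string &path_list, const std::string &app_dir)
{
  std::vector<std::string> dirs;
  std::string::size_type start = 0;

  while (start <= path_list.size()) {
    std::string::size_type end = path_list.find(':', start);
    if (end == std::string::npos)
      end = path_list.size();

    std::string dir = path_list.substr(start, end - start);

    // Substitute every occurrence of %A with the app directory.
    std::string::size_type pos = 0;
    while ((pos = dir.find("%A", pos)) != std::string::npos) {
      dir.replace(pos, 2, app_dir);
      pos += app_dir.size();
    }

    if (!dir.empty())
      dirs.push_back(dir);   // earlier entries are searched first
    start = end + 1;
  }
  return dirs;
}
```

With the default LIBRARY_PATH and an app living in /boot/apps/myapps, the first directory in the result is /boot/apps/myapps/lib, which is why a library placed there takes precedence over the system copy.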

As for which branch of the search path to put your library in, the convention is the following:

  • If the library is application specific, put it in %A/lib.

  • If it can be of some use to applications from other third parties, put it in /boot/home/config/lib.

  • DON'T put it in /boot/beos/system/lib. This directory is reserved for system libraries.

Note an interesting use of the %A/lib directory. If you ship some application on a CD that relies on a certain version of the system libraries (libroot.so and libbe.so for example), you can very well include those on the CD in that %A/lib directory. They will automatically override the libraries in /boot/beos/system/lib for your application, but not for other applications. This way, you can be sure that your application will run, no matter what version of the system software is currently installed on the user's machine, and it won't mess up the rest of the system. No more installation nightmares.

In a similar manner, ADDON_PATH is set by default to:

%A/add-ons:/boot/home/config/add-ons:/boot/beos/system/add-ons

The "branch" rules are the same as with libraries:

  • An application-specific add-on should be put in %A/add-ons.

  • An add-on that's intended to be shared by different applications would end up in /boot/home/config/add-ons.

  • /boot/beos/system/add-ons is reserved for system add-ons.

Also note that you can use symbolic links in combination with the %A/lib or %A/add-ons directories. %A refers to the directory of the real (resolved) application, not of the link. For example:

/boot/home/fred is a symbolic link to /boot/apps/myapps/fred; when /boot/home/fred is launched, %A refers to /boot/apps/myapps. This makes it possible to hide the lib/ and add-ons/ directories from the user.

Conclusion

Only your imagination limits what you can do with libraries and add-ons. We at Be have been using them quite a bit in interesting manners. But we count on you to get the best out of them and invent other powerful uses.


Be Engineering Insights: UTF-8 For The BeOS

By Hiroshi Lockheimer

By now I'm sure you've heard of UTF-8, the character encoding method of choice for the BeOS. (Take a look at the article "Unicode UTF-8" by Don Larkin, found in Issue 75 of the Be Newsletter if you are unfamiliar with UTF-8.) Given that text is something every developer deals with in their work in one form or another, I thought I would share some of my UTF-8-ish experiences with you, mostly in the form of tips, clarifications on common misconceptions, and some deep confessions of my own.

strlen(), Byte-length, and Character-length

This is probably the first question that crosses a developer's mind, even if you're developing in FORTRAN: how does strlen() work with UTF-8 text? Well, it's simple: strlen() counts the number of bytes in a string until it encounters a null-terminator (a byte with a value of 0). Since UTF-8 is backwards compatible with plain old ASCII, a null-character is still a null-character is still a null-character. So, that's that: strlen() will still work as expected with UTF-8. That is, it will return the number of bytes, not the number of characters (or glyphs) in your string.

In a multibyte encoding method such as UTF-8, a byte-count is different from a count of characters (which, incidentally, is why the function BTextView::SetMaxChars() was renamed to BTextView::SetMaxBytes()). This might seem strange (or even bad) at first, but it's actually a good thing. After all, malloc() couldn't care less how many instances of Japanese characters there are in a string when you're trying to allocate some memory into which to copy it.

What if you want to know how many bytes a character is, so that the number of characters in a string can be calculated? First of all, you should consider carefully whether it is indeed necessary for you to know this information. Again, functions such as strchr(), strcpy(), strlen(), and strstr() work as-is with UTF-8. (This is true not only for the BeOS, but for UTF-8 text processing in general.)

Byte-lengths of characters are an issue only if there is potential for something to clobber portions of a multibyte character. For example, a text engine needs to be aware of a character's length so that it knows how many bytes to traverse when the user moves the insertion point.

OK, you're convinced that you need to know how to measure a character's length. One approach is to simply iterate through the bytes, counting from one initial byte to another. Another more exciting approach, however, is to use Pierre's Uber-inline:

inline uint32
utf8_char_len(uchar byte)
{
  return (((0xE5000000 >> ((byte >> 3) & 0x1E)) & 3) + 1);
}

When given an initial UTF-8 byte, the above inline will tell you the number of bytes (from 1 to 4) in the character that the initial byte represents. It's a hairy inline, and, to be honest (here's the confession part), I'm not so sure I could explain how it works...it just does. Keep in mind that the inline looks only at the initial byte of the character, and therefore does not verify that the following bytes actually make any sense when construed as a UTF-8 character.
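One way to read the inline, offered as an interpretation rather than as Pierre's own explanation: the constant 0xE5000000 can be seen as sixteen 2-bit entries, one per value of the byte's high nibble, each holding "length minus one". The expression (byte >> 3) & 0x1E is twice the high nibble, so the shift selects the matching entry. The sketch below uses standard types instead of uint32 and uchar, and the name utf8_char_len_table() is hypothetical; it spells the same lookup out as an ordinary table.

```cpp
#include <cstdint>

// The inline from the article, written with standard types.
inline uint32_t utf8_char_len(unsigned char byte)
{
  return (((0xE5000000u >> ((byte >> 3) & 0x1E)) & 3) + 1);
}

// The same lookup as an explicit table indexed by the high nibble.
// 0xE5 is binary 11 10 01 01: entries for nibbles F, E, D, C are
// 3, 2, 1, 1 (lengths 4, 3, 2, 2); every lower entry is 0 (length 1).
inline uint32_t utf8_char_len_table(unsigned char byte)
{
  static const uint8_t len_by_high_nibble[16] = {
    1, 1, 1, 1, 1, 1, 1, 1,   // 0x0-0x7: 0xxxxxxx, plain ASCII
    1, 1, 1, 1,               // 0x8-0xB: 10xxxxxx, not an initial byte
    2, 2,                     // 0xC-0xD: 110xxxxx
    3,                        // 0xE:     1110xxxx
    4                         // 0xF:     11110xxx
  };
  return len_by_high_nibble[byte >> 4];
}
```

Both versions return 1 for a continuation byte, which is consistent with the caveat above: the result is only meaningful when the function is handed an initial byte.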

So, with the help of Pierre's inline, you could count the number of characters in a string as follows:

uint32 numChars = 0;
for (uint32 i = 0; string[i] != '\0'; numChars++)
  i += utf8_char_len(string[i]);
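For comparison, the first approach mentioned earlier (counting from one initial byte to another) can be sketched as follows. The function name utf8_count_chars() is hypothetical. It relies on the fact that every continuation byte in UTF-8 has the bit pattern 10xxxxxx, so counting the bytes that do not match that pattern counts the characters.

```cpp
#include <cstddef>

// Count characters by counting only initial bytes; continuation bytes
// (10xxxxxx) are skipped.
inline size_t utf8_count_chars(const char *string)
{
  size_t numChars = 0;
  for (size_t i = 0; string[i] != '\0'; i++) {
    if ((static_cast<unsigned char>(string[i]) & 0xC0) != 0x80)
      numChars++;
  }
  return numChars;
}
```

Because this version advances one byte at a time, it cannot step past the null-terminator when a string ends in the middle of a multibyte character, at the cost of examining every byte.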

tolower() and Friends

A similar but more subtle issue arises with functions such as tolower(). Many (if not all) of the toXXX()/isXXX() functions in ctype.h are not UTF-8 aware. They will fail or even munge your UTF-8 data, so beware. The proper implementations of those functions require the use of carefully crafted mapping tables. A less accurate, but often sufficient, implementation is to use those functions only on 7-bit ASCII data. Something like:

inline uchar
utf8_safe_tolower(uchar byte)
{
  return ((byte < 0x80) ? tolower(byte) : byte);
}
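Applied to a whole string, the same idea might look like the sketch below. The helper name utf8_ascii_tolower() is hypothetical, and std::string is used only to keep the example short. Since every byte of a multibyte UTF-8 character is 0x80 or greater, those bytes pass through untouched.

```cpp
#include <cctype>
#include <string>

// Lowercase only the 7-bit ASCII bytes of a UTF-8 string, in place.
// Bytes >= 0x80 belong to multibyte characters and are left alone.
inline void utf8_ascii_tolower(std::string &text)
{
  for (std::string::size_type i = 0; i < text.size(); i++) {
    unsigned char byte = static_cast<unsigned char>(text[i]);
    if (byte < 0x80)
      text[i] = static_cast<char>(std::tolower(byte));
  }
}
```

This shows the "less accurate" part of the trade-off: an accented capital such as U+00C9 stays uppercase, but the data is never corrupted.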

UTF8.h

Take a look at the Support Kit when you receive your copy of the Preview Release. There are two UTF-8 conversion routines in a new header file called (surprise!) UTF8.h. convert_to_utf8() and convert_from_utf8() are already documented in The Be Book; look there for more details. Also, if you simply want to convert some non-UTF-8 files, try out a little tool called xtou in /bin. Its usage is pretty straightforward; do the following in a shell if you want to convert your ISO 8859-1 document to UTF-8:

$ xtou -f iso1 my_iso1_file

You can optionally specify the -n option to convert carriage-returns to newlines (useful for Mac files).
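To give a feel for what such a conversion involves, here is a portable sketch for the ISO 8859-1 case. It is not the code behind convert_to_utf8() or xtou, and the function name latin1_to_utf8() is hypothetical. It works because ISO 8859-1 byte values coincide with the first 256 Unicode code points: bytes below 0x80 are copied as-is, and the rest become two-byte UTF-8 sequences.

```cpp
#include <string>

// Illustration only: convert ISO 8859-1 bytes to UTF-8.
inline std::string latin1_to_utf8(const std::string &latin1)
{
  std::string utf8;
  utf8.reserve(latin1.size() * 2);   // worst case: every byte doubles
  for (std::string::size_type i = 0; i < latin1.size(); i++) {
    unsigned char byte = static_cast<unsigned char>(latin1[i]);
    if (byte < 0x80) {
      utf8 += static_cast<char>(byte);                  // 0xxxxxxx
    } else {
      utf8 += static_cast<char>(0xC0 | (byte >> 6));    // 110000xx
      utf8 += static_cast<char>(0x80 | (byte & 0x3F));  // 10xxxxxx
    }
  }
  return utf8;
}
```

Other source encodings need real mapping tables, which is where the system-provided routines earn their keep.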

It Pays to Cache

There is an inherent slowness to synchronous server calls because of the associated messaging (client to server, then back to client) overhead. Since BFont::StringWidth() (and the BView equivalent) relies on the App_Server for font metrics information, repeated calls to StringWidth() can be costly. The solution is to cache this information wherever sensible. For example, if you have an object that draws a line under some text, it might make sense for you to cache the string's width so as to avoid the gratuitous calling of StringWidth(). Ming tells me that the price of a float at Fry's is ridiculously cheap these days, so don't be too shy about equipping your classes with another member variable or two.

If you're writing, say, a text engine or a web browser (read: something string-width intensive), simply caching a single string's width might not be what you want. You probably want to be able to find out any string's width without incurring the messaging overhead each time. Sounds like you want your own StringWidth() function.

I happened to have such a "width buffer" object, and did some profiling with it. BStopWatch revealed the potential performance boost of a local string-width mechanism to be significant. On average, it took a version of NetPositive that did no caching roughly 530 microseconds to calculate the widths of the various strings (one at a time) in its default page. With caching enabled, the number went down to 54 microseconds.

I'll save the implementation details of my width buffer class for another article, but here are some things to remember if you decide to write your own.

  • Escapements rule. As long as you don't rely on B_STRING_SPACING mode, you can store the escapements of the individual characters on a per font style (and potentially size) basis, and then use those values to calculate the pixel width of any string. As usual, The Be Book has much information on this topic, as does Pierre's recent two-part article, "The New Font Engine, PART 1 and 2" (issues 76 and 77 of the Be Newsletter).

  • Don't assume 256. Remember, we use UTF-8. Creating a statically sized table of 256 items is not sufficient. Neither is pre-allocating a table for the entire Unicode code space. I implemented my class using a simple linear probing hash table.

  • Keep App_Server calls to a minimum. The whole point of caching things locally is to reduce the number of actual App_Server calls. For example, BFont::GetEscapements() allows for multiple characters to be measured per call. Every call counts.
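The points above can be sketched as a small cache. This is not the author's width buffer class, which the article leaves for later; the class name WidthBuffer, the measure_func callback, and fake_measure() are all hypothetical. The callback stands in for whatever actually asks the App_Server, and characters are identified by Unicode code point for simplicity. The table uses linear probing and grows as needed, so it assumes neither 256 entries nor the whole Unicode space.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical callback: measures one character and returns its escapement.
typedef float (*measure_func)(uint32_t code_point);

// Stand-in for an expensive server round trip; counts its own calls.
static int gMeasureCalls = 0;
static float fake_measure(uint32_t code_point)
{
  gMeasureCalls++;
  return 0.25f * (code_point % 8);   // made-up escapement values
}

// Linear probing hash table: code point -> escapement.
class WidthBuffer {
public:
  explicit WidthBuffer(measure_func measure)
    : fMeasure(measure), fSlots(64), fCount(0) {}

  float Escapement(uint32_t code_point)
  {
    size_t i = Find(fSlots, code_point);
    if (!fSlots[i].used) {
      // Miss: keep the table at most half full, then measure once.
      if ((fCount + 1) * 2 > fSlots.size()) {
        Grow();
        i = Find(fSlots, code_point);
      }
      fSlots[i].used = true;
      fSlots[i].code_point = code_point;
      fSlots[i].escapement = fMeasure(code_point);
      fCount++;
    }
    return fSlots[i].escapement;
  }

private:
  struct Slot {
    bool used;
    uint32_t code_point;
    float escapement;
    Slot() : used(false), code_point(0), escapement(0.0f) {}
  };

  // Probe from the hashed slot until the key or an empty slot is found.
  static size_t Find(const std::vector<Slot> &slots, uint32_t code_point)
  {
    size_t i = (code_point * 2654435761u) % slots.size();
    while (slots[i].used && slots[i].code_point != code_point)
      i = (i + 1) % slots.size();
    return i;
  }

  // Double the table and reinsert every entry.
  void Grow()
  {
    std::vector<Slot> bigger(fSlots.size() * 2);
    for (size_t k = 0; k < fSlots.size(); k++) {
      if (fSlots[k].used)
        bigger[Find(bigger, fSlots[k].code_point)] = fSlots[k];
    }
    fSlots.swap(bigger);
  }

  measure_func fMeasure;
  std::vector<Slot> fSlots;
  size_t fCount;
};
```

A real implementation would presumably keep one such table per font style, and would gather all the missing characters of a string and measure them in a single call rather than one at a time, in keeping with the last point above.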

That's it! Hope you get to know and love UTF-8 as much as Ron and I do.


News From The Front

By William Adams

Isn't it funny how humans are prone to analogy? For some reason it helps to say "It tastes like chicken" in order to relate the flavor of new foods to that of known foods. It's kind of like using a club to simulate the use of a hammer. So what analogy can be used to describe a new OS, its attendant release, and its expected acceptance in a new community? It's kind of like giving water to someone who's thirsty in the middle of the desert. It's like a dragster on steroids that only weighs 20 pounds. It's kind of like the day I went to junior high and discovered the opposite sex.

We've completed the Preview Release, and it has begun its irrevocable trip into history. Disks are pressed, packaged, sealed, and soon to be delivered, but wait, there's a typo...

You wouldn't believe what a relief it is to get something like this out the door. It's kind of like passing a stone, or doing yoga. It hurts while you're doing it, but you're so much better off when you stop. When you finally get the Preview Release in your hands, a few things in your life will change.

You will no longer need to gunzip .tgz files, because we will pack all of our things using zip. You will be able to send embedded forms-based mail to us using a nifty third-party mail program. Your applications will stand a strong chance of being binary compatible with future releases. Your applications will have an audience to play to numbering in the few hundreds of thousands. You will come into a lot of money and live a long, prosperous life.

But wait, before you go, I quietly released a new version of the PCIList application:

ftp://ftp.be.com/pub/dr9/samples/pciviewer.tgz

It fixes two bugs: if you had multiple PCI cards of the same variety, only the first one was reported in the list; and the size of the registers was reported as kilobytes, but the number was actually HEX kilobytes.

What good is that app again? It's kind of the peephole at the construction site. You can kind of see what's in your system, but you can't touch anything. When you're trying to write PCI device drivers, it becomes useful.

What a weekend!

I was torturing my BeBox trying to do wicked things with television input and just generally tempting fate with my disk drive when I came across this thought: "What's a good analogy for how different it is to program other OSes compared to the BeOS?" Then, conveniently, I went kayaking.

When you're kayaking, there is a subtlety of style that makes the paddling a lot easier. You're sitting flat on your bottom, and you simply twist your torso left to right, dipping your paddle in on either side and giving a stroke before lifting it out again. If you force it and try to do a Hawaii 5-0 type of brute force stroking, you might go faster for a short amount of time, but in the end, you'll just get tired and not really have a good time, you'll end up sore, and you won't go back in the water for a long time.

The BeOS is subtle and elegant. Whereas others have come up with designs such as OpenDoc, we have Replicants. Where others have come up with chants at bull fights, we have the BMessage object and SendMessage. Our strokes are supple, smooth and light. I don't know about you, but I prefer the long game.

Geoff Woodcock had an incident at a local restaurant recently that's just too funny to pass by, but I'll save it for next time. But to give you a hint, it's kind of like the ultimate in embarrassing moments.

Go forth and code! The Preview Release approacheth and the time has come to stand and deliver. I'm doing it, your friends are doing it, and soon the whole world be doing it!!


I Like Apple So Much I Want Two of Them

By Jean-Louis Gassée

Charles De Gaulle, while not known as a jester, had a cruel wit. A connoisseur of Germany, he obliquely paid his respects to the great country by saying he loved her so much he wanted two Germanys. That was cold war humor. The heads of state in Cupertino may be feeling more heat than cold, but, as you will see, we have a vested interest in a more successful cure to Apple's problems.

Let's focus on the problem du jour: Finding a new CEO for Apple. The latest coup is business as usual in the Hall of Mirrors. Think back 14 years:

Da capo? Another spin around the dance floor, Nellie? Unhappily, not quite. The patterns emerge and the names are the same, but the orchestra is packing up and the chairs are on the tables.

When Jobs hired Sculley, the company was on the way up. Today, it's losing customers, developers, money, and market share. (And Bill '95 isn't just a greenhorn angling for a license to the Mac look-and-feel.) Who wants to run Apple under these conditions—where's the Lee Iacocca of the personal computer industry? Simply wanting the job (so the joke goes) disqualifies you. The Jim Barksdales are too happy, too rich and, some say, too smart to consider it.

Apple doesn't need a CEO, they need a messiah (or a crash test dummy). And any problem that requires walking on water as a solution is, you'll grant me, a problem ill-stated. Still, there may be a way to make the search for a new CEO easier. That's where General De Gaulle comes in...

Split Apple into two companies, one for hardware, the other for system software and applications:

I realize this involves convincing shareholders, perhaps with terms involving a temporary reduction of their stake in order to attract new investors or new lenders. But taking this new course will generate more excitement (if less blood lust) than backing up the car and running it into the wall again with a new driver.

Instead of requiring superhuman skills of its new CEOs, the restructured Apple merely requires skill, courage, and hard work. And we have examples of these already in the industry. Does the hardware company sound like Power Computing? Then merge it with Power or hire Steve Kahng. On the software side, Guerrino De Luca was happily and successfully running Claris before valiantly signing up for the only job more dangerous than the CEO position at Apple: VP of Marketing. Make Guerrino CEO of "AppleSoft" and watch more cloners come back into the Mac space, thus reversing the market share slide.

Our interest in this is fairly obvious. We've always hoped for a level playing field for all Mac hardware manufacturers. Disentangling Apple's hardware from its system software would make everyone's life—ours included—much easier. In theory, CHRP offers an industry standard platform, the PC/AT for the PowerPC. But in practice, Apple appears to want to keep an "enhanced" version of CHRP for itself, in order to "de-clone" some combination of hardware and software features. Apple has tried to have its cloners and eat their lunch, too. It's not working; nobody's happy. This does not sound like a level playing field.

What other choices does Apple have? Many observers think an acquisition is likely; they even diagnose the recent upswing in AAPL as the result of traders betting on a sale. One Silicon Valley wag even speculates that Apple's Board of Directors gave Steve Jobs an expanded role in anticipation of the deed. After all, who better than Steve to charm a prospect into buying Apple at an interesting price?

Fair or not, such witticisms serve to emphasize the sentiment that Apple's current business formula must be revised, or else. We all hope, for sentimental as well as business reasons, that Apple(s) will emerge from the current crisis and regain the health and the vibrancy that made it such a unique icon of Silicon Valley technology and creativity.


BeDevTalk Summary

BeDevTalk is an unmonitored discussion group in which technical information is shared by Be developers and interested parties. In this column, we summarize some of the active threads, listed by their subject lines as they appear, verbatim, in the mail.

To subscribe to BeDevTalk, visit the mailing list page on our web site: http://www.be.com/aboutbe/mailinglists.html.

WEEK 2

Subject: Dumb sound question

It was suggested, last week, that peak collisions (when mixing sound sources) are rare. This week, James McCartney pointed out that this isn't necessarily so (or not necessarily not so)...

On the contrary, if you have two signals of different frequency then their peaks are guaranteed to coincide at regular intervals.

Other suggestions: The Media Kit should provide a graph of subscribers, rather than a simple chain. Also, it would be nice if the "native" sound stream format were floating-point. Mixing floating-point samples and then converting to 16-bit integers at the end of the stream is not only "saner" (in that it's much easier to handle overflow, including deferred handling), the floating-point multiply/add is also faster than the integer multiply/add. Simple filtering and other manipulations (AM, for example) are, therefore, faster in floating-point. Of course, you pay the price in increased data.
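The suggestion about mixing in floating-point can be illustrated with a minimal sketch. It is not Media Kit code; the function name mix_to_int16() and the assumption that samples nominally lie between -1.0 and 1.0 are hypothetical. The sum is allowed to overshoot while it is still a float, and overflow is dealt with once, at the point of conversion.

```cpp
#include <cstddef>
#include <cstdint>

// Mix two floating-point streams and convert to 16-bit integers only
// at the end, clamping any overflow at that point.
inline void mix_to_int16(const float *a, const float *b,
                         int16_t *out, size_t count)
{
  for (size_t i = 0; i < count; i++) {
    float sum = a[i] + b[i];       // may exceed [-1, 1]; nothing lost yet
    if (sum > 1.0f)  sum = 1.0f;   // overflow handled once, here
    if (sum < -1.0f) sum = -1.0f;
    out[i] = static_cast<int16_t>(sum * 32767.0f);
  }
}
```

Hard clamping is only one possible policy; because the overshoot is still visible in the float sum, a mixer could just as well scale the whole buffer down instead.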

THE BE LINE: At the DAC stream level, the subscribers need to speak in a format that's understood by the hardware, so don't look for mixing or floating-point-friendly features at that level. But a friendlier, higher-level graph is being considered as an improvement to the Media Kit.

Subject: Detecting Double-clicks...

Should an OS be expected to detect and report double-clicks? If so, how would an app express the criteria for judging two clicks as part of a double, as opposed to being two separate events? Obviously, some cases defy OS-ification: A view that displays multiple clickable items has to parse clicks on its own, for example.

It was generally agreed that a simple formula such as "two clicks within a certain (small) amount of time, regardless of mouse's location => a double-click" is wrong. The formula is better if you throw in some location constraint ("...within a certain amount of time, and within the same 'item'..."), but it's still not going to cover every case.

It was somewhat agreed that certain Be objects (list view items, for example) can/should implement their own double-click testing.
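The improved formula from this thread (close in time and within the same item) might be sketched like this. The Click structure, its fields, and is_double_click() are hypothetical names rather than Be API, and times are assumed to be in microseconds.

```cpp
#include <cstdint>

// Hypothetical click record: when it happened and which item it hit.
struct Click {
  int64_t when;   // microseconds
  int32_t item;   // id of the item under the mouse
};

// Two clicks form a double-click only if they land on the same item
// and the second follows the first within max_interval.
inline bool is_double_click(const Click &first, const Click &second,
                            int64_t max_interval)
{
  return second.item == first.item
      && second.when >= first.when
      && second.when - first.when <= max_interval;
}
```

As the thread concluded, a test like this still doesn't cover every case; what counts as "the same item" is something only the view itself can decide.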

NEW

Subject: So what would smart sound look like?

This thread, which branched off of the "dumb sound" thread, discussed sound recording and processing in general (i.e., without special regard for the BeOS): Are delay buffers necessary? Are they expensive?

In a multi-source recording or playback what entity assigns/synchronizes timestamps? Is 44.1 kHz floating-point a reasonable trade-off between fidelity and data width? What happened to a-law and mu-law? And so on.

Subject: Threads and Fork (again)

What's the real story on spawn_thread() and fork()? Are they incompatible (as the Be Book claims), or can you mix the two?

THE BE LINE: As Dominic Giampaolo explained, while it *is* possible to fork() and then call spawn_thread(), it's not a good idea. The Be Book is perhaps a bit extreme in its estimation of the consequences of such an act, but it's correct in its proscription.

Subject: Tracking the mouse inside a view

What's the best way to track the mouse while the user is moving it around inside a view? Although you certainly don't want to lock down the entire window, you probably want to be able to generate some sort of feedback. This thread discussed a couple of ways to safely watch the mouse and still be able to draw.

This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 License.