Issue 2-9, March 5, 1997

Be Engineering Insights: Will Your DATA Look Like ATAD?

By Bradley Taylor

What the heck is this guy talking about? Well, if you've ever transported data between little-endian (for example, x86) and big-endian (for example, PPC) processors, then you know exactly what I'm talking about. Processor designers make choices when deciding how data will be stored, and compiler writers add some of their own twists. It's likely that data written on one platform can't be read correctly on another. If you're aware of this and take some precautionary steps, you'll be safe when it comes time to read your data on some other platform besides a PowerPC version of the BeOS.

If you're writing a network application, pay attention. Even if you're merely saving data to a file, you're unwittingly writing a network application. Files are easily transported across a network and often read in the future by applications on other platforms.

First, some background. Most programmers are familiar with the byte-order differences among processors. If you aren't, consider this code:

int  i = 0x01020304;
char *c = (char *)&i;
printf("%02x %02x %02x %02x—%02x %02x %02x %02x\n",
    c[0], c[1], c[2], c[3],
    c[3], c[2], c[1], c[0]);

On a little-endian processor, such as the x86 (or Pentium), the output is:

04 03 02 01—01 02 03 04

On a big-endian processor, such as the PowerPC (in the mode that the BeOS uses) you get:

01 02 03 04—04 03 02 01

Little-endian means that the first byte contains the least significant 8 bits. Big-endian means that the first byte contains the most significant 8 bits. If you've never seen little-endian machines in action, you might think they're doing things backwards. But they're no more backwards than driving in Britain is backwards or the flushing of toilets in the southern hemisphere is backwards. It's just different, and takes a little getting used to if you aren't already familiar with it. The PowerPC can actually run either little-endian or big-endian. Be chose the big-endian mode for the BeOS (Windows NT uses the little-endian mode).
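If you'd rather ask the machine than remember which is which, the same first-byte trick works at run time. Here's a minimal sketch; the function name is made up for this example (it isn't a Be API), and the fixed-size type comes from C99's <stdint.h>, which is an assumption about your compiler.

```c
#include <stdint.h>

/* Returns 1 on a little-endian host and 0 on a big-endian one.
   host_is_little_endian is a hypothetical name used for illustration. */
static int host_is_little_endian(void)
{
    const uint32_t probe = 0x01020304;
    const unsigned char *first = (const unsigned char *)&probe;

    /* Only on a little-endian processor does the first byte in memory
       hold the least significant 8 bits (0x04). */
    return *first == 0x04;
}
```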

16-bit integers (shorts) and 64-bit integers (long longs, available with some compilers) are treated by processors in an analogous manner. In other words, the big-endian and little-endian versions are simply byte-for-byte mirror images of each other. Floats and doubles are treated as if they were 32-bit or 64-bit integers, with respect to endian-ness issues. Bitfields are generally not portable, and should be avoided when writing data externally. Of course, ANSI C says nothing about what sizes the various types are, and they can vary from platform to platform. The sizes I've assumed here for the various C and C++ types are the ones that are used on most modern platforms, including the PowerPC versions of the BeOS. However, to be safe, it's better to use typedefs than the built-in C++ data types to force data items to be a given size on all platforms. In DR9, Be will provide typedefs like "int32" to make this job easier.
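To make the "mirror image" idea concrete, here's a sketch of swapping helpers for 16-, 32-, and 64-bit values, plus a float handled as if it were a 32-bit integer. All the names are invented for this example. The fixed-size types come from C99's <stdint.h>, standing in for the kind of typedefs mentioned above, and the float helpers assume a 32-bit float.

```c
#include <stdint.h>
#include <string.h>

static uint16_t swap16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}

static uint32_t swap32(uint32_t v)
{
    return ((v & 0x000000FFu) << 24) | ((v & 0x0000FF00u) << 8) |
           ((v & 0x00FF0000u) >> 8)  | ((v & 0xFF000000u) >> 24);
}

static uint64_t swap64(uint64_t v)
{
    /* Swap each 32-bit half, then exchange the halves. */
    return ((uint64_t)swap32((uint32_t)v) << 32) | swap32((uint32_t)(v >> 32));
}

/* Copy the float's bytes into an integer and swap that. The swapped
   pattern stays in an integer, since it's no longer a meaningful float. */
static uint32_t float_to_swapped_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return swap32(bits);
}

/* The reverse trip: un-swap the integer, then copy the bytes back. */
static float float_from_swapped_bits(uint32_t swapped)
{
    uint32_t bits = swap32(swapped);
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

Going through memcpy rather than a pointer cast keeps the float conversion clear of alignment and aliasing trouble.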

Now, some more background, this time having to do with alignment. Many processors can't read a data item from just any location in memory, but require it to be "aligned." For example, the following code WON'T work on a PowerPC processor:

char *c = malloc(9);
double *d = (double *)&c[1];

*d = 0;  /* causes an exception on ppc */

That's because malloc will return a pointer that's "aligned" on an 8-byte boundary (the pointer is a multiple of 8), and the code is attempting to dereference a double on an odd boundary (not a multiple of 8). That same code, however, WILL work on an x86 processor (though it will run faster on the x86 if you align it).
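If you really do have to get a double into or out of an unaligned spot (say, a packed record read from disk), one portable approach is to copy the bytes to or from a properly aligned variable instead of dereferencing a misaligned pointer. A sketch, with hypothetical helper names:

```c
#include <string.h>

/* Store a double at an arbitrary, possibly unaligned, address. */
static void store_double(void *dst, double value)
{
    memcpy(dst, &value, sizeof value);
}

/* Load a double from an arbitrary, possibly unaligned, address. */
static double load_double(const void *src)
{
    double value;               /* the local variable is properly aligned */
    memcpy(&value, src, sizeof value);
    return value;
}
```

With these, store_double(&c[1], 0) takes the place of the faulting *d = 0 in the example above.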

Because of this, compiler writers usually lay out your data in a fashion that will both work with the processor and perform well. Some compilers have flags to "pack" data, so the data structures are smaller at the expense of running slower. Hence the rules for alignment vary from machine to machine, and can vary even on the same machine with the same compiler (but different compiler flags).

OK, I hope now you understand the magnitude of the problem. What's the solution, you ask? For alignment, the answer's quite simple. The concept of "natural alignment" means you lay out data in a way that's "natural" for the size of the type. Shorts should be aligned on 2-byte boundaries, integers and floats on 4-byte boundaries, and long longs and doubles on 8-byte boundaries. This works for all processors and compilers. For example, consider this very unportable structure:

struct {
  char c0;
  double d;
  char c1;
  int i;
  char c2;
  short s;
} foo;

This structure and its elements will be aligned all kinds of different ways with different processors and compilers. The compiler will insert padding into the structure as necessary to cause it to be aligned on the target processor.

Without reordering the data elements, you can force natural alignment by inserting your own padding into the structure:

struct {
  char c0;
  char _pad0[7];    // align to 8-byte boundary
  double d;
  char c1;
  char _pad1[3];    // align to 4-byte boundary
  int i;
  char c2;
  char _pad2;      // align to 2-byte boundary
  short s;
} foo;
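One way to convince yourself the hand-padded layout holds is to have the compiler check the offsets. The sketch below gives the structure a tag (struct padded, a name made up here) and uses C11's _Static_assert, so it assumes a C11 compiler along with the 2/4/8-byte sizes for short, int, and double used throughout this article.

```c
#include <stddef.h>

struct padded {
    char   c0;
    char   _pad0[7];   /* align to 8-byte boundary */
    double d;
    char   c1;
    char   _pad1[3];   /* align to 4-byte boundary */
    int    i;
    char   c2;
    char   _pad2;      /* align to 2-byte boundary */
    short  s;
};

/* If the compiler had to slip in padding of its own, these would fail
   at compile time instead of corrupting data at run time. */
_Static_assert(offsetof(struct padded, d) == 8,  "d is not at offset 8");
_Static_assert(offsetof(struct padded, i) == 20, "i is not at offset 20");
_Static_assert(offsetof(struct padded, s) == 26, "s is not at offset 26");
```

These checks cover member offsets only. The total size is a separate question, because a compiler may still add padding after the last member.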

Of course, if you reorder things you can get a more efficient structure without sacrificing natural alignment:

struct {
  short s;
  char __pad[2];    // force to 4-byte boundary
  int i;
  double d;      // already at 8-byte boundary
} foo;
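The reordered example above leaves out the three chars. Here's a sketch of how the full six-member structure might look when ordered from largest to smallest, which is one common way to keep natural alignment with the least padding. The tag struct reordered is a made-up name, and the checks assume a C11 compiler and the usual 2/4/8-byte sizes.

```c
#include <stddef.h>

struct reordered {
    double d;         /* offset 0  */
    int    i;         /* offset 8  */
    short  s;         /* offset 12 */
    char   c0;        /* offset 14 */
    char   c1;        /* offset 15 */
    char   c2;        /* offset 16 */
    char   _pad[7];   /* pad the size out to a multiple of 8 */
};

_Static_assert(offsetof(struct reordered, i) == 8,  "i is not at offset 8");
_Static_assert(offsetof(struct reordered, s) == 12, "s is not at offset 12");
_Static_assert(offsetof(struct reordered, c2) == 16, "c2 is not at offset 16");
_Static_assert(sizeof(struct reordered) == 24, "size is not 24 bytes");
```

Only the trailing padding remains, and none of the members needs padding in front of it.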

With natural alignment, the alignment of a structure (as opposed to a scalar) should be the same as the alignment of the most restrictive element it contains. For example, consider this structure:

struct {
  char c0;
  struct {
    double d;
  } s;
  char c1;
} foo;

The most restrictive element is the double, so both structures foo.s and foo itself should be aligned on double boundaries. To make this data portable, massage it as follows:

struct {
  char c0;
  char _pad0[7]; // pad out to double boundary
  struct {
    double d;
  } s;
  char c1;
  char _pad1[7]; // pad out to double boundary
} foo;

If the most restrictive element were an integer, rather than a double, then you'd only need 3 padding bytes instead of 7. Notice the padding at the end of the structure. Another rule of natural alignment is that structures should be sized to be a multiple of the size of their most restrictive element.
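The reason for the size rule shows up as soon as you put the structure in an array: each element starts sizeof bytes after the previous one, so a size that isn't a multiple of 8 would knock the double off its boundary in every other element. A sketch of the check, with made-up tags (struct inner, struct outer) and C11's _Static_assert assumed:

```c
#include <stddef.h>

struct inner {
    double d;
};

struct outer {
    char         c0;
    char         _pad0[7];   /* pad out to double boundary */
    struct inner s;
    char         c1;
    char         _pad1[7];   /* pad the size out to a multiple of 8 */
};

_Static_assert(offsetof(struct outer, s) == 8,
               "s is not on a double boundary");
_Static_assert(sizeof(struct outer) % sizeof(double) == 0,
               "size is not a multiple of the most restrictive element");
```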

The last thing I want to say about natural alignment is that there's no "natural" alignment for pointers, since pointers can vary in size. They're 32 bits on PowerPC and Intel processors, but on a DEC Alpha they're 64 bits. It doesn't make any sense to transport a pointer to another machine anyway, but you might think you can get away with it if you only use the value internally to your application. You can't, so leave the pointers out please. You'll be thankful when it comes time to read the data back on a 64-bit machine.

OK, so that does it for alignment. The harder problem is dealing with endian-ness. There are several techniques. The simplest way is to just set a bit somewhere indicating what byte-order (little-endian or big-endian) your data was written out in. Then when you read the data back, you simply check the bit to see if it matches the endian-ness of your current environment. If it matches, you do nothing. If it doesn't, then you swap. However, if you haven't written your little-endian application yet, then you can delay writing the swapping code. Just put a panic statement in there as a placeholder to remind you when you finally do port your application. If you want to write the swapping code now, you can take advantage of some handy routines that Be provides: read_16_swap(), write_16_swap(), read_32_swap(), write_32_swap(). The read_64_swap() and write_64_swap() routines are left as an exercise for the reader (hee, hee). Consult the on-line documentation for more information about the swapping functions.
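Here's a rough sketch of the flag technique for a single 32-bit value. Everything in it is an assumption made for illustration: the 8-byte record layout, the flag values, and the function names. It uses its own small swap routine rather than Be's functions, since their exact signatures aren't given here.

```c
#include <stdint.h>
#include <string.h>

enum { ORDER_BIG = 0, ORDER_LITTLE = 1 };   /* hypothetical flag values */

/* Report the byte order of the machine we're running on. */
static unsigned char host_order(void)
{
    const uint16_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first ? ORDER_LITTLE : ORDER_BIG;
}

static uint32_t swap32(uint32_t v)
{
    return ((v & 0x000000FFu) << 24) | ((v & 0x0000FF00u) << 8) |
           ((v & 0x00FF0000u) >> 8)  | ((v & 0xFF000000u) >> 24);
}

/* Record layout: byte 0 is the order flag, bytes 1-3 are padding,
   bytes 4-7 hold the value in the writer's native order. */
static void write_record(unsigned char out[8], uint32_t value)
{
    memset(out, 0, 8);
    out[0] = host_order();
    memcpy(out + 4, &value, sizeof value);
}

/* Swap only when the writer's order differs from ours. */
static uint32_t read_record(const unsigned char in[8])
{
    uint32_t value;
    memcpy(&value, in + 4, sizeof value);
    if (in[0] != host_order())
        value = swap32(value);
    return value;
}
```

The three padding bytes keep the value on a 4-byte boundary within the record, in line with the natural alignment rules above.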

Another way to deal with endian-ness is to always encode things in some canonical format (usually big-endian). This isn't an ideal way to do things, because machines with the same byte-order communicating with each other will swap things unnecessarily with some loss (usually very minimal) in performance. If you want to use this technique, you can use the Berkeley networking functions htons(), htonl(), ntohs(), and ntohl(), which convert shorts and longs back and forth between native order ("host" order) and big-endian order ("network" order). On big-endian platforms, such as the BeOS, the functions do nothing; on little-endian platforms, such as Windows NT, they swap the data. These functions are available on any system with Berkeley sockets, and your code should be portable to many platforms if you use them.
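As a sketch of the canonical-format approach, here are two small helpers built on htonl() and ntohl(). The helper names are invented for this example, and the header shown is where POSIX systems declare these functions; other platforms may keep them elsewhere.

```c
#include <arpa/inet.h>   /* htonl(), ntohl() on POSIX systems */
#include <stdint.h>
#include <string.h>

/* Encode a 32-bit value into buf in network (big-endian) order. */
static void put_u32(unsigned char buf[4], uint32_t host_value)
{
    uint32_t net = htonl(host_value);
    memcpy(buf, &net, sizeof net);
}

/* Decode a 32-bit value that was stored in network order. */
static uint32_t get_u32(const unsigned char buf[4])
{
    uint32_t net;
    memcpy(&net, buf, sizeof net);
    return ntohl(net);
}
```

Whatever the host's byte order, the most significant byte lands in buf[0], so the bytes written are identical on every platform.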

Along the same lines, Sun's XDR (External Data Representation) will encode and decode your data to and from big-endian format, plus deal with alignment issues. Their stub compiler (rpcgen) will even write all of the XDR swapping code for you (really!). It isn't the most efficient way to do things, but it does the job flawlessly and can keep you from pulling your hair out if you're dealing with fairly complex data. The package is free and widely available. For example, you can get it in both source and binary form in many Linux distributions.

We here at Be feel your pain. We're working on some tools to make your life easier with respect to machine differences. Look for them in the coming months. Until then, you can use the information in this article. !kcul dooG, er, I mean, good luck!


Be Engineering Insights: Paper Or Plastic?

By Doug Fulton

A couple of days ago, I stepped up to bat at Safeway, assumed my part as the novice in the call and response (C: "hello" R: "hello" C: "how are you tonight" R: "oh, OK" C: "that's good" R: ""), and then, rather than continue this sham of an acquaintanceship, I avoided eye contact with the checkeress and pretended to compute differential equations in my head. As she pronounced my total, I noticed that she was eating an apple. (No, wait. Let's make it some grapes.) Not her grapes—MY grapes, which I was about to pay for. I gave her a look that requested an explanation; she stopped a grape halfway to her open maw and held it up to better gander at this bit of fruit that had, by weight of her expression, stuck itself to her hand when she wasn't looking. Then she threw it below her out of sight with a pitch of disgust that I couldn't help but take personally. Not an endearing opening parry, but it broke the ice. Words, not unfriendly, were spoken.

(By the way, why are grocery stores so proud of having drug-free employees? Really, how much intelligence or presence of mind does it take to run a bar code over a scanner? Grocery checking isn't exactly a going-somewhere career, the pay is lousy, and I can't imagine that anyone with an ounce of self-respect actually WANTS to be declared employee of the month, plum parking spot aside. We might as well let the poor grubbers relieve the tedium a bit. It's almost civic-minded: If you're going to get looped, better to be at the console of an NEC than behind the wheel of a stolen BMW. Some savvy supermarket could even take up a slogan of cynical goodwill: "Pik'N'Pay: Keeping the Stoners Off the Street." I don't care, I want my clerks happy, as long as they don't get hungry while they're ringing me up.)

Courtesies and histories were exchanged, and, half a conversation later, when I told her I worked at Be, she began to ask about the new File System API. The POSIX calls she knew; but the C++ API... she'd heard it was screwy. So I explained the class layout to her satisfaction:

The screwiness that my grape girl complained of centered on the BEntry/BNode duality—why have two classes that "mean" the same thing (that is, a file)? Because (I answered) in POSIX (from which all blessings flow) a "file" can mean a "plain" file, a symbolic link, or a directory.

Let's back up a little more: In DR9, a file (whether plain, link, or directory) is uniquely and persistently identified by an "entry ref." Entry refs are the common currency of file identification: If you want to tell another app about a file, you pass it an entry ref. When the user drops file icons on your app, the files show up in your app as a set of entry refs.

Getting back to the duality business: Let's say someone has just dropped an entry ref onto your app. You want to construct a BFile, or a BSymLink, or a BDirectory object to represent the ref. But you can't tell, from the ref itself, which class to instantiate. You have to create a BEntry and ask the object (through BStatable functions, actually, but don't let's start) what sort of file it is. You can then construct an object of the appropriate BNode subclass.

Is this screwy? No screwier than stat()'ing a pathname and then calling one of open(), readdir(), or readlink().
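For comparison, here's roughly what the POSIX side of that dance looks like. This is only a sketch: the enum and function names are made up, and it calls lstat() rather than stat() so that a symbolic link reports itself instead of the file it points to.

```c
#define _POSIX_C_SOURCE 200809L   /* ask for lstat() and S_ISLNK() */
#include <sys/stat.h>

/* Hypothetical labels for the kinds of "file" discussed above. */
enum file_kind { KIND_PLAIN, KIND_DIRECTORY, KIND_SYMLINK, KIND_OTHER, KIND_ERROR };

/* Find out what a pathname refers to before deciding how to open it. */
static enum file_kind classify(const char *path)
{
    struct stat st;

    if (lstat(path, &st) != 0)
        return KIND_ERROR;        /* no such file, no permission, ... */
    if (S_ISREG(st.st_mode))
        return KIND_PLAIN;        /* next step: open() */
    if (S_ISDIR(st.st_mode))
        return KIND_DIRECTORY;    /* next step: opendir() and readdir() */
    if (S_ISLNK(st.st_mode))
        return KIND_SYMLINK;      /* next step: readlink() */
    return KIND_OTHER;            /* device, FIFO, socket, ... */
}
```

The shape is the same in both worlds: ask what kind of file you have, then pick the call (or the class) that matches the answer.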

Of course, this didn't tell the entire story, but my little checkerella was starting to fog over, the folks behind me in line were starting to smell, and, anyway, I had a wife waiting at home and two kids locked in the trunk of my car (you can't be too careful). Having, in the meantime, cantilevered my produce into a replica of the Brandenburg Gate, the bag boy asked, "What'll it be, sir, POSIX or C++?"


News From The Front

By William Adams

As I sit here, late with my newsletter article, I have plenty of time to contemplate the nature of the Universe, where Be is headed, and whether or not Godzilla is real.

I want to make a couple of final offerings to appease the /boot/apps gods. So this week we have the source for a couple of the final holdouts.

Font Demo: This one's been a staple of our demos for the past year. It demonstrates our drawing speed and how to do a few UI type things. It's not too exciting, but those who haven't delved very deeply into our font code might find this useful.

ftp://ftp.be.com/pub/Samples/fontdemo.tgz

Font Chart: This one probably doesn't get much play, but it's equally useful for those who are wanting to delve into the bowels of fontdom. You can use Font Chart to select a font and display all its characters. It also does some nice funky things with keyboard events.

ftp://ftp.be.com/pub/Samples/fontchart.tgz

With these two programs, this concludes our release of sample apps from the /boot/apps directory. It's taken a few months and quite a few newsletter articles, but everything that was fit for publishing has been published.

These will also serve as the basis for conversion to DR9. That is, we'll try to refer back to these source bases to give you a leg up on how to get to DR9 when it's available.

Speaking of DR9, the offer of the porting lab is still open. So send in your name to DevServices@be.com and express your interest in participating in our porting lab. The date is still middle to late March, which still isn't solid, but gives you an idea as to what our schedule is.

We should have a lot of fun during this porting process, and the end result will be your products running on DR9 real soon. Again, you'll only really want to participate if you're trying to ship an actual product that's time critical with the release of DR9.

Communicating

When I first joined Be way back when, there was this BeDevTalk mailing list, which the community used to communicate with each other and with Be. The list at the time dumped 100 e-mail messages into my mailbox every day. That's a lot of mail to go through if you're on the receiving end. Since that time I've had to take responsibility for mail coming to DevSupport, DevServices, and CustSupport. Another 100 messages a day. Since I don't want to be a slave to my desk all day reading mail, we've hired additional people to take care of some of this. Brian Mikol is a new hire who's slowly learning the ropes and dealing with the DevServices and CustSupport mail. Brian's a smart guy and will come up to speed rapidly, and then we'll dump more in his lap.

The BeDevTalk mailing list is still useful, and at times becomes quite rancorous. There are now more avenues available to developers who want to communicate with the Be community. The comp.sys.be news group has been split into quite a few groups, including comp.sys.be.announce, advocacy, programming, and the like. In order to lighten the load on BeDevTalk, some people might find it convenient to direct their comments to one of those public forums. It would be a tremendous boost to all developers if the BeDevTalk list were limited to discussions that are technical in nature, with a high signal to noise ratio. Advocacy, ranting, and other list-bulging discussions can go elsewhere. This will help us remain responsive to your true needs when they do arise from the list.

Also, there are a couple of efforts underway to index this mailing list and provide searchable content through the Web. This should make this list more approachable and useful to everyone.

DR9 is coming inevitably closer to completion and release. There's a lot promised and hinted at in this release. Here at Be we try to underpromise and overdeliver, but when there's a long time between releases, things can get a bit hyped up. We hope that DR9 will live up to our publicly stated promises, and maybe even offer some pleasant surprises. As it gets closer, you might notice our engineers becoming more silent. When this happens, don't fret, it's a sure sign that you're about to get a new release.


The Heat on the Clones

By Jean-Louis Gassée

It looks like Apple isn't so sure it likes the Mac clones anymore. This is neither surprising nor innocuous. It could end up doing irreparable harm to Apple and to the industry in general. Hopefully, Apple's recent actions have been misunderstood, the concerns of Mac cloners are overstated, and (to be positive and proactive) there's a profitable and friendly way out of Apple's dilemma.

We've heard about Apple Australia's alleged saber-rattling at Apple dealers who also sell Mac clones. The written document was promptly leaked to the media. We saw the news of a seminar in Oregon touting the advantages of Apple's hardware against the products of Power Computing, Motorola, UMAX, DayStar, and others. The licensees paying money to Apple must find the idea interesting. Perhaps they're planning a seminar across the street where they'll extol their flexibility, price, marketing, or quality. Just kidding, I think.

We learned Apple doesn't intend to license PowerBook designs, much to the chagrin of cloners and customers. We heard questions regarding Apple's real commitment to deliver a CHRP version of the MacOS. And there are rumors Apple will demand much higher license fees for the new hardware designs, several hundred dollars we're told by several sources. The alleged reason is the need to recoup hardware engineering costs at Apple. Privately, Apple executives express irritation at the clones' "cherrypicking" and claim Apple's making all the investment in the platform and the cloners reap all the profit. Let's hope these are false rumors or merely reflect a temporary loss of composure.

As for "cherrypicking," the cloners exhibit unpredictable behavior: Looking for the highest profit. As for making all the effort and the cloners making money out of Apple's investment, one thought that's what licensing the platform was about. We sell you a license. We both expect to make more money out of the deal than we put in. The cloner's margins are supposed to more than pay for the fee. And Apple is supposed to make real money out of the licensing business. Well, it seems that's where the stated assumptions, or the implementation, went wrong.

Regarding the stated assumptions, Apple went into the licensing business for reasons other than directly making money from it. At the time, a little more than two years ago, Apple's management felt the Macintosh platform needed more credibility. The Mac was labeled as proprietary, no alternative source of supply, no competition. With Mac clones, the reasoning went, software developers would invest more because the cloning would grow the Mac ecological niche, and corporate purchasers would feel better, safer, about Apple and the Macintosh. Then, as the cloners came on line, Apple's business and credibility started suffering for reasons unrelated to the presence of alternative sources of Power Mac hardware. As a result, these successful players in the Mac niche only exacerbated Apple's financial problems. Instead of a positive-sum game where everyone benefited from a growing PowerPC industry segment, the sum of the game now appears to be zero and the cloners' profits seem to be sucked from Apple's income statement. "Seem" is the verb I just used because we don't know. Where would the Power Mac segment be without the pioneering, audacious marketing of Power Computing, without players like UMAX, MP systems from DayStar, or the credibility of Motorola targeting business users?

The temptation might be to return to the old days where Apple ruled alone, or to increase licensing fees "as needed." The first would durably damage Apple's credibility at a critical time. So would the latter, by discouraging cloners.

Yet there's a way out of this situation, a way that would maintain a healthy growing Mac-compatible segment and preserve Apple's freedom to make both hardware and software for as long as their shareholders want them to. Right now, a cloner must license both hardware and software. That's the rub. Apple needs to let go of the hardware licensing business, establish a basic hardware design, the equivalent of the PC/AT at the core of all PC clones—let's call it the PPC/AT—and supply software that runs on that platform—no ifs, no buts. I wrote supply, not give. OEM licenses or shrink-wrap licenses are available at competitive rates. Everyone's free to improve on that core. The cloners can add hardware features and supply drivers. So can the mother ship. But each basic release of the OS works on the standard PPC/AT implementation. No more cherrypicking, no more trying to pass on the cost of hardware designs, just software licensing fees and market forces.

Unfortunately, there are several holes in this scenario. One is that Apple could fear its inability to provide competitive designs, picturing itself as overwhelmed by the cumulated energy and creativity of many third-party designers extending the platform. The corollary hole is that Apple could be tempted to design "private" features into the system software only accessible by its own hardware design, thus protecting a competitive advantage. From time to time, Microsoft is accused of doing this in Windows for its applications. In any event, Apple's current licensing program suffers from several genetic or structural defects. Apple licensing was not born out of the desire to make money in and of itself, and structural problems arise from the entanglement of hardware and software in licensing discussions.

Appointing CHRP (or any other design) as the PPC/AT, letting everyone build around it, and delivering, at last, a Mac OS version for CHRP, is Apple's way to free its software licensing from hardware complications. It will rekindle confidence and stimulate growth in the PowerPC industry. This might even free Apple, in turn, to copy the Microsoft model and build an immensely profitable application business on a thriving PPC niche and a NeXT-based Mac OS. Apple shareholders might like that kind of Microsoft cloning.


BeDevTalk Summary

BeDevTalk is an unmonitored discussion group in which technical information is shared by Be developers and interested parties. In this column, we summarize some of the active threads, listed by their subject lines as they appear, verbatim, in the mail.

To subscribe to BeDevTalk, visit the mailing list page on our web site: http://www.be.com/aboutbe/mailinglists.html.

WEEK 2

Subject: Fonts, fonts, fonts!

More requests:
Open the system so new font engines can be added as add-ons. Implement the alpha channel for alpha-defined anti-aliasing.

And discussion:
How should font color be handled? Should color data be recognized, or should fonts be defined as grayscale with alpha?

What about background color? A font engine that renders over a graphic (or other nonuniform color) needs more info than simply a solid background color.

Bitmap vs outline fonts:
The demise of bitmap fonts (in DR9) was lamented by a few. How can outline fonts look good in something like Terminal? Anti-aliasing and outline->bitmap conversion were offered as possible solutions. The pro-bitmap crowd felt that with all the good bitmaps that are currently available, conversion is unnecessary and wasteful.

THE BE LINE: Sorry, but add-on font engines will not be supported in DR9, nor will you be able to directly load bitmap fonts.

NEW

Subject: resolutions

Currently, Be defines screen sizes through a set of constants. But the world seems to be moving to more flexible resolutions. It was suggested that the constant-definition method be dropped in favor of an x/y/depth/refresh-rate encoding within a 64-bit value. Implementations and refinements were offered.
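To picture what such an encoding might look like, here's one possible sketch that gives each of the four fields 16 bits of a 64-bit value. The field order, the field widths, and all the names are assumptions made for illustration; the thread's actual proposals may have differed.

```c
#include <stdint.h>

typedef uint64_t screen_mode;   /* hypothetical packed mode value */

/* Pack width, height, depth (bits per pixel), and refresh rate (Hz),
   16 bits each, with width in the most significant bits. */
static screen_mode pack_mode(uint16_t width, uint16_t height,
                             uint16_t depth, uint16_t refresh_hz)
{
    return ((uint64_t)width  << 48) | ((uint64_t)height << 32) |
           ((uint64_t)depth  << 16) |  (uint64_t)refresh_hz;
}

static uint16_t mode_width(screen_mode m)   { return (uint16_t)(m >> 48); }
static uint16_t mode_height(screen_mode m)  { return (uint16_t)(m >> 32); }
static uint16_t mode_depth(screen_mode m)   { return (uint16_t)(m >> 16); }
static uint16_t mode_refresh(screen_mode m) { return (uint16_t)m; }
```

A mode such as pack_mode(1024, 768, 32, 75) can then be passed around and compared as a single integer, with no need for a predefined constant for each resolution.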

Relatedly, Chris Blackbourn asked for "screen quality" control in the Game Kit:

What I'd like is a Game Kit call that returns a screen of a certain 'speed.' Where speed=1.0 means 'gimme the fastest available screen you've got,' and speed=0.0 means 'give me the highest resolution screen you've got.'

THE BE LINE: An encoded resolution is a pretty good idea, and we're looking at the implications. However, we can't promise anything for DR9.

Subject: High Bandwidth multimedia

AKA: What is Be's role?

What's the ideal system for real-time processing of wide data? Is the BeOS it? And, more generally, what should be made of Be's espoused emphasis on multimedia content creation?

Technically, the argument centered on bandwidth vs. dedicated processing: Certain flavors of UNIX are great for pumping huge amounts of data, but with invisible daemons and uncertain scheduling, latency is hard to guarantee. "Not exactly so," said many UNIX supporters. They felt that UNIX doesn't deserve the unfriendly scheduling reputation: Many new flavors have emerged recently that CAN do real-time signal processing. Tangentially, the thread moved into a discussion of what constitutes UNIX.

On the philosophical side, a few listeners offered their interpretations of (and offered suggestions for) Be's direction. Eventually this discussion led back to analyses of Be's real-time abilities.

Subject: Time is running out!

Chris Herborth listed issues that need to be resolved before the BeOS can be considered forward-compatible and encouraged others to contribute. This turned into a general wish list as many of the requests were not strictly needed for forward compatibility. Some highlights:

  • More binary formats and an Application Binary Interface spec

  • Internationalization

  • Driver registry

  • Clear data format definitions and assumptions

  • Language independence

  • Multi-user support and security

  • More and better bitmap (as in BBitmap) support

The thread lurched into the perennial discussion of "how backwardly compatible should an app be?"

...do you use or know of anyone who uses a piece of software that has not been updated in 5 years? It's a very romantic thing to think that someone could write the perfect piece of software that could benefit from no updates... I'd say such a beast does not exist.

 

A new, slightly cynical, twist was added to the argument: Software companies LIKE incompatible OS updates because it means revenue.

Subject: addressing pixels

Center-based vs edge-based pixel coordinate systems. Which is better?

This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 License.