Imagine yourself in a completely dark room. You can't see anything, but you know other people are around you, because you can hear them talking. Occasionally, two people try to talk at the same time. This is followed by a short pause, and then the talking begins again. The conversations you hear are bizarre, almost nonsensical, like this:
"Hank! This is Howard! I have some data for you: #$@#$@%@"
"Joe! This is Al! What time is it?"
"Howard! This is Hank! I acknowledge data #$@#$@%@".
"Al! This is Joe! The time is January 3, 2000, 14:45 hours"
"Everyone! This is Jason! I'm here!"
This is one way I picture the network: A noisy dark room, with lots of shouting.
Our task for today is finding out who's in this room with you.
There are many applications that need a list of the people in this dark room. The MS Windows Network Neighborhood and the Macintosh Chooser are well known examples. They build a list of computers that are providing some kind of resource to other computers. In this article, I'll talk through one approach for network discovery, and you can find the source code for an implementation here: <ftp://ftp.be.com/pub/samples/network_kit/netdisc.zip>
Let's close our eyes, and put ourselves mentally in the dark room...
The first thing we'll do is to announce our presence. The last thing we'll do is to announce that we're leaving. Between our entrance and egress, we'll announce our existence every so often; if we suddenly drop off the network, everyone else will eventually notice. This is quite literally a heartbeat: if the others don't hear it, they assume we have died. If we announce too often, we waste bandwidth; too infrequently, and we'll linger in the other members' lists long after we die. If we announce every 5 minutes and our announcement message is 100 bytes long, our protocol consumes about 0.33 bytes per second per computer, which is entirely acceptable.
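To make the heartbeat concrete, here's a small sketch. The names, constants, and C++ rendering here are mine, not from the netdisc sample code; it just shows how a member might price the protocol and decide when a silent peer should be presumed dead:

```cpp
#include <cassert>

// Sketch only: constants and names are my own, not from the sample code.
const double kAnnounceInterval = 300.0;   // announce every 5 minutes
const int kMissedBeatsAllowed = 2;        // tolerate two lost announcements

// A peer is presumed dead once we've missed a few consecutive heartbeats.
bool is_alive(double now, double last_heard)
{
    return (now - last_heard) <= kAnnounceInterval * (kMissedBeatsAllowed + 1);
}

// Steady-state bandwidth cost of the heartbeat, per computer.
double announce_cost(double packet_bytes)
{
    return packet_bytes / kAnnounceInterval;   // bytes per second
}
```

Allowing a couple of missed beats matters in practice: a single dropped broadcast packet shouldn't make everyone declare you dead.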
After we announce our presence, we'd like to fulfill our purpose and find out who else is around. We could wait a while, and eventually we'd hear everyone announce themselves. You probably don't want to wait that long. And, we need to remember that we may be the only one present, so we may never hear from anyone.
A naive way is to ask everyone to announce themselves at once. Obviously, this is hard on the network and, given the choice, it'd be easier if someone just handed us a list of computers, rather than trying to compile one ourselves. If we had a designated "special" computer, that we knew of beforehand, we could just ask it. This is a centralized solution, with all the normal centralized problems: What if your special computer is down? What if it can't deal with the barrage of requests it gets?
We could have the special computer chosen dynamically, through some type of election protocol. This is, in my opinion, worse. Election protocols are hard to implement, and harder to debug. This is not the answer either. What's left is a solution that doesn't depend on a single computer, whether statically or dynamically specified as "special."
Let me convince you that you already know the solution to this problem. Imagine, if you will, that you're at a party, conversing with friends. You're feeling comfortable, and the conversation is easy. Suddenly, a person, whom none of you know, walks up to your group, and without pausing to listen to the conversation, blurts out loudly, "What are you talking about?" (Are you picturing this? Concentrate.) Now, what happens?
In my experience (which I hope crosses cultural boundaries), there is a pregnant pause, as you and your friends try to decide whether to ignore the impolite intrusion. Then, someone in your group hesitantly offers the topic of conversation. This is our solution! In my analysis, here's what happened. Immediately after the outburst, everyone stops talking. There's a small amount of time when the members of your group look at each other, waiting to see if someone else says something, until finally someone does. Depending on how sociable your friends are, someone may answer quickly, or no one may answer at all. If no one answers, and if the annoying person wants the answer badly, they'll repeat themselves. Got it?
So, first you announce your presence on the network. Then, you broadcast a packet that means, "Who's there?" Every member individually decides whether or not they'll be the one to reply. Before they reply, they'll wait a bit to see if someone else has already done so. If someone else replies, there's no need to transmit. [An issue with this bit of the protocol is how to deal with other members who have just come up. Check the source code.]
How do the members decide whether they'll reply? Looking back at the party example, if your group is fairly large, there's a reasonable expectation that someone (and not you) will answer first. If your group is small, then there's a greater chance that you would have to answer.
So, whether you answer depends on the size of your group. Practically, you could decide that you have a 1/N chance of responding, where N is the size of the group. Or, you could increase that (by some constant factor) to ensure that a timely reply is made. If the "Who's there?" question is repeated, then you know that no one replied, and you increase your chance of responding. Eventually, the questioner will get a reply, which will be the list of current members.
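The decision rule above can be sketched as a tiny function. This formulation (including the constant factor k) is my own illustration, not the one in the sample code:

```cpp
#include <algorithm>
#include <cassert>

// Sketch only: each member replies with probability roughly k/N, where N is
// the group size, and the odds rise each time the "Who's there?" question
// goes unanswered (ask_count is how many times it has been asked).
double reply_probability(int group_size, int ask_count, double k = 2.0)
{
    if (group_size <= 1)
        return 1.0;                 // we may be the only one who can answer
    double p = k * ask_count / group_size;
    return std::min(1.0, p);        // never more than certainty
}
```

With k = 2, a timely reply is likely on the first ask even in a large group, and repeated asks push every member's odds toward 1.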
In my work with the media codecs, I've found the tcode utility to be an invaluable tool. You may have seen it sitting around in /boot/optional/sample-code/media_kit/MediaFile/transcode/tcode.cpp on your R4.5 CD, and you may have even played with it. In any case, I've got a new version for you here. Although it should compile and run on R4.5, it won't do you much good (except as an example of things to come) as the Maui codec add-ons have changed a lot.
<ftp://ftp.be.com/pub/samples/media_kit/transcode_r5.zip>
tcode [-info] [-start frame-num] [-end frame-num] [-avi|-qt|-quicktime|-wav|-aiff|-mp3] [-v <encoder_name>] [-a <encoder_name>] <filename> [<output>]
But what does it do? tcode takes an input media file (e.g., QuickTime) with some sort of media encoding (e.g., IMA audio, Cinepak video) and translates (transcodes) it into a possibly different format/encoding (e.g., AVI file, raw audio, Indeo5 video). Of course, the proper encoders/decoders (codecs) need to be installed on your system to do so, and tcode can help you there as well with its -info switch. Here's what I have on my system:
$ tcode -info
Audio IFF format (AIFF) (AIFF, id 788608)
  Video Encoders:
  Audio Encoders:
    Raw Audio / raw-audio (9)
AVI File Format (avi, id 788609)
  Video Encoders:
    Cinepak Compression / cinepak (0)
    DV Compression by Canopus Inc. / dv (1)
    Indeo5 Video / iv50 (3)
    Photo-JPEG Compression / pjpeg (8)
    Raw Video / raw-video (9)
  Audio Encoders:
    Raw Audio / raw-audio (9)
MP3 Audio File Format (mp3, id 788610)
  Video Encoders:
  Audio Encoders:
    BladeEnc MP3 Encoder / mp3 (6)
QuickTime File Format (quicktime, id 787264)
  Video Encoders:
    Cinepak Compression / cinepak (0)
    DV Compression by Canopus Inc. / dv (1)
    Indeo5 Video / iv50 (3)
    Photo-JPEG Compression / pjpeg (8)
    Raw Video / raw-video (9)
  Audio Encoders:
    IMA Adaptive PCM / ima4 (2)
    Microsoft Adaptive PCM / ms-adpcm (7)
    Raw Audio / raw-audio (9)
RIFF Audio File Format (WAV) (wav, id 788705)
  Video Encoders:
  Audio Encoders:
    Microsoft Adaptive PCM / ms-adpcm (7)
    Raw Audio / raw-audio (9)
At the outer level are the supported file formats, and at the inner level are the installed encoders. Each format/encoder has a pretty name (e.g., "AVI File Format") and a short name (e.g., "avi"). tcode refers to the add-ons using their short names. The add-ons themselves live in /boot/beos/system/add-ons/media/{decoders,encoders,extractors,writers} or in a similar place under /boot/home/config/add-ons, so the extra-curious can poke around in there.
Using the above example where we have a QuickTime file csquares.mov and want to translate it into a raw-audio/Indeo5 AVI file, the command line is

tcode -avi -a raw-audio -v iv50 csquares.mov csquares.avi

which, after some churning, will produce the new AVI file csquares.avi. You can open it with MediaPlayer and it'll look and sound just like the original, except that it's now in an entirely different format. Crazy.
Note that tcode -info only displays a list of the installed encoders/writers. BeOS can read more formats than it can write, so tcode could take an MPEG1 or .au file as input and create a QuickTime or AIFF file as output. Looking at the installed add-ons will give you a good idea of what tcode can accept as input.
Another important point is that not all file formats can accept all encodings. Our AVI writer, for example, only accepts raw audio, so if you want encoded audio with a movie you need to use QuickTime. And even then, you're stuck with IMA-encoded audio if you want to be able to play the movie on another platform.
Speaking of which, tcode allows you to create some format/encoding combinations that will play on BeOS but won't play on Windows or MacOS. For instance, to save space I tcoded a raw-audio/Cinepak AVI into an IMA/Indeo5 QuickTime, but didn't have any luck playing it under Win95.
You'll also notice the -start and -end switches to tcode. They allow you to specify the starting and ending video frames of the transcode, in case you'd like the output to be a clip of the input. These aren't as intuitive as they may seem, but they get the job done. First, they'll only work if the input file has a video stream and if you've specified the -v option on the command line. Second, audio is only handled properly if you've specified the -a option.
If you'd like to keep the encodings the same and simply produce a shortened file, you'll still need to specify the encodings explicitly. This is necessary because tcode needs to decode and re-encode the video stream in order to find the specified frames when it's dealing with keyframe-based encodings like MPEG1 and Cinepak. Still, it beats the built-in Linux video editing suite [dd(1)] hands-down.
It would be possible to avoid this step for encodings with no inter-frame dependencies (DV, for example, where every frame stands alone), but the simplicity of the transcode loop was too nice to ruin with special cases (i.e., I was lazy). Of course, tcode could be modified to do all of this format duplication internally, but that's exercise-for-the-reader stuff.
Writers: Add-ons that know how to generate a specific file format. We have writers for AIFF, AVI, QuickTime, and WAV.

Extractors: Add-ons that know how to parse a specific file format. We have extractors for AIFF, AU, AVI, AVR, DV, MPEG1, QuickTime, and WAV.

Encoders: Add-ons that can convert raw audio or video into encoded data. We have encoders for Cinepak, DV, IMA, Indeo5, mp3, MS-ADPCM, and Photo-JPEG. There is also a null encoder for raw data; it's good for avoiding special cases in some encode loops.

Decoders: Add-ons that can convert encoded media data into raw audio or video. We have decoders for Apple Video, CCITT-ADPCM, Cinepak, DV (audio and video), IMA, Indeo5, MPEG1 (audio and video), MS-ADPCM, MS RLE, MS Video, Photo-JPEG, and ulaw. There is also a null decoder for raw data.
BMediaFile: Represents a file containing media, either for reading or writing. Media streams within the file are accessed through BMediaTrack objects.

BMediaTrack: Represents a stream of media data, usually audio or video; used for both reading and writing.
media_format: The most-used (and -abused) beast in the Media Kit. This struct and its ilk describe the format of a media stream. At the top level is the 'type' field, which is usually set to one of B_MEDIA_RAW_AUDIO or B_MEDIA_RAW_VIDEO.
media_file_format: Describes a file format (writer) such as AVI or QuickTime. It contains various names for the file type as well as some minimal indications of what can be written to the file.
media_codec_info: Describes a codec (encoder/decoder) such as Indeo5 or mp3. It contains pretty and short names for the codec as well as an ID pointing to the encoder/decoder add-on it represents.
get_next_file_format(): Walks through the installed writers, filling in the media_file_format each time. To find a specific writer, check one of the name fields.
get_next_encoder(): Walks through the installed encoders that match the passed-in parameters, filling in the media_codec_info each time. You can specify the formats you'd like it to translate from/to, and which file format you'd like to write to. There are three different versions of this function, so choose wisely. In general, the more parameters you specify, the stricter the search. See the header file for specifics.
**Note: Although I have said "audio or video" in several places, the API does not restrict the data to these types. If you can dream up some sort of time/frame-based media, the API can handle it.
**Also Note: Decoders/Encoders do not necessarily need to translate to/from raw audio/video, but that's what they all currently do. There's nothing stopping a Decoder from translating MPEG to Cinepak, but most applications wouldn't be able to make much use of it; they tend to deal with raw media formats.
**Note as well, kind readers: Someday you'll all be able to write your own extractors/writers/encoders/decoders, but that day is not today. The internal codec API is still in flux and is not yet public.
With that out of the way, we can look at tcode.cpp. Just to get the hang of things, we'll start out by seeing how tcode knows which encoders/writers are installed on your system when called with the '-info' switch. This is handled in dump_info(), and it's time for you to be amazed at how short it is. Notice that it uses all the structures and functions I named above, so it's a good place to start.

At a glance, you can easily see what dump_info() is doing:
for each writer
dump video encoders compatible with this file format
dump audio encoders compatible with this file format
The "for each writer" part is accomplished using get_next_file_format(). Assuming the cookie variable is initialized to zero, get_next_file_format() will return a different media_file_format each time it is called. When there are no more writers, get_next_file_format() returns B_BAD_INDEX.

Now that we have a writer (described by 'mfi'), we can find the encoders compatible with it by using get_next_encoder(). As I said above, there are three versions of get_next_encoder(); dump_info() uses this one:
status_t get_next_encoder(int32 *cookie,
                          const media_file_format *mfi,
                          const media_format *input_format,
                          media_format *output_format,
                          media_codec_info *ei);
This function works like get_next_file_format(), except that there's the option of restricting the results to match a certain media_file_format and media_format. mfi (the parameter) is the file format we want to match, input_format is the format of the media we have, output_format will be set by the encoder, and ei describes the encoder. When I say that output_format is set by the encoder, I mean: given the raw media described by input_format, the encoder will translate it into data described by output_format. It is output_format-type data that will actually be written to the file, so the writer must accept this format.
=== Sidebar: media_format pitfalls ===
For the purposes of dump_info(), we don't care too much about the media_format parameters, but we still need to specify them to differentiate between audio and video. And now for the cardinal media_format rule:

** Don't zero-out a media_format **

As with all C++ structs, media_format is a class in disguise; it just flaunts it more than usual. It has a constructor and destructor as well as some private data members, and thus a memset(&mf, 0, sizeof(media_format)) could potentially wipe out some important data.
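To see what's at stake, here's a toy stand-in of my own devising (the real media_format's private layout isn't public): hidden bookkeeping sits alongside a public union, so zeroing the whole struct would clobber the hidden part, while zeroing only the union is harmless:

```cpp
#include <cassert>
#include <cstring>

// Toy stand-in for media_format, not the real struct: private state next to
// a public union of per-type format blocks.
class Format {
public:
    int type;
    union {
        struct { int sample_rate; } raw_audio;
        struct { int width, height; } raw_video;
    } u;

    Format() : type(0), magic(0x42) { std::memset(&u, 0, sizeof(u)); }
    int Magic() const { return magic; }   // stands in for the private data

private:
    int magic;   // memset(&f, 0, sizeof(f)) would clobber this
};
```

Clearing just f.u resets the format description but leaves the private state (here, magic) intact.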
This is a known problem, and there are ways around it. When creating a media_format, the constructor zeros out the public bits for you. If you'd like to clear a media_format you've already used, follow the example of dump_info() by assigning the appropriate union member to the appropriate wildcard, or do a memset(&mf.u, 0, sizeof(mf.u)), which will zero out the whole union regardless of the media type. And in the same vein:
** Don't use malloc() for dynamic media_formats **
Say you want an array of media_formats. If you were to

media_format *mf = (media_format*)malloc(10 * sizeof(media_format));

then the aforementioned constructor would never get called. Instead, use the C++ construct

media_format *mf = new media_format[10];

and be sure to delete [] mf; when done. There probably is some tricksy C++ way to use a constructor on the results of malloc(), but I'm not filthy-minded enough to read that far into my Stroustrup.
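A quick demonstration with a toy class of my own (the real media_format constructor does analogous setup on its private data): new[] runs the constructor on every element, while memory straight from the allocator never sees it:

```cpp
#include <cassert>
#include <cstdlib>

// Toy class standing in for media_format: the constructor sets a flag the
// way media_format's constructor initializes its members.
struct Tracked {
    int initialized;
    Tracked() : initialized(1) {}
};

int count_initialized(const Tracked *array, int n)
{
    int count = 0;
    for (int i = 0; i < n; i++)
        if (array[i].initialized == 1)
            count++;
    return count;
}
```

An array from new Tracked[10] reports all ten elements initialized; ten elements' worth of calloc'd memory (used here instead of malloc so the contents are predictably zero) reports none, because no constructor ever ran.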
All right. So, to get the video encoders we pass in a blank media_format with the type set to B_MEDIA_RAW_VIDEO, and to get the audio encoders we pass in a blank media_format with the type set to B_MEDIA_RAW_AUDIO. get_next_encoder() will only return those encoders that can encode input_format and whose results can be written to media_file_format mfi. From there, we can display the pretty table produced by tcode -info with a minimum of fuss.
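The overall shape of that loop can be mocked in plain C++. Everything below (the status codes, the writer table, get_next_writer()) is a stand-in I made up to illustrate the cookie idiom; it is not the Media Kit API:

```cpp
#include <cassert>
#include <string>

// Mock of the cookie-based iteration used by get_next_file_format():
// start the cookie at zero and call until the function reports failure.
enum mock_status { MOCK_OK = 0, MOCK_BAD_INDEX = -1 };

static const char *kWriters[] = { "aiff", "avi", "mp3", "quicktime", "wav" };
const int kWriterCount = sizeof(kWriters) / sizeof(kWriters[0]);

mock_status get_next_writer(int *cookie, const char **name)
{
    if (*cookie >= kWriterCount)
        return MOCK_BAD_INDEX;      // no more writers
    *name = kWriters[(*cookie)++];
    return MOCK_OK;
}

std::string dump_writers()
{
    std::string out;
    int cookie = 0;                 // the cookie must start at zero
    const char *name;
    while (get_next_writer(&cookie, &name) == MOCK_OK) {
        out += name;
        out += '\n';
    }
    return out;
}
```

dump_info() nests a second loop of the same shape inside this one, with its own cookie, to list the encoders compatible with each writer.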
With the upcoming release of the BeOS comes a brand spanking-new Midi Kit -- actually, despite the way this kit is traditionally capitalized, that's not "Le Midi," as in Haydn's Symphony #7, but rather MIDI, the link between you and those synthesizers, drum machines, and other sound-spewing instruments with 5-pin DIN connections that you may have tucked into the corners of various rooms in your household. A little while back I wrought two pieces of sample code that demonstrate the abilities of the new MIDI Kit, and I more or less concealed them from the world at large, in the perpetual gloom of the private DTS FTP site. Now, the time has arrived for these pieces of sample code to see the light of day. Some of you have already seen this code—so, to keep your interest, I'll even describe how the new Midi Kit works.
The first and most important point to note is that BMidiConsumer and BMidiProducer in the Midi Kit are NOT directly analogous to BBufferConsumer and BBufferProducer in the Media Kit! In the Media Kit, consumers and producers are the data consuming and producing properties of a media node. A filter in the Media Kit, therefore, inherits from both BBufferConsumer and BBufferProducer, and implements their virtual member functions to do its work. In the Midi Kit, consumers and producers act as endpoints of MIDI data connections, much as media_source and media_destination do in the Media Kit. Thus, a MIDI filter does not derive from BMidiConsumer and BMidiProducer; instead, it contains BMidiConsumer and BMidiProducer objects for each of its distinct endpoints that connect to other MIDI objects. This also contrasts with the old Midi Kit's conception of a BMidi object, which stood for an object that both received and sent MIDI data. In the new Midi Kit, the endpoints of MIDI connections are all that matters; what lies between the endpoints, i.e., how a MIDI filter is actually structured, is entirely at your discretion.
The second thing to note about the Midi Kit is the distinction between remote and local MIDI objects. If you look at BMidiConsumer, for example, you'll see that there just ain't that much there in terms of implementation API. On the other hand, just below the declaration for BMidiConsumer in MidiConsumer.h lies the declaration for BMidiLocalConsumer, which has all the hook functions that you'd expect to see in a consumer object. This dualism comes from the way that the Midi Roster deals with remote objects. All MIDI endpoints that you create derive from BMidiLocalConsumer and BMidiLocalProducer, and their member functions work the way you'd expect. In order to hide the details of communication with MIDI endpoints in other applications, however, the Midi Kit must hide the details of how a particular endpoint is implemented. Thus, the Midi Roster only gives you access to BMidiEndpoints, BMidiConsumers, and BMidiProducers, so these are the classes you'll be working with when you want access to MIDI objects in other applications. So, what can you do with remote objects? Only what BMidiConsumer, BMidiProducer, and BMidiEndpoint will let you do. You can connect objects, get properties of these objects, and that's about it.
The final thing I want to note about the Midi Kit is the reference counting scheme. Each MIDI endpoint has a reference count associated with it, so that the bookkeeping associated with the endpoints is correct. When you construct an endpoint, it starts with a count of 1. Once the count hits 0, the endpoint will be deleted. This means that, to delete an endpoint, you don't call the delete operator directly; instead, you call BMidiEndpoint::Release(). To balance this call, there's also a BMidiEndpoint::Acquire(), in case you have two disparate parts of your application working with the endpoint, and you don't want to have to keep track of who needs to Release() the endpoint.
The trick in reference counting that trips up some people is that the Midi Roster increments the reference count of any object it hands to you as a result of NextEndpoint() or FindEndpoint(). So, when you're done with any object the Midi Roster gives you, you must Release() it! This lets you treat remote objects and local objects the same way with respect to reference counting. Repeat after me: Release() when you're done.
I remember once being forced to sit at a piano in one of my music classes and transpose Bach's Prelude in C major (BWV 846) to F# major on the fly in front of the class. Mathematically, this reduces to a trivial mapping: just raise each note by six half-steps. Doing this while playing, however, is considerably less trivial. As a firm believer in the "work smarter, not harder" ethic, therefore, I commend the following utility to your attention for when you're called upon to transpose music on a MIDI instrument:
<ftp://ftp.be.com/pub/samples/midi_kit/Transposer.zip>
This is a MIDI filter that transposes incoming events by a variable amount and sends them out the output—about as simple a filter as there is.
The base class that I use in Transposer to represent a MIDI filter (and a class that I hope will be useful for you as well!) is called SimpleMidiFilter. It contains exactly one BMidiLocalConsumer and one BMidiLocalProducer. When data arrives at the input, one of a number of filter functions gets called in the filter class, analogous to the filter operation in BMessageFilter. You override any of these functions to handle a specific kind of event however you want, sending events to the output if necessary, and return a code to indicate whether the original event should be passed along to the output or not. If you don't override the functions, they return the default action (which is to send the original event to the output without modification, though you can specify a different default action if you want).
Note that the consumer endpoint of a SimpleMidiFilter derives from BMidiLocalConsumer, whereas the producer endpoint is simply a BMidiLocalProducer. This is a common configuration, because consumers work by overriding the event hooks to do work when MIDI data arrives (in the case of SimpleMidiFilter, the consumer needs to call SimpleMidiFilter hook functions), whereas producers work by sending an event when you call their member functions. You should rarely need to derive from BMidiLocalProducer (unless you need to know when the producer gets connected or disconnected, perhaps), but you'll always have to override one or more of BMidiLocalConsumer's member functions to do something useful with incoming data.
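The override-a-hook pattern can be sketched in plain C++. This is my own simplification (with invented names like NoteFilter and NoteTransposer), not the actual SimpleMidiFilter API: the base class passes events through by default, and a subclass overrides only the hooks it cares about:

```cpp
#include <cassert>
#include <vector>

enum filter_result { PASS_EVENT, DROP_EVENT };

class NoteFilter {
public:
    virtual ~NoteFilter() {}
    std::vector<int> output;        // stands in for the producer endpoint

    void NoteReceived(int note)     // stands in for the consumer's hook
    {
        if (NoteOn(note) == PASS_EVENT)
            output.push_back(note); // default action: pass it through
    }
protected:
    // Subclasses may modify the note in place and decide its fate.
    virtual filter_result NoteOn(int & /*note*/) { return PASS_EVENT; }
};

// A transposer overrides one hook, shifts the note, and lets it through.
class NoteTransposer : public NoteFilter {
public:
    explicit NoteTransposer(int halfSteps) : fHalfSteps(halfSteps) {}
protected:
    virtual filter_result NoteOn(int &note)
    {
        note += fHalfSteps;
        return PASS_EVENT;
    }
private:
    int fHalfSteps;
};
```

A NoteTransposer built with an interval of 6 turns middle C (note 60) into F# (note 66); an unmodified NoteFilter passes 60 through untouched.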
The second piece of code I've promised you is a handy tool for browsing the MIDI objects in your system and connecting them together. It, too, is now available at participating FTP sites:
<ftp://ftp.be.com/pub/samples/midi_kit/PatchBay.zip>
The interface for PatchBay is crude yet effective. Along the top is a row of icons representing the consumers in your system; along the left-hand side is a row of icons representing the producers in your system, and a wee meter that gives you an idea of the number of events emanating from that producer. Move your mouse over an icon, and a tool tip pops up to give you more information about the object, like the ID and name of the object.
Between these icons is a matrix of check boxes that represent connections between the MIDI objects in your system. Click on a check box to connect two MIDI objects together (such as running Dokken guitar riffs from your MIDI port into a Transposer instance). Click on the check box again to disconnect the two objects.
The magic behind the implementation of PatchBay comes from the BMidiRoster::StartWatching() function. You call this function and specify a BMessenger that you want to receive MIDI notifications (in this case, the messenger points to our top-level view, the PatchView). When you start watching, the BMidiRoster starts by sending you notifications for all currently registered nodes and all the current connections between objects. Thereafter, you'll receive notifications any time something important happens to an object (objects registered/connected, objects disconnected/unregistered, properties changing, etc.). PatchBay uses these notifications to maintain an up-to-date view of the state of MIDI objects in your system.
Some of you may be wondering where the icons come from. Well, there's a mechanism in the new Midi Kit for storing properties associated with an endpoint in a BMessage. This can include any kind of information that might be useful to associate with a MIDI object. If you look in Transposer, you'll see that it stores a large and mini icon as properties in the BMessage. Feel free to use properties however you want; if you come up with something particularly useful, let us know!