Imagine yourself in a completely dark room. You can't see anything, but you know other people are around you, because you can hear them talking. Occasionally, two people try to talk at the same time. This is followed by a short pause, and then the talking begins again. The conversations you hear are bizarre, almost nonsensical, like this:
"Hank! This is Howard! I have some data for you: #$@#$@%@"
"Joe! This is Al! What time is it?"
"Howard! This is Hank! I acknowledge data #$@#$@%@".
"Al! This is Joe! The time is January 3, 2000, 14:45 hours"
"Everyone! This is Jason! I'm here!"
This is one way I picture the network: A noisy dark room, with lots of shouting.
Our task for today is finding out who's in this room with you.
There are many applications that need a list of the people in this dark room. The MS Windows Network Neighborhood and the Macintosh Chooser are well known examples. They build a list of computers that are providing some kind of resource to other computers. In this article, I'll talk through one approach for network discovery, and you can find the source code for an implementation here: <ftp://ftp.be.com/pub/samples/network_kit/netdisc.zip>
Let's close our eyes, and put ourselves mentally in the dark room...
The first thing we'll do is to announce our presence. The last thing we'll do is to announce that we're leaving. Between our entrance and egress, we'll announce our existence every so often; if we suddenly drop off the network, everyone else will eventually notice. This is quite literally a heartbeat: if the others don't hear it, they assume we have died. If we announce too often, we waste bandwidth; too infrequently, and we'll linger in the other members' lists long after we die. If we announce every 5 minutes and our announcement message is 100 bytes long, our protocol consumes about 0.33 bytes per second per computer, which is entirely acceptable.
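To make the heartbeat concrete, here's a small sketch. The names, constants, and C++ rendering here are mine, not from the netdisc sample code; it just shows how a member might price the protocol and decide when a silent peer should be presumed dead:

```cpp
#include <cassert>

// Sketch only: constants and names are my own, not from the sample code.
const double kAnnounceInterval = 300.0;   // announce every 5 minutes
const int kMissedBeatsAllowed = 2;        // tolerate two lost announcements

// A peer is presumed dead once we've missed a few consecutive heartbeats.
bool is_alive(double now, double last_heard)
{
    return (now - last_heard) <= kAnnounceInterval * (kMissedBeatsAllowed + 1);
}

// Steady-state bandwidth cost of the heartbeat, per computer.
double announce_cost(double packet_bytes)
{
    return packet_bytes / kAnnounceInterval;   // bytes per second
}
```

Allowing a couple of missed beats matters in practice: a single dropped broadcast packet shouldn't make everyone declare you dead.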
After we announce our presence, we'd like to fulfill our purpose and find out who else is around. We could wait a while, and eventually we'd hear everyone announce themselves. You probably don't want to wait that long. And, we need to remember that we may be the only one present, so we may never hear from anyone.
A naive way is to ask everyone to announce themselves at once. Obviously, this is hard on the network and, given the choice, it'd be easier if someone just handed us a list of computers, rather than trying to compile one ourselves. If we had a designated "special" computer, that we knew of beforehand, we could just ask it. This is a centralized solution, with all the normal centralized problems: What if your special computer is down? What if it can't deal with the barrage of requests it gets?
We could have the special computer chosen dynamically, through some type of election protocol. This is, in my opinion, worse. Election protocols are hard to implement, and harder to debug. This is not the answer either. What's left is a solution that doesn't depend on a single computer, whether statically or dynamically specified as "special."
Let me convince you that you already know the solution to this problem. Imagine, if you will, that you're at a party, conversing with friends. You're feeling comfortable, and the conversation is easy. Suddenly, a person, whom none of you know, walks up to your group, and without pausing to listen to the conversation, blurts out loudly, "What are you talking about?" (Are you picturing this? Concentrate.) Now, what happens?
In my experience (which I hope crosses cultural boundaries), there is a pregnant pause, as you and your friends try to decide whether to ignore the impolite intrusion. Then, someone in your group hesitantly offers the topic of conversation. This is our solution! In my analysis, here's what happened. Immediately after the outburst, everyone stops talking. There's a small amount of time when the members of your group look at each other, waiting to see if someone else says something, until finally someone does. Depending on how sociable your friends are, someone may answer quickly, or no one may answer at all. If no one answers, and if the annoying person wants the answer badly, they'll repeat themselves. Got it?
So, first you announce your presence on the network. Then, you broadcast a packet that means, "Who's there?" Every member individually decides whether or not they'll be the one to reply. Before they reply, they'll wait a bit to see if someone else has already done so. If someone else replies, there's no need to transmit. [An issue with this bit of the protocol is how to deal with other members who have just come up. Check the source code.]
How do the members decide whether they'll reply? Looking back at the party example, if your group is fairly large, there's a reasonable expectation that someone (and not you) will answer first. If your group is small, then there's a greater chance that you would have to answer.
So, whether you answer depends on the size of your group. Practically, you could decide that you have a 1/N chance of responding, where N is the size of the group. Or, you could increase that (by some constant factor) to ensure that a timely reply is made. If the "Who's there?" question is repeated, then you know that no one replied, and you increase your chance of responding. Eventually, the questioner will get a reply, which will be the list of current members.
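The decision rule above can be sketched as a tiny function. This formulation (including the constant factor k) is my own illustration, not the one in the sample code:

```cpp
#include <algorithm>
#include <cassert>

// Sketch only: each member replies with probability roughly k/N, where N is
// the group size, and the odds rise each time the "Who's there?" question
// goes unanswered (ask_count is how many times it has been asked).
double reply_probability(int group_size, int ask_count, double k = 2.0)
{
    if (group_size <= 1)
        return 1.0;                 // we may be the only one who can answer
    double p = k * ask_count / group_size;
    return std::min(1.0, p);        // never more than certainty
}
```

With k = 2, a timely reply is likely on the first ask even in a large group, and repeated asks push every member's odds toward 1.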
In my work with the media codecs, I've found the tcode utility to be an invaluable tool. You may have seen it sitting around in /boot/optional/sample-code/media_kit/MediaFile/transcode/tcode.cpp on your R4.5 CD, and you may have even played with it. In any case, I've got a new version for you here. Although it should compile and run on R4.5, it won't do you much good (except as an example of things to come) as the Maui codec add-ons have changed a lot.
<ftp://ftp.be.com/pub/samples/media_kit/transcode_r5.zip>
tcode [-info] [-start frame-num] [-end frame-num] [-avi|-qt|-quicktime|-wav|-aiff|-mp3] [-v <encoder_name>] [-a <encoder_name>] <filename> [<output>]
But what does it do? tcode takes an input media file (e.g., QuickTime) with some sort of media encoding (e.g., IMA audio, Cinepak video) and translates (transcodes) it into a possibly different format/encoding (e.g., AVI file, raw audio, Indeo5 video). Of course, the proper encoders/decoders (codecs) need to be installed on your system to do so, and tcode can help you there as well with its -info switch. Here's what I have on my system:
$ tcode -info
Audio IFF format (AIFF) (AIFF, id 788608)
  Video Encoders:
  Audio Encoders:
    Raw Audio / raw-audio (9)
AVI File Format (avi, id 788609)
  Video Encoders:
    Cinepak Compression / cinepak (0)
    DV Compression by Canopus Inc. / dv (1)
    Indeo5 Video / iv50 (3)
    Photo-JPEG Compression / pjpeg (8)
    Raw Video / raw-video (9)
  Audio Encoders:
    Raw Audio / raw-audio (9)
MP3 Audio File Format (mp3, id 788610)
  Video Encoders:
  Audio Encoders:
    BladeEnc MP3 Encoder / mp3 (6)
QuickTime File Format (quicktime, id 787264)
  Video Encoders:
    Cinepak Compression / cinepak (0)
    DV Compression by Canopus Inc. / dv (1)
    Indeo5 Video / iv50 (3)
    Photo-JPEG Compression / pjpeg (8)
    Raw Video / raw-video (9)
  Audio Encoders:
    IMA Adaptive PCM / ima4 (2)
    Microsoft Adaptive PCM / ms-adpcm (7)
    Raw Audio / raw-audio (9)
RIFF Audio File Format (WAV) (wav, id 788705)
  Video Encoders:
  Audio Encoders:
    Microsoft Adaptive PCM / ms-adpcm (7)
    Raw Audio / raw-audio (9)
At the outer level are the supported file formats, and at the inner level are the installed encoders. Each format/encoder has a pretty name (e.g., "AVI File Format") and a short name (e.g., "avi"). tcode refers to the add-ons using their short names. The add-ons themselves live in /boot/beos/system/add-ons/media/{decoders,encoders,extractors,writers} or in a similar place under /boot/home/config/add-ons, so the extra-curious can poke around in there.
Using the above example where we have a QuickTime file csquares.mov and want to translate it into a raw-audio/Indeo5 AVI file, the command line is

tcode -avi -a raw-audio -v iv50 csquares.mov csquares.avi

which, after some churning, will produce the new AVI file csquares.avi. You can open it with MediaPlayer and it'll look and sound just like the original, except that it's now in an entirely different format. Crazy.
Note that tcode -info only displays a list of the installed encoders/writers. BeOS can read more formats than it can write, so tcode could take an MPEG1 or .au file as input and create a QuickTime or AIFF file as output. Looking at the installed add-ons will give you a good idea of what tcode can accept as input.
Another important point is that not all file formats can accept all encodings. Our AVI writer, for example, only accepts raw audio, so if you want encoded audio with a movie you need to use QuickTime. And even then, you're stuck with IMA-encoded audio if you want to be able to play the movie on another platform.
Speaking of which, tcode allows you to create some format/encoding combinations that will play on BeOS but won't play on Windows or MacOS. For instance, to save space I tcoded a raw-audio/Cinepak AVI into an IMA/Indeo5 QuickTime, but didn't have any luck playing it under Win95.
You'll also notice the -start and -end switches to tcode. They allow you to specify the starting and ending video frames of the transcode, in case you'd like the output to be a clip of the input. These aren't as intuitive as they may seem, but they get the job done. First, they'll only work if the input file has a video stream and if you've specified the -v option on the command line. Second, audio is only handled properly if you've specified the -a option.
If you'd like to keep the encodings the same and simply produce a shortened file, you'll still need to specify the encodings explicitly. This is necessary because tcode needs to decode and re-encode the video stream in order to find the specified frames when it's dealing with keyframe-based encodings like MPEG1 and Cinepak. Still, it beats the built-in Linux video editing suite [dd(1)] hands-down.
It would be possible to avoid this step for encodings with no inter-frame dependencies (DV, for example, where every frame stands alone), but the simplicity of the transcode loop was too nice to ruin with special cases (i.e., I was lazy). Of course, tcode could be modified to do all of this format duplication internally, but that's exercise-for-the-reader stuff.
Writers: Add-ons that know how to generate a specific file format. We have writers for AIFF, AVI, QuickTime, and WAV.

Extractors: Add-ons that know how to parse a specific file format. We have extractors for AIFF, AU, AVI, AVR, DV, MPEG1, QuickTime, and WAV.

Encoders: Add-ons that can convert raw audio or video into encoded data. We have encoders for Cinepak, DV, IMA, Indeo5, mp3, MS-ADPCM, and Photo-JPEG. There is also a null encoder for raw data; it's good for avoiding special cases in some encode loops.

Decoders: Add-ons that can convert encoded media data into raw audio or video. We have decoders for Apple Video, CCITT-ADPCM, Cinepak, DV (audio and video), IMA, Indeo5, MPEG1 (audio and video), MS-ADPCM, MS RLE, MS Video, Photo-JPEG, and ulaw. There is also a null decoder for raw data.
BMediaFile: Represents a file containing media, either for reading or writing. Media streams within the file are accessed through BMediaTrack objects.

BMediaTrack: Represents a stream of media data, usually audio or video; used for both reading and writing.
media_format: The most-used (and -abused) beast in the Media Kit. This struct and its ilk describe the format of a media stream. At the top level is the 'type' field, which is usually set to one of B_MEDIA_RAW_AUDIO or B_MEDIA_RAW_VIDEO.
media_file_format: Describes a file format (writer) such as AVI or QuickTime. It contains various names for the file type as well as some minimal indications of what can be written to the file.
media_codec_info: Describes a codec (encoder/decoder) such as Indeo5 or mp3. It contains pretty and short names for the codec as well as an ID pointing to the encoder/decoder add-on it represents.
get_next_file_format(): Walks through the installed writers, filling in the media_file_format each time. To find a specific writer, check one of the name fields.
get_next_encoder(): Walks through the installed encoders that match the passed-in parameters, filling in the media_codec_info each time. You can specify the formats you'd like it to translate from/to, and which file format you'd like to write to. There are three different versions of this function, so choose wisely. In general, the more parameters you specify, the stricter the search. See the header file for specifics.
**Note: Although I have said "audio or video" in several places, the API does not restrict the data to these types. If you can dream up some sort of time/frame-based media, the API can handle it.
**Also Note: Decoders/Encoders do not necessarily need to translate to/from raw audio/video, but that's what they all currently do. There's nothing stopping a Decoder from translating MPEG to Cinepak, but most applications wouldn't be able to make much use of it; they tend to deal with raw media formats.
**Note as well, kind readers: Someday you'll all be able to write your own extractors/writers/encoders/decoders, but that day is not today. The internal codec API is still in flux and is not yet public.
With that out of the way, we can look at tcode.cpp. Just to get the hang of things, we'll start out by seeing how tcode knows which encoders/writers are installed on your system when called with the '-info' switch. This is handled in dump_info(), and it's time for you to be amazed at how short it is. Notice that it uses all the structures and functions I named above, so it's a good place to start.

At a glance, you can easily see what dump_info() is doing:
for each writer
dump video encoders compatible with this file format
dump audio encoders compatible with this file format
The "for each writer" part is accomplished using get_next_file_format(). Assuming the cookie variable is initialized to zero, get_next_file_format() will return a different media_file_format each time it is called. When there are no more writers, get_next_file_format() returns B_BAD_INDEX.

Now that we have a writer (described by 'mfi'), we can find the encoders compatible with it by using get_next_encoder(). As I said above, there are three versions of get_next_encoder(); dump_info() uses this one:
status_t get_next_encoder(int32 *cookie,
                          const media_file_format *mfi,
                          const media_format *input_format,
                          media_format *output_format,
                          media_codec_info *ei);
This function works like get_next_file_format(), except that there's the option of restricting the results to match a certain media_file_format and media_format. mfi (the parameter) is the file format we want to match, input_format is the format of the media we have, output_format will be set by the encoder, and ei describes the encoder. When I say that output_format is set by the encoder, I mean: given the raw media described by input_format, the encoder will translate it into data described by output_format. It is output_format-type data that will actually be written to the file, so the writer must accept this format.
=== Sidebar: media_format pitfalls ===
For the purposes of dump_info(), we don't care too much about the media_format parameters, but we still need to specify them to differentiate between audio and video. And now for the cardinal media_format rule:

** Don't zero-out a media_format **

As with all C++ structs, media_format is a class in disguise; it just flaunts it more than usual. It has a constructor and destructor as well as some private data members, and thus a memset(&mf, 0, sizeof(media_format)) could potentially wipe out some important data.
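To see what's at stake, here's a toy stand-in of my own devising (the real media_format's private layout isn't public): hidden bookkeeping sits alongside a public union, so zeroing the whole struct would clobber the hidden part, while zeroing only the union is harmless:

```cpp
#include <cassert>
#include <cstring>

// Toy stand-in for media_format, not the real struct: private state next to
// a public union of per-type format blocks.
class Format {
public:
    int type;
    union {
        struct { int sample_rate; } raw_audio;
        struct { int width, height; } raw_video;
    } u;

    Format() : type(0), magic(0x42) { std::memset(&u, 0, sizeof(u)); }
    int Magic() const { return magic; }   // stands in for the private data

private:
    int magic;   // memset(&f, 0, sizeof(f)) would clobber this
};
```

Clearing just f.u resets the format description but leaves the private state (here, magic) intact.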
This is a known problem, and there are ways around it. When creating a media_format, the constructor zeros out the public bits for you. If you'd like to clear a media_format you've already used, follow the example of dump_info() by assigning the appropriate union member to the appropriate wildcard, or do a memset(&mf.u, 0, sizeof(mf.u)), which will zero out the whole union regardless of the media type. And in the same vein:
** Don't use malloc() for dynamic media_formats **
Say you want an array of media_formats. If you were to

media_format *mf = (media_format*)malloc(10 * sizeof(media_format));

then the aforementioned constructor would never get called. Instead, use the C++ construct

media_format *mf = new media_format[10];

and be sure to delete [] mf; when done. There probably is some tricksy C++ way to use a constructor on the results of malloc(), but I'm not filthy-minded enough to read that far into my Stroustrup.
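A quick demonstration with a toy class of my own (the real media_format constructor does analogous setup on its private data): new[] runs the constructor on every element, while memory straight from the allocator never sees it:

```cpp
#include <cassert>
#include <cstdlib>

// Toy class standing in for media_format: the constructor sets a flag the
// way media_format's constructor initializes its members.
struct Tracked {
    int initialized;
    Tracked() : initialized(1) {}
};

int count_initialized(const Tracked *array, int n)
{
    int count = 0;
    for (int i = 0; i < n; i++)
        if (array[i].initialized == 1)
            count++;
    return count;
}
```

An array from new Tracked[10] reports all ten elements initialized; ten elements' worth of calloc'd memory (used here instead of malloc so the contents are predictably zero) reports none, because no constructor ever ran.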
All right. So, to get the video encoders we pass in a blank media_format with the type set to B_MEDIA_RAW_VIDEO, and to get the audio encoders we pass in a blank media_format with the type set to B_MEDIA_RAW_AUDIO. get_next_encoder() will only return those encoders that can encode input_format and whose results can be written to media_file_format mfi. From there, we can display the pretty table produced by tcode -info with a minimum of fuss.
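The overall shape of that loop can be mocked in plain C++. Everything below (the status codes, the writer table, get_next_writer()) is a stand-in I made up to illustrate the cookie idiom; it is not the Media Kit API:

```cpp
#include <cassert>
#include <string>

// Mock of the cookie-based iteration used by get_next_file_format():
// start the cookie at zero and call until the function reports failure.
enum mock_status { MOCK_OK = 0, MOCK_BAD_INDEX = -1 };

static const char *kWriters[] = { "aiff", "avi", "mp3", "quicktime", "wav" };
const int kWriterCount = sizeof(kWriters) / sizeof(kWriters[0]);

mock_status get_next_writer(int *cookie, const char **name)
{
    if (*cookie >= kWriterCount)
        return MOCK_BAD_INDEX;      // no more writers
    *name = kWriters[(*cookie)++];
    return MOCK_OK;
}

std::string dump_writers()
{
    std::string out;
    int cookie = 0;                 // the cookie must start at zero
    const char *name;
    while (get_next_writer(&cookie, &name) == MOCK_OK) {
        out += name;
        out += '\n';
    }
    return out;
}
```

dump_info() nests a second loop of the same shape inside this one, with its own cookie, to list the encoders compatible with each writer.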
With the upcoming release of the BeOS comes a brand spanking-new Midi Kit -- actually, despite the way this kit is traditionally capitalized, that's not "Le Midi," as in Haydn's Symphony #7, but rather MIDI, the link between you and those synthesizers, drum machines, and other sound-spewing instruments with 5-pin DIN connections that you may have tucked into the corners of various rooms in your household. A little while back I wrought two pieces of sample code that demonstrate the abilities of the new MIDI Kit, and I more or less concealed them from the world at large, in the perpetual gloom of the private DTS FTP site. Now, the time has arrived for these pieces of sample code to see the light of day. Some of you have already seen this code—so, to keep your interest, I'll even describe how the new Midi Kit works.
The first and most important point to note is that BMidiConsumer and BMidiProducer in the Midi Kit are NOT directly analogous to BBufferConsumer and BBufferProducer in the Media Kit! In the Media Kit, consumers and producers are the data consuming and producing properties of a media node. A filter in the Media Kit, therefore, inherits from both BBufferConsumer and BBufferProducer, and implements their virtual member functions to do its work. In the Midi Kit, consumers and producers act as endpoints of MIDI data connections, much as media_source and media_destination do in the Media Kit. Thus, a MIDI filter does not derive from BMidiConsumer and BMidiProducer; instead, it contains BMidiConsumer and BMidiProducer objects for each of its distinct endpoints that connect to other MIDI objects. This also contrasts with the old Midi Kit's conception of a BMidi object, which stood for an object that both received and sent MIDI data. In the new Midi Kit, the endpoints of MIDI connections are all that matters; what lies between the endpoints, i.e., how a MIDI filter is actually structured, is entirely at your discretion.
The second thing to note about the Midi Kit is the distinction between remote and local MIDI objects. If you look at BMidiConsumer, for example, you'll see that there just ain't that much there in terms of implementation API. On the other hand, just below the declaration for BMidiConsumer in MidiConsumer.h lies the declaration for BMidiLocalConsumer, which has all the hook functions that you'd expect to see in a consumer object. This dualism comes from the way that the Midi Roster deals with remote objects. All MIDI endpoints that you create derive from BMidiLocalConsumer and BMidiLocalProducer, and their member functions work the way you'd expect. In order to hide the details of communication with MIDI endpoints in other applications, however, the Midi Kit must hide the details of how a particular endpoint is implemented. Thus, the Midi Roster only gives you access to BMidiEndpoints, BMidiConsumers, and BMidiProducers, so these are the classes you'll be working with when you want access to MIDI objects in other applications. So, what can you do with remote objects? Only what BMidiConsumer, BMidiProducer, and BMidiEndpoint will let you do. You can connect objects, get properties of these objects, and that's about it.
The final thing I want to note about the Midi Kit is the reference counting scheme. Each MIDI endpoint has a reference count associated with it, so that the bookkeeping associated with the endpoints is correct. When you construct an endpoint, it starts with a count of 1. Once the count hits 0, the endpoint will be deleted. This means that, to delete an endpoint, you don't call the delete operator directly; instead, you call BMidiEndpoint::Release(). To balance this call, there's also a BMidiEndpoint::Acquire(), in case you have two disparate parts of your application working with the endpoint, and you don't want to have to keep track of who needs to Release() the endpoint.
The trick in reference counting that trips up some people is that the Midi Roster increments the reference count of any object it hands to you as a result of NextEndpoint() or FindEndpoint(). So, when you're done with any object the Midi Roster gives you, you must Release() it! This lets you treat remote objects and local objects the same way with respect to reference counting. Repeat after me: Release() when you're done.
I remember once being forced to sit at a piano in one of my music classes and transpose Bach's Prelude in C major (BWV 846) to F# major on the fly in front of the class. Mathematically, this reduces to a trivial mapping: just raise each note by six half-steps. Doing this while playing, however, is considerably less trivial. As a firm believer in the "work smarter, not harder" ethic, therefore, I commend the following utility to your attention for when you're called upon to transpose music on a MIDI instrument:
<ftp://ftp.be.com/pub/samples/midi_kit/Transposer.zip>
This is a MIDI filter that transposes incoming events by a variable amount and sends them out the output—about as simple a filter as there is.
The base class that I use in Transposer to represent a MIDI filter (and a class that I hope will be useful for you as well!) is called SimpleMidiFilter. It contains exactly one BMidiLocalConsumer and one BMidiLocalProducer. When data arrives at the input, one of a number of filter functions gets called in the filter class, analogous to the filter operation in BMessageFilter. You override any of these functions to handle a specific kind of event however you want, sending events to the output if necessary, and return a code to indicate whether the original event should be passed along to the output or not. If you don't override the functions, they return the default action (which is to send the original event to the output without modification, though you can specify a different default action if you want).
Note that the consumer endpoint of a SimpleMidiFilter derives from BMidiLocalConsumer, whereas the producer endpoint is simply a BMidiLocalProducer. This is a common configuration, because consumers work by overriding the event hooks to do work when MIDI data arrives (in the case of SimpleMidiFilter, the consumer needs to call SimpleMidiFilter hook functions), whereas producers work by sending an event when you call their member functions. You should rarely need to derive from BMidiLocalProducer (unless you need to know when the producer gets connected or disconnected, perhaps), but you'll always have to override one or more of BMidiLocalConsumer's member functions to do something useful with incoming data.
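The override-a-hook pattern can be sketched in plain C++. This is my own simplification (with invented names like NoteFilter and NoteTransposer), not the actual SimpleMidiFilter API: the base class passes events through by default, and a subclass overrides only the hooks it cares about:

```cpp
#include <cassert>
#include <vector>

enum filter_result { PASS_EVENT, DROP_EVENT };

class NoteFilter {
public:
    virtual ~NoteFilter() {}
    std::vector<int> output;        // stands in for the producer endpoint

    void NoteReceived(int note)     // stands in for the consumer's hook
    {
        if (NoteOn(note) == PASS_EVENT)
            output.push_back(note); // default action: pass it through
    }
protected:
    // Subclasses may modify the note in place and decide its fate.
    virtual filter_result NoteOn(int & /*note*/) { return PASS_EVENT; }
};

// A transposer overrides one hook, shifts the note, and lets it through.
class NoteTransposer : public NoteFilter {
public:
    explicit NoteTransposer(int halfSteps) : fHalfSteps(halfSteps) {}
protected:
    virtual filter_result NoteOn(int &note)
    {
        note += fHalfSteps;
        return PASS_EVENT;
    }
private:
    int fHalfSteps;
};
```

A NoteTransposer built with an interval of 6 turns middle C (note 60) into F# (note 66); an unmodified NoteFilter passes 60 through untouched.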
The second piece of code I've promised you is a handy tool for browsing the MIDI objects in your system and connecting them together. It, too, is now available at participating FTP sites:
<ftp://ftp.be.com/pub/samples/midi_kit/PatchBay.zip>
The interface for PatchBay is crude yet effective. Along the top is a row of icons representing the consumers in your system; along the left-hand side is a row of icons representing the producers in your system, and a wee meter that gives you an idea of the number of events emanating from that producer. Move your mouse over an icon, and a tool tip pops up to give you more information about the object, like the ID and name of the object.
Between these icons is a matrix of check boxes that represent connections between the MIDI objects in your system. Click on a check box to connect two MIDI objects together (such as running Dokken guitar riffs from your MIDI port into a Transposer instance). Click on the check box again to disconnect the two objects.
The magic behind the implementation of PatchBay comes from the BMidiRoster::StartWatching() function. You call this function and specify a BMessenger that you want to receive MIDI notifications (in this case, the messenger points to our top-level view, the PatchView). When you start watching, the BMidiRoster starts by sending you notifications for all currently registered nodes and all the current connections between objects. Thereafter, you'll receive notifications any time something important happens to an object (objects registered/connected, objects disconnected/unregistered, properties changing, etc.). PatchBay uses these notifications to maintain an up-to-date view of the state of MIDI objects in your system.
Some of you may be wondering where the icons come from. Well, there's a mechanism in the new Midi Kit for storing properties associated with an endpoint in a BMessage. This can include any kind of information that might be useful to associate with a MIDI object. If you look in Transposer, you'll see that it stores a large and mini icon as properties in the BMessage. Feel free to use properties however you want; if you come up with something particularly useful, let us know!