Issue 3-51, December 23, 1998

Be Engineering Insights: QuickerPaint

By George Hoffman

The Bay Area is tied more precariously to the diurnal cycle than any other location in the civilized world. Silicon Valley buzzes with geeks staying up all night, hacking away at a problem to meet a beta release deadline the next morning. Geeks on insanely optimistic production schedules that are somehow met when entire companies pull twenty-hour days for weeks at a time.

Now. Tell me why it is that at 2:00 a.m. on a Tuesday morning, as I sit here writing sample code for all of my loyal readers (you're both still out there, right?), I am unable to order a pizza from any of the lumpty-zillion pizza places in the phone book, because they all stop delivery at 11 p.m.? It's probably the same reason why all movie theaters in the Bay Area run their last show at 10:30 p.m. Huh? Am I the only one who is mystified by this? If I had any brains at all, I'd get out of the software business and open up a chain of 24-hour pizzeria/movie theaters.

Ah, there I go again. You'll have to excuse me. There are times in the development cycle of any product when it just doesn't seem to be coming together. You add feature after feature, fix bug after bug, insert moronic Easter Egg after countless moronic Easter Egg, but the product just doesn't seem to cohere. Its shiny surface, if you will, is not congealing from the smelly, festering pit of your code base. These are dark times for any developer. I have reached such a nadir.

The product, of course, is QuickPaint.

Yes, QuickPaint, the standard BeOS image editor, much like "ed" is the standard UNIX text editor. QuickPaint is the standard by which all other BeOS image editors are judged, and boy, are they all looking good!

Some of you may remember QuickPaint from Be Engineering Insights: Cop-Out, in which I introduced it as sample code. Since then, I've promised sample code demonstrating the new BeOS Release 4 features I wrote about in Be Engineering Insights: That BeOS is one baaad mother-[Shut your mouth!] ...just talking 'bout BeOS.

Could there be a better way to demonstrate these new features than integrating them into the Be Developer Newsletter's own Photoshop-killer, QuickPaint? Probably! But I did it anyway!

Among the new QuickPaint features are support for layers, anti-aliased pen strokes, image import, and several new tools. Here's where to get it:

ftp://ftp.be.com/pub/samples/graphics/QuickPaint.zip

You need R4 to compile and/or run it. The included makefile is strictly bare-bones and doesn't handle dependencies correctly (again, because I am lazy), so if you plan to make small changes you'll probably want to write your own makefile or use a BeIDE project, to be sure that all necessary files get recompiled.

The UI is easy enough to use, although completely un-self-explanatory. Basically, you choose a tool and start drawing. The pen tool is now anti-aliased and thick; other drawing tools also use thick pen sizes. The two new tools are a transparentizing pen (the inverted pen icon) and a "hand" tool to move layers around (that's the big X, because I didn't want to bother coming up with an icon).

To add a layer, hit Alt+L, choose it from the Layers menu, or drag and drop an image onto QuickPaint from the Tracker. You can also choose Add Image from the File menu and that image will be added as another layer. The layer stack appears along the left side of every editor window. To select a layer for drawing, left-click it; its button will be pushed in, indicating the selection. To hide/show a layer, right-click it. To change the order of the layers, left-click and drag the layer button to the new location in the stack. Clickable arrows will appear above or below the stack window if you can scroll to see more layers.

The changes to the QuickPaint sources are evident enough; the code has bloated up to almost 3,000 lines. But in those lines you'll find demonstrations of some of the coolest and most useful new app_server features.

For instance, the new alpha-blending capabilities are used throughout the program. You'll find usage of the "normal" B_ALPHA_OVERLAY blending mode in Layers.cpp (to display the strike-out mark over the buttons of nonvisible layers) and in BitmapDocument.cpp, where the layer composition is done. The "weird" and more expensive B_ALPHA_COMPOSITE mode is demonstrated in action in ToolLib.cpp, where the new anti-aliased pen lives. To see why it's needed, change the mode to B_ALPHA_OVERLAY and draw a bit with the pen. Try drawing different colors on top of one another in the same layer. The effect is interesting, but probably not what you want. B_ALPHA_COMPOSITE preserves all the alpha information -- as you'd expect -- and is designed explicitly for this kind of thing: alpha-compositing images into an offscreen buffer for later alpha overlay.

There's also a bit of explicit manual fondling of alpha data in the "hand" tool: layers that are dragged become translucent so you can see the layers underneath for the duration of the drag. This is done by manually setting the alpha values of pixels -- an easy, safe, and useful technique, as long as you're sure of your pixel formats.
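To make the two blending modes concrete, here's a minimal sketch of the two-step dance. The function names, colors, and pen size are mine, not QuickPaint's, and I'm assuming the layer view is attached to an offscreen BBitmap:

#include <Bitmap.h>
#include <View.h>

// Draw an anti-aliased stroke into a layer's offscreen view.
// B_ALPHA_COMPOSITE computes correct alpha in the destination,
// so the finished layer can itself be alpha-blended later.
void StrokeIntoLayer(BView *layerView, BPoint from, BPoint to)
{
	layerView->SetDrawingMode(B_OP_ALPHA);
	layerView->SetBlendingMode(B_PIXEL_ALPHA, B_ALPHA_COMPOSITE);
	layerView->SetHighColor(200, 40, 40, 255);
	layerView->SetPenSize(8.0);
	layerView->StrokeLine(from, to);
}

// Overlay a finished layer onto the canvas. The cheaper
// B_ALPHA_OVERLAY mode is fine here: we only read the source's
// alpha, and nobody cares what alpha lands in the destination.
void OverlayLayer(BView *canvas, const BBitmap *layer)
{
	canvas->SetDrawingMode(B_OP_ALPHA);
	canvas->SetBlendingMode(B_PIXEL_ALPHA, B_ALPHA_OVERLAY);
	canvas->DrawBitmap(layer, BPoint(0, 0));
}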

Other new APIs are scattered here and there. The new ClipToPicture() API is used in a lightweight way to provide a mask for the strike-out marks on the layer buttons mentioned above. The view transaction methods are used in LayoutViews() in Layers.cpp while laying out the layer views on the matrix buttons. All the interface controls and drawing tools are now asynchronous (i.e., they don't poll, but instead rely on MouseMoved() and MouseUp()). Among the drawing tools, some choose to receive the pointer's movement history and others throw it away, according to the needs of the specific tool.
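The asynchronous tracking pattern is worth a sketch of its own. This is an illustrative skeleton rather than QuickPaint's actual tool code:

#include <View.h>

class ToolView : public BView {
public:
	ToolView(BRect frame)
		: BView(frame, "tool", B_FOLLOW_ALL, B_WILL_DRAW),
		fTracking(false) {}

	virtual void MouseDown(BPoint where)
	{
		// Keep MouseMoved()/MouseUp() coming until the button is
		// released, even if the pointer leaves the view. A pen wants
		// every sample; a tool that only cares about the latest
		// position would pass B_NO_POINTER_HISTORY as the second
		// argument to throw the history away.
		SetMouseEventMask(B_POINTER_EVENTS, 0);
		fTracking = true;
		fLast = where;
	}

	virtual void MouseMoved(BPoint where, uint32 code, const BMessage *drag)
	{
		if (!fTracking)
			return;
		StrokeLine(fLast, where);	// stand-in for the real tool action
		fLast = where;
	}

	virtual void MouseUp(BPoint where)
	{
		fTracking = false;
	}

private:
	bool	fTracking;
	BPoint	fLast;
};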

Beware that there are bugs in some of the translators that improperly set the alpha of pixels read from an imported file. The Be logo in the SampleMedia folder, for instance, loads without problems, but as all the pixels have an alpha of zero, the new layer appears blank. These bugs will be fixed for R4.1. Also fixed for R4.1 will be the conspicuous absence of alpha controls on the BColorSelector; in fact, the colors returned by it are also incorrect and have the alpha set to zero. QuickPaint sets it back to 255 and does without translucent primitives for now.
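In the meantime, undoing the damage by hand is easy enough. Here's a sketch of the brute-force fix, assuming the little-endian B_RGBA32 layout, where alpha is the fourth byte of each pixel:

#include <Bitmap.h>

// Walk a B_RGBA32 bitmap and force every pixel fully opaque,
// to work around importers that hand back alpha == 0.
void ForceOpaque(BBitmap *bitmap)
{
	if (bitmap->ColorSpace() != B_RGBA32)
		return;

	uint8 *bits = (uint8 *)bitmap->Bits();
	int32 rowBytes = bitmap->BytesPerRow();
	int32 width = bitmap->Bounds().IntegerWidth() + 1;
	int32 height = bitmap->Bounds().IntegerHeight() + 1;

	for (int32 y = 0; y < height; y++) {
		uint8 *p = bits + y * rowBytes;
		for (int32 x = 0; x < width; x++, p += 4)
			p[3] = 255;	// the alpha byte
	}
}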

I welcome any feedback from developers about how useful all this QuickPaint nonsense is, and any questions about how the new APIs are used in general or specifically in QuickPaint. I'd also like to point out that I haven't implemented any of the things I suggested last time as "exercises for the reader," so you can simply merge all the code you've written for those exercises into the new source base and be ready to rip! Heh-heh-heh. Do you feel guilty? Good!

In closing, I'd like to take this opportunity to welcome Ficus Kirkpatrick to the growing, tightly integrated QuickPaint Application Suite Family. We here at QuickPaint, Inc., are proud to add Whack to our offerings as the flagship product of our lucrative Really Cool Useless Crap division. As a result of this merger, Ficus will become a wholly owned subsidiary of me. He'll soon be available for bachelor parties and bar mitzvahs; I'll keep you posted.


Be Engineering Insights: Kernel Engineer Breaches Software's Iron Curtain and Lives to Tell About It

By Ficus Kirkpatrick

I don't often poke my head out above the Iron Curtain that divides user and kernel space, but recently, I decided to wander over, see the sights, and visit the people.

Benoît had written a series of programs that would display something cool on the screen. They were all derived from the same skeleton: constantly iterate over the frame buffer, and calculate the value of each pixel based on its location. The only variation between these programs was the expression in the inner loop.

I experimented with making my own programs like this for a while, but got tired of constantly recompiling. "It would be excellent," I mused, "if I could just type the expression in a window." I came up with a solution that should simultaneously galvanize and repulse true believers of the Holy Grail of software engineering: code reuse. Employ what is already the best expression parser and optimizer in the system: the C compiler!

ftp://ftp.be.com/pub/samples/game_kit/Whack.zip

Whack does lots of interesting things, but the highlight is that it generates its own drawing code and loads it on the fly. Whack also makes use of BDirectWindow, BRoster's running app list update mechanism, and the B_SIMPLE_DATA Drag-and-Drop protocol. It does lots of multithreading and everything else a good Be application should do.
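The drag-and-drop item on that list is the easiest to show in isolation. This is just an illustrative sketch of the receiving half of the protocol, not Whack's actual code: Tracker (or any other well-behaved app) drops a B_SIMPLE_DATA message whose "refs" entries name the dragged files.

#include <stdio.h>
#include <Entry.h>
#include <Message.h>

// Called from the window's MessageReceived() when a dropped
// message arrives with message->what == B_SIMPLE_DATA.
void HandleSimpleDataDrop(BMessage *message)
{
	entry_ref ref;
	for (int32 i = 0; message->FindRef("refs", i, &ref) == B_OK; i++) {
		// A real app would now decide whether the file is an
		// add-on, an image, or something to politely ignore.
		printf("dropped: %s\n", ref.name);
	}
}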

The user interface is simple. There's a menu bar at the top of the window, an area in the middle for viewing the plotting of your expression, and an area at the bottom to enter it. You can save add-ons, drop them into the window, and add the ones you like to a list of favorites.

In the directory Whack runs from, there's a file called template that contains the code for a BDirectWindow drawing loop, but with __EXPRESSION__ in place of the real one. All Whack has to do is insert a preprocessor directive indicating what the expression should actually be and compile it.
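In outline, the whole generate-compile-load cycle fits in one function. Consider this a sketch only -- the file names, compiler flags, exported symbol, and its signature are all stand-ins, not what Whack really uses:

#include <stdio.h>
#include <stdlib.h>
#include <SupportDefs.h>
#include <image.h>	// load_add_on(), get_image_symbol()

typedef void (*draw_func)(uint32 *frameBuffer, int32 width,
	int32 height, int32 frame);

draw_func CompileExpression(const char *expression)
{
	// 1. Emit a tiny source file that defines the expression and
	//    then pulls in the unchanged drawing-loop template.
	FILE *src = fopen("/tmp/whack_gen.cpp", "w");
	if (src == NULL)
		return NULL;
	fprintf(src, "#define __EXPRESSION__ (%s)\n", expression);
	fprintf(src, "#include \"template\"\n");
	fclose(src);

	// 2. Reuse the best expression parser and optimizer around.
	if (system("cc -O1 -nostart -o /tmp/whack_gen.so /tmp/whack_gen.cpp") != 0)
		return NULL;

	// 3. Load the result as an add-on and dig out the entry point.
	image_id image = load_add_on("/tmp/whack_gen.so");
	if (image < 0)
		return NULL;

	draw_func func;
	if (get_image_symbol(image, "draw_frame", B_SYMBOL_TYPE_TEXT,
			(void **)&func) != B_OK)
		return NULL;
	return func;
}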

What makes it look cool is that the variables you use in your expression change depending on the position of the pixel being evaluated, the time, the frame count, and so on (there's a sketch of the generated loop right after this list). The variables available are:

x   The screen-relative X coordinate of the current pixel.
y   The screen-relative Y coordinate of the current pixel.
ix  The window-relative X coordinate of the current pixel.
iy  The window-relative Y coordinate of the current pixel.
f   A number incremented once per frame.
t   Time (actually, a number incremented once per pixel).
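To make that concrete, here's roughly what the expanded template's inner loop ends up looking like once __EXPRESSION__ has been spliced in. The scaffolding below is my reconstruction, not the actual template file:

#include <SupportDefs.h>

void draw_frame(uint32 *frameBuffer, int32 rowPixels,
	int32 left, int32 top, int32 width, int32 height, int32 f)
{
	static int32 t = 0;
	for (int32 iy = 0; iy < height; iy++) {
		int32 y = top + iy;	// screen-relative row
		uint32 *row = frameBuffer + y * rowPixels + left;
		for (int32 ix = 0; ix < width; ix++, t++) {
			int32 x = left + ix;	// screen-relative column
			row[ix] = (x * y) ^ (f + t);	// __EXPRESSION__ goes here
		}
	}
}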

One really interesting effect comes from using the screen-relative and window-relative position variables together and then moving the window around. Give it a try.

There are a few things missing from Whack. I didn't have time to implement support for anything other than 32 bits per pixel, but the space is there for you to fill out, if you're so inclined. You just need to convert the value generated by the expression from 32-bit RGBA to your desired bit depth.
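The conversion itself is just bit shuffling. For example, a 16-bits-per-pixel target (B_RGB16's 5-6-5 layout) could squeeze each 32-bit value down like this -- my sketch, with the alpha byte simply discarded at that depth:

#include <SupportDefs.h>

static inline uint16
rgba32_to_rgb16(uint32 c)
{
	uint32 r = (c >> 16) & 0xff;
	uint32 g = (c >> 8) & 0xff;
	uint32 b = c & 0xff;
	return (uint16)(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
}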

So, what's the alpha channel used for? The extremely simple drawing loop doesn't do any blending. Somehow it didn't seem right to be taking that extra byte of memory and not making any use of it. You may have heard of QuickPaint, the industry standard for paint programs that is the subject of George's Newsletter articles. I decided to apply to the QuickPaint Developer Program and see how I could integrate that extra unused byte from Whack into QuickPaint's new alpha-blending feature.

Try running both programs at once, and notice that you can send the current frame of Whack to QuickPaint as a new layer. "QuickPaint and Whack have achieved a level of synergy between two applications unmatched on any platform," said George Hoffman, CEO of QuickPaint, Inc.

Unfortunately, it requires a little work on the part of the expression writer to make the Whack frame show up correctly in QuickPaint. Take, for example, a boring expression like x+y+f, and try sending it to QuickPaint. The problem is that an alpha value of zero means full transparency, and small values like x+y+f never touch the high (alpha) byte of the pixel. Now try (x+y+f)|0xff000000. If you're really clever, you can probably figure out how to achieve partial opacity.

I feel somehow wiser after this venture into the world of writing user programs, but I think I'll head back over the fence again.

See you next time!


Developers' Workshop: The Cow Piano

By Mikol Ryon

A journalist friend of mine (staked out in Washington -- you want stories?) wrote to me recently to ask if I had heard of a time-variant CD player. He wants to play his CDs at a slower rate, but without changing the pitch. Can such things be? With a computer, sure. But as a black box, I doubt it. If anyone knows better, send me a note; the fourth estate is waiting. Moreover, even the computer solution as it's implemented by commercial sound editors isn't really good enough, because while a computer can listen, it can't hear. My response, which doubles as a gauntlet tossed to the charging Turks, follows.

Tom,

Although it's certainly possible to shift time without changing pitch (or, conversely, to shift pitch without changing time -- the general solution is the same), there is a problem of quality. The easy solution (typically involving an FFT, a shift, and an inverse FFT) takes the original sound and shifts the entire thing without regard for the sound itself. The more correct solution, but one that is very hard to apply generally, tries to "understand" the sound and then modulate the amount of the shift accordingly. An example is called for: Speak this sentence normally. Now speak it slowly. The difference is like this:

Normal: Speak this sentence...
Slow: Speeeeeeeak thiiiiiiiis seeeeenteeeeeence....

The difference between a fast talker and a slow talker is heard in the vowels, not in the consonants. Think of the difference between a New Yorker and a Georgian (or, better, a Hollywood Georgian -- or a Hollywood British Georgian such as Vivien Leigh or Leslie Howard). The Georgian doesn't stretch the entire sentence; he "sings" the vowels. The easy (and low-quality) time shifter shifts the entire sound equally, thus:

Ththththiiiinnnnnkkkk ooooffff ththththiiiissss.....

Even if you were to hold the pitch steady, your brain would hear the sentence as having been mangled: It would become somewhat unintelligible. This is because almost all spoken information is encoded in the consonants.
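For the terminally curious, the "easy" shifter really is easy. Here's a bare-bones overlap-add time stretch in C++ -- an even simpler cousin of the FFT approach, and entirely my own sketch -- that exhibits exactly the stretch-everything-equally fault described above:

#include <cmath>
#include <cstddef>
#include <vector>

// Stretch a mono signal by `factor` (2.0 = twice as long) without
// changing pitch, by overlap-adding Hann-windowed grains. Consonants
// get smeared just as badly as vowels, and with no phase alignment
// periodic sounds warble -- which is why the smarter methods exist.
std::vector<float> StretchOLA(const std::vector<float> &in, float factor)
{
	const std::size_t win = 1024;		// grain size
	const std::size_t hopOut = win / 2;	// fixed output hop
	const std::size_t hopIn =
		(std::size_t)(hopOut / factor);	// input hop sets the stretch

	std::vector<float> out((std::size_t)(in.size() * factor) + win, 0.0f);
	const float kPi = 3.14159265f;

	for (std::size_t pin = 0, pout = 0;
			pin + win <= in.size() && pout + win <= out.size();
			pin += hopIn, pout += hopOut) {
		for (std::size_t i = 0; i < win; i++) {
			// Hann window, so overlapping grains cross-fade smoothly.
			float w = 0.5f * (1.0f - std::cos(2.0f * kPi * i / (win - 1)));
			out[pout + i] += w * in[pin + i];
		}
	}
	return out;
}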

Try to say this (whispering is your best bet, although the voiced "n"s are a bit of a problem): "thnkfthssntnc." You should still be able to understand what's being said. Now try voicing just the vowels (and we'll throw the "n"s in here as well to be fair): "inoienen". No information at all -- as speech, it's entirely meaningless.

The point, here, is that when you play with the consonants in speech, you're messing around with the part of the sound that carries almost all of the (factual) information. (Vowels *do* carry emotion, but that's a different sort of information.) Certain phonemes, such as "s", the leading "t", and the final "ck" can survive a time shift reasonably well. But others are transformed, and some, such as almost all instances of "d" and some of "p", seem to disappear.

The problem in music isn't as bad, since music "information" is primarily "vowels". But you can imagine the havoc that shifting time would visit on, say, a drum solo. When you slow down drums, you want to increase the space between attacks; you don't want to actually slow down the drum sound itself. Similarly for piano: If you slow down a piano's natural damping trajectory (the rate at which the note dies away), you lose the "pianoness" of the sound. Going the other way is worse. Speed up a recording of a piano (without changing the pitch) and you'll quickly turn your baby grand into a toy, and then into some bastard glockenspiel that makes my teeth ache.

Shifting pitch while holding time constant has its own set of problems, but they're difficult to characterize. When you shift pitch, you alter timbre. The sound doesn't become unintelligible, but it does change in ways that you might not anticipate. For example, if you shift the pitch of the spoken word, you'll have a hard time identifying the speaker (modulo distinctive vocal tics), not because of the pitch change, but because of the change in timbre. (Interestingly, shifting the pitch of the spoken word doesn't entirely obscure the sex of the speaker, because sex is encoded in the way the individual resonances move.) In theory, shifting pitch while maintaining time doesn't affect the consonants, so you'll still be able to understand what's being said -- but even the theory breaks down if you shift too far. What's "too far"? It depends on a lot of things, but shifting inside a (musical) third is okay, a fifth is marginal, an octave is far too much.

Now for the smart (time-shifting) solution: Here we try to figure out which part of the signal is noise (drums, instrument noise, consonants) and which part is "music" (pitches, vowels). In a recording of a single speaker, or a single instrument playing a single line, the detection is pretty easy. But if you have more than one voice, it's nearly impossible. If you can separate noise from music, you can shift the part that you want: Slow down the vowels but not the consonants. You can even mix and match: One of the most ear-boggling audio demos I've ever heard is of a piano that sounded like a cow...

While working at Stanford, a fellow named Xavier Serra developed software that could separate noise from music given a single voice source. As a demo, he took a recording of a piano and removed all musical information, leaving just the sound of the piano action (which is a significant part of a piano's "message"). He replaced the piano "music" with a recording of a herd of cows.

The effect was astonishing -- it really sounded like somebody was playing a piano that had cows instead of strings. And not in the "Jingle Dog" style where a song is "sung" by pitched recordings of animal noises. The piano/cow was like nothing I'd ever heard -- and then he played the piano/rain, and the rain/piano, and the rain/man, and so on. Interestingly, when he played the piano "music" by itself -- in other words, the piano recording with all the noise removed -- the effect was sickening. The pitches were all there, but it was lumpishly cold. The recording of the piano action was vastly more musical.

Sincerely, etc.

As far as I know, there's been no breakthrough in polyphonic detection and analysis in the last ten years. But you'll correct me if I'm wrong. Better, write an app that merges animals and instruments. A cow piano is impressive...but just think of the duckulele.
