I came to BeOS programming from the Byzantine world of Windows 95. I've spent numerous hours fitting together different pieces of old and new technologies: DOS drivers, 16- and 32-bit Windows DLLs, VxDs, WDM drivers, putting square pegs in round holes. Sometimes it was fun. But system and device driver programming on BeOS is so easy that sometimes I get bored, so I decided to use my BeOS experience and hack a non-conventional device driver. What does this driver do? It doubles graphics speed on your Pentium Pro or Pentium II!
How is this possible? Pentium Pro and Pentium II have internal registers to set up caching types for different regions of memory. Usually, frame buffer memory on a graphics board uses a default caching type which is uncacheable. So, for example, when the CPU copies a bitmap from RAM to the frame buffer, each 4 bytes of data require a separate transaction on the PCI bus. This is costly: actual data transfer gets only 20-40% of the bus time, and the PCI does unproductive things with the remaining time:
arbitration
address phase
4 bytes transfer
idle
arbitration (many PCI clocks)
address phase (1 PCI clock)
4 bytes transfer (1 PCI clock)
idle
Etc...
The Pentium Pro and Pentium II processors can accumulate written bytes in internal write buffers if the memory type for the written addresses is set up to Write Combining (WC) type. In this case the PCI transfers 32 bytes per transaction:
arbitration
address phase
4 bytes transfer (1 PCI clock)
4 bytes transfer (1 PCI clock)
4 bytes transfer (1 PCI clock)
4 bytes transfer (1 PCI clock)
4 bytes transfer (1 PCI clock)
4 bytes transfer (1 PCI clock)
4 bytes transfer (1 PCI clock)
4 bytes transfer (1 PCI clock)
idle
Much faster!
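To put rough numbers on the two diagrams, here is a minimal sketch that computes the share of PCI clocks that actually carry data. The overhead figure (arbitration plus address phase plus idle) is an assumption you pass in; the only facts taken from the text are that each data phase costs one PCI clock and that a combined write carries eight of them.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Fraction of PCI clocks that carry data in one bus transaction.
 * overhead_clocks (arbitration + address phase + idle) is an ASSUMED
 * figure: the article only says that arbitration takes "many" clocks.
 */
static double data_fraction(int data_phases, int overhead_clocks)
{
    assert(data_phases > 0 && overhead_clocks >= 0);
    return (double)data_phases / (double)(data_phases + overhead_clocks);
}

/* Print the uncombined (1 data phase) and combined (8 data phases) cases. */
static void print_bus_efficiency(int overhead_clocks)
{
    printf("overhead %d clocks: 4-byte writes %.0f%%, 32-byte bursts %.0f%%\n",
           overhead_clocks,
           100.0 * data_fraction(1, overhead_clocks),
           100.0 * data_fraction(8, overhead_clocks));
}
```

With an assumed overhead of 3 clocks, single 4-byte writes use 25% of the bus time for data, which sits inside the 20-40% range quoted above, while 32-byte bursts reach about 73%.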
In general such functionality has to be a part of the OS kernel rather than driver-based. And in BeOS R4 it *will* be integrated into the kernel. Meanwhile, implementing it without BeOS support is a difficult hack (the politically correct name for an unconventional technique).
Here are the main problems with hacking this driver:
The registers have to be synchronously set up on all CPUs in the system. All interrupts must be disabled. To do that you have to designate a thread per CPU, which the BeOS does not support (i.e., CPU affinity for threads).
Move operations to/from the x86 control registers that control caching, paging, etc., are not supported by the Metrowerks inline assembler. In addition, code manually assembled in the data segment somehow has to get the correct PE relocations (for the code segment, not the data segment!).
Kernel threads do not have access to a user memory space.
General paranoia when you touch control and CPU model-specific registers. Do not change reserved bits!
So—how to resolve these challenges?
Spawn a real-time priority kernel thread per CPU, then block and simultaneously release them. Use spinlocks to synchronize threads with interrupts disabled.
Use opcodes from x86 databooks. First load the address of the assembled code into a pointer, then use this pointer to call the code. This way the linker will generate correct PE relocations.
Use data that is global to the driver.
Be careful.
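The first answer, the "block, release together, then spin" rendezvous, can be tried out in user space before going anywhere near a kernel. The sketch below is an analogue only, under loud assumptions: it uses POSIX threads and C11 atomics instead of BeOS kernel threads and semaphores, N_WORKERS stands in for the CPU count, and nothing here disables interrupts.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define N_WORKERS 4          /* stands in for the CPU count; assumed value */

static atomic_int arrived;   /* workers that reached the first barrier */
static atomic_int done;      /* workers that finished the critical step */

static void *worker(void *arg)
{
    (void)arg;

    /* "I am here": announce arrival, then busy-wait for the others. */
    atomic_fetch_add(&arrived, 1);
    while (atomic_load(&arrived) != N_WORKERS)
        ;

    /* ...the step that must run on every worker at once goes here... */

    /* Second barrier: nobody leaves until everybody has finished. */
    atomic_fetch_add(&done, 1);
    while (atomic_load(&done) != N_WORKERS)
        ;
    return NULL;
}

/* Runs one rendezvous; returns 0 when every worker passed both barriers. */
static int run_rendezvous(void)
{
    pthread_t threads[N_WORKERS];
    int i, rc;

    /* Reset the counters so that the rendezvous can be repeated. */
    atomic_store(&arrived, 0);
    atomic_store(&done, 0);

    for (i = 0; i < N_WORKERS; i++) {
        rc = pthread_create(&threads[i], NULL, worker, NULL);
        assert(rc == 0);     /* a sketch: no recovery from a failed spawn */
    }
    for (i = 0; i < N_WORKERS; i++)
        pthread_join(threads[i], NULL);

    return (atomic_load(&done) == N_WORKERS) ? 0 : -1;
}
```

Note the counter reset at the top of run_rendezvous(): counters that are only ever incremented make the second rendezvous spin forever.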
Here is the heavily commented source code for this driver and the application to talk to it. All the more or less obscure elements are explained. The code lacks full error checking and recovery and has one gross violation of safe driver conduct (see if you can spot it). Nevertheless, it's pretty safe for the current Intel PCs and the BeOS Release 3.
For additional reference, see the "Intel Architecture Software Developer's Manual, Volume 3: System Programming Guide," Chapter 9.11 for details on Memory Type Range Registers: http://www.intel.com/design/pentiumii/manuals/243192.htm
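Before touching any range register, the driver below reads the MTRR capability register (MSR 0xFE) and pulls two fields out of it: the number of variable-range MTRRs in bits 7:0 and the write-combining support flag in bit 10. Here is a minimal, host-runnable sketch of that decoding; the mtrr_cap_info type and the decode_mtrr_cap() name are invented for illustration and are not part of the driver.

```c
#include <stdint.h>

/* Fields of the MTRR capability register (MSR 0xFE) used by the driver. */
typedef struct {
    int variable_count;     /* bits 7:0, number of variable-range MTRRs */
    int write_combining;    /* bit 10, nonzero if the WC type is supported */
} mtrr_cap_info;            /* hypothetical helper type */

/* Decode a raw capability value, e.g. the result of read_msr(0xFE). */
static mtrr_cap_info decode_mtrr_cap(uint64_t mtrr_cap)
{
    mtrr_cap_info info;

    info.variable_count  = (int)(mtrr_cap & 0xFF);
    info.write_combining = (mtrr_cap & (1u << 10)) != 0;
    return info;
}
```

For example, a raw value of 0x508 decodes to eight variable-range registers with write combining supported.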
/********** mtrr_drv.h ****************/

#ifndef MTRR_DRV_H
#define MTRR_DRV_H

#include <Drivers.h>

#ifdef __cplusplus
extern "C" {
#endif

typedef enum {
    UC_MEM = 0,    /* uncacheable */
    WC_MEM = 1,    /* write combining */
    WT_MEM = 4,    /* write-through */
    WP_MEM = 5,    /* write-protected */
    WB_MEM = 6     /* writeback */
} MEM_TYPE;

typedef enum {
    RANGE_CLEAR,
    RANGE_SET
} RANGE_ACTION;

typedef struct {
    RANGE_ACTION action;    /* clear/set */
    uint64       base;
    uint64       len;
    MEM_TYPE     mem_type;
} mtrr_drv_command_t;

/* len has to be a power of 2 and base has to be aligned on at
   least len boundary. The minimum len is 4 KByte. */

enum {
    MTTR_DRV_COMMAND = B_DEVICE_OP_CODES_END + 1
};

#ifdef __cplusplus
}
#endif

#endif

/*********** mtrr_drv.c *****************/

#include <KernelExport.h>
#include <Drivers.h>
#include <Errors.h>
#include <OS.h>
#include <string.h>    /* memcpy() */
#include <perfmon_kernel.h>
#include "mtrr_drv.h"

int n_cpus;
sem_id mtrr_drv_lock_sem;
sem_id main_sem;    /* acquired by the main thread,
                       released by the work threads */
sem_id work_sem;    /* acquired by the work threads,
                       released by the main thread */
volatile long work_begin_spinlock = 0, work_end_spinlock = 0;
                    /* for work threads synchronization
                       with disabled interrupts */
volatile long n_errors;    /* to count work threads errors */
mtrr_drv_command_t kernel_mtrr_command;

int32 work_thread(void *mtrr_command);
static status_t set_memory_type(mtrr_drv_command_t *mtrr_command);
static status_t set_memory_range_type(
    const mtrr_drv_command_t *mtrr_command);
static status_t set_mtrr(int64 base, uint64 len, MEM_TYPE mem_type,
    uint64 mtrr0, int num_mtrrs);
static status_t clear_mtrr(int64 base, uint64 mtrr0, int num_mtrrs);

static status_t lock_mtrr_drv(void)
{
    return acquire_sem(mtrr_drv_lock_sem);
}

static status_t unlock_mtrr_drv(void)
{
    return release_sem(mtrr_drv_lock_sem);
}

/* Only Pentium Pro and Pentium II have memory range type
   registers. Tell the kernel to load the driver only on a system
   with the right CPU. The driver is very conservative and does
   not do anything remotely dangerous, like read CPU
   model-specific registers in init_hardware(). A user has to
   explicitly load/open/read/ioctl this driver to touch the
   hardware. */
status_t init_hardware(void)
{
    system_info sys_info;

    get_system_info(&sys_info);
    switch (sys_info.cpu_type) {
    case B_CPU_INTEL_PENTIUM_PRO:
    case B_CPU_INTEL_PENTIUM_II_MODEL_3:
    case B_CPU_INTEL_PENTIUM_II_MODEL_5:
        return B_OK;
    default:
        return B_ERROR;
    }
}

status_t init_driver(void)
{
    if ((mtrr_drv_lock_sem = create_sem(1, "mtrr_drv_lock")) < B_OK)
        return B_ERROR;
    return B_OK;
}

void uninit_driver(void)
{
    delete_sem(mtrr_drv_lock_sem);
}

static status_t mtrr_drv_open(const char *name, uint32 flags,
    void **cookie)
{
    return B_OK;
}

static status_t mtrr_drv_read(void *cookie, off_t position, void *buf,
    size_t *num_bytes)
{
    *num_bytes = 0;    /* tell caller nothing was read */
    return B_IO_ERROR;
}

static status_t mtrr_drv_write(void *cookie, off_t position,
    const void *buffer, size_t *num_bytes)
{
    *num_bytes = 0;    /* tell caller nothing was written */
    return B_IO_ERROR;
}

static status_t mtrr_drv_control(void *cookie, uint32 ioctl, void *arg,
    size_t len)
{
    status_t res;

    lock_mtrr_drv();
    switch (ioctl) {
    case MTTR_DRV_COMMAND:
        res = set_memory_type((mtrr_drv_command_t *)arg);
        break;
    default:
        res = B_BAD_VALUE;
    }
    unlock_mtrr_drv();
    return res;
}

static status_t mtrr_drv_close(void *cookie)
{
    return B_OK;
}

static status_t mtrr_drv_free(void *cookie)
{
    return B_OK;
}

static const char *mtrr_drv_name[] = {
    "cpu/mtrr",
    NULL
};

device_hooks mtrr_drv_hooks = {
    mtrr_drv_open,
    mtrr_drv_close,
    mtrr_drv_free,
    mtrr_drv_control,
    mtrr_drv_read,
    mtrr_drv_write
};

const char **publish_devices()
{
    return mtrr_drv_name;
}

device_hooks *find_device(const char *name)
{
    return &mtrr_drv_hooks;
}

/* Access to x86 control registers */

/* Metrowerks inline assembler does not support move to/from x86
   control registers so assemble them manually. */
uint8 mov_cr0_eax__ret[] = {
    0x0F, 0x22, 0xC0,    /* mov cr0, eax */
    0xC3                 /* ret */
};

uint8 mov_cr3_eax__ret[] = {
    0x0F, 0x22, 0xD8,    /* mov cr3, eax */
    0xC3                 /* ret */
};

uint8 mov_eax_cr0__ret[] = {
    0x0F, 0x20, 0xC0,    /* mov eax, cr0 */
    0xC3                 /* ret */
};

uint8 mov_eax_cr3__ret[] = {
    0x0F, 0x20, 0xD8,    /* mov eax, cr3 */
    0xC3                 /* ret */
};

static void set_cr0(uint32 val)
{
    /* Generate the correct relocation record for code in the
       data segment. Simple "call mov_cr0_eax__ret" will be
       patched as if mov_cr0_eax__ret were in the code segment. */
    void *f_ptr = mov_cr0_eax__ret;

    asm {
        mov  eax, val
        call f_ptr
    }
}

static uint32 get_cr0(void)
{
    void *f_ptr = mov_eax_cr0__ret;
    uint32 res = 0;

    asm {
        call f_ptr
        mov  res, eax    /* play it safe */
    }
    return res;
}

static void set_cr3(uint32 val)
{
    void *f_ptr = mov_cr3_eax__ret;    /* the cr3 stub, not the cr0 one */

    asm {
        mov  eax, val
        call f_ptr
    }
}

static uint32 get_cr3(void)
{
    void *f_ptr = mov_eax_cr3__ret;    /* the cr3 stub, not the cr0 one */
    uint32 res = 0;

    asm {
        call f_ptr
        mov  res, eax
    }
    return res;
}

static uint32 get_cpu_features(void)
{
    uint32 res = 0;

    asm {
        pusha
        mov   eax, 1
        cpuid
        mov   res, edx
        popa
    }
    return res;
}

/* roundup the arg to the nearest 2 to the nth power */
static uint64 roundup(uint64 val)
{
    int i;
    uint64 roundup_val;
    uint64 tmp_val;

    for (i = 0, tmp_val = val - 1; tmp_val != 0; tmp_val >>= 1, i++)
        ;
    roundup_val = (uint64)1 << i;    /* 64-bit shift: len may exceed 2 GB */
    return roundup_val;
}

static bool is_aligned(uint64 addr, uint64 alignment)
{
    do {
        alignment >>= 1;
        if (addr & alignment)
            return FALSE;
    } while (alignment != 0);
    return TRUE;
}

static status_t set_memory_type(mtrr_drv_command_t *mtrr_command)
{
    system_info sys_info;
    physical_entry phys_mem_start;
    int i;

    /* dprintf("%d %d %Lx %Lx\n", mtrr_command->action,
        mtrr_command->mem_type, mtrr_command->base,
        mtrr_command->len); */

    /* do arguments checking and conditioning */

    /* allow only supported memory types */
    switch (mtrr_command->mem_type) {
    case UC_MEM:    /* uncacheable */
    case WC_MEM:    /* write combining */
    case WT_MEM:    /* write-through */
    case WP_MEM:    /* write-protected */
    case WB_MEM:    /* writeback */
        break;
    default:
        return B_BAD_VALUE;
    }

    /* len has to be a power of 2 and base has to be aligned on
       at least len boundary. The minimum len is 4 KByte. */
    if (mtrr_command->action == RANGE_SET) {
        mtrr_command->len = roundup(mtrr_command->len);
        if (mtrr_command->len < 4096)
            return B_BAD_VALUE;
    }

    /* The driver gets virtual base address but mtrrs need
       physical address. Convert virtual to physical. In error
       case assume that physical address was passed. */
    get_memory_map((const void *)mtrr_command->base, 4096,
        &phys_mem_start, 1);
    if (phys_mem_start.size >= 4096)
        /* this virtual address is mapped */
        mtrr_command->base = (uint64)phys_mem_start.address;

    if (mtrr_command->action == RANGE_SET) {
        if (!is_aligned(mtrr_command->base, mtrr_command->len))
            return B_BAD_VALUE;
    }

    get_system_info(&sys_info);
    n_cpus = sys_info.cpu_count;

    /* create a semaphore for the main thread to wait for all
       work threads to start/finish */
    main_sem = create_sem(0, "mtrr_drv_main");

    /* create a semaphore for work threads to wait to start
       simultaneously */
    work_sem = create_sem(0, "mtrr_drv_work");

    /* copy data from the user space to the kernel space: kernel
       threads do not have access to the user space. */
    memcpy(&kernel_mtrr_command, mtrr_command,
        sizeof(kernel_mtrr_command));

    /* spawn a kernel thread for each cpu in the system */
    for (i = 0; i < n_cpus; i++) {
        char thread_name[] = "mtrr_x";

        thread_name[5] = i + '0';
        resume_thread(spawn_kernel_thread(work_thread, thread_name,
            B_REAL_TIME_PRIORITY, (void *)&kernel_mtrr_command));
    }

    /* reset the work threads error counter and the spin counters.
       No work thread touches them before work_sem is released;
       without the reset a second command would spin forever. */
    n_errors = 0;
    work_begin_spinlock = 0;
    work_end_spinlock = 0;

    /* wait for all work threads to start */
    acquire_sem_etc(main_sem, n_cpus, 0, 0);

    /* release all work threads together. Big Bang! */
    release_sem_etc(work_sem, n_cpus, 0);

    /* All work threads are running on all CPUs in the system */

    /* wait until all work threads have done the job */
    acquire_sem_etc(main_sem, n_cpus, 0, 0);

    /* give some time to work threads to die */
    snooze(10 * 1000);

    dprintf("n_errors = %ld\n", n_errors);

    /* Done! */

    /* do not leak semaphores */
    delete_sem(work_sem);
    delete_sem(main_sem);

    return (n_errors == 0) ? B_OK : B_ERROR;
}

int32 work_thread(void *arg)
{
    cpu_status ps;
    int32 res;
    const mtrr_drv_command_t *mtrr_command =
        (const mtrr_drv_command_t *)arg;

    /* tell the main thread that this thread started */
    release_sem(main_sem);

    /* wait for the beginning of the Big Bang */
    acquire_sem(work_sem);

    /* Each work thread is running on its own CPU. Disable
       scheduling and thus lock work threads on CPUs */
    ps = disable_interrupts();

    /* Yes! This thread owns the current CPU until
       restore_interrupts() or any scheduling calls
       (acquire/release_sem(), etc.). Start multiple-processor
       MTRR change protocol. See Intel Architecture Software
       Developer's Manual, Volume 3: System Programming Guide,
       chapter 9.11.8 for details.
       http://www.intel.com/design/pentiumii/manuals/243192.htm */

    /* Busy wait for other CPUs/work threads to reach this point.
       A thread can not call acquire_sem() or any potentially
       blocking calls if interrupts are disabled. */

    /* tell other work threads: "I am here." */
    atomic_add(&work_begin_spinlock, 1);

    /* Busy wait. work_begin_spinlock is volatile so the compiler
       will not optimize out this loop */
    while (work_begin_spinlock != n_cpus)
        ;

    /* Do the real work */
    res = set_memory_range_type(mtrr_command);

    /* tell other work threads: "I am here." */
    atomic_add(&work_end_spinlock, 1);

    /* Busy wait. */
    while (work_end_spinlock != n_cpus)
        ;

    /* return to the normal mode */
    restore_interrupts(ps);

    /* update error counter */
    if (res != B_OK)
        atomic_add(&n_errors, 1);

    /* Tell the main thread: "I am done!" */
    release_sem(main_sem);

    return res;
}

/* ----------
   set_mtrr - scans the variable MTRRs to find an already
   programmed MTRR. If such an MTRR is not found, find the first
   unused one and program it to use mem_type for memory with
   starting address base and length len.
   len has to be a power of 2 and base has to be aligned on at
   least len boundary. The minimum len is 4 KByte.
   Returns B_OK if the MTRR is set or already has been set.
   Returns B_ERROR if all MTRRs are in use
----- */
static status_t set_mtrr(int64 base, uint64 len, MEM_TYPE mem_type,
    uint64 mtrr0, int num_mtrrs)
{
    uint64 mask;
    uint64 mtrr;

    mask = len - 1;
    mask = ~mask;
    mask &= 0x0000000FFFFFF000ULL;
    mask |= (1 << 11);    /* set enable bit */

    base = (base & 0xFFFFFF000ULL) | mem_type;

    for (mtrr = mtrr0; mtrr < mtrr0 + num_mtrrs * 2; mtrr += 2) {
        if (read_msr(mtrr) == base) {
            if (read_msr(mtrr + 1) == mask) {
                /* BIOS already has set this memory range
                   with our type */
                return B_OK;
            } else {
                /* BIOS setup is broken, reprogram this
                   register */
                write_msr(mtrr, base);
                write_msr(mtrr + 1, mask);
                return B_OK;
            }
        }
    }

    /* find an unused MTRR */
    for (mtrr = mtrr0; mtrr < mtrr0 + num_mtrrs * 2; mtrr += 2) {
        if (!(read_msr(mtrr + 1) & (1 << 11)))
            /* the register is disabled, we can use it! */
            break;
    }

    if (mtrr == (mtrr0 + num_mtrrs * 2))
        /* all regs are in use, we can't do anything */
        return B_ERROR;

    write_msr(mtrr, base);
    write_msr(mtrr + 1, mask);
    return B_OK;
}

static status_t clear_mtrr(int64 base, uint64 mtrr0, int num_mtrrs)
{
    uint64 mtrr;

    base = base & 0xFFFFFF000ULL;

    for (mtrr = mtrr0; mtrr < mtrr0 + num_mtrrs * 2; mtrr += 2) {
        if ((read_msr(mtrr) & 0xFFFFFF000ULL) == base) {
            write_msr(mtrr + 1, 0);    /* clear enable bit */
            write_msr(mtrr, 0);        /* clear base reg, just
                                          for completeness */
            return B_OK;
        }
    }

    /* Generally it is not an error: such range does not exist
       so it is "disabled" */
    return B_ERROR;
}

static status_t set_memory_range_type(
    const mtrr_drv_command_t *mtrr_command)
{
    uint32 cr0, cr0_old;
    uint64 default_mtrr_type, mtrr_cap;
    bool wc_supported;
    int num_mtrrs;
    status_t res = B_ERROR;

    if (!(get_cpu_features() & (1 << 12)))
        /* MTRRs are not supported */
        return B_ERROR;

    /* Does the CPU support write combining memory ? */
    mtrr_cap = read_msr(0xFE);
    wc_supported = ((mtrr_cap & (1 << 10)) != 0);
    if ((!wc_supported) &&
        (mtrr_command->mem_type == WC_MEM) &&
        (mtrr_command->action == RANGE_SET))
        return B_ERROR;

    /* How many mtrrs does the CPU have? */
    num_mtrrs = mtrr_cap & 0xFF;

    /* Get the default memory type and enable flags for fixed and
       variable range mtrrs */
    default_mtrr_type = read_msr(0x2FF);

    /* enter no-fill cache mode */
    cr0 = cr0_old = get_cr0();
    cr0 |= (1 << 30);
    cr0 &= ~(1 << 29);
    set_cr0(cr0);

    asm { wbinvd }           /* flush and invalidate cache */
    set_cr3(get_cr3());      /* flush TLBs */

    write_msr(0x2FF, 0);     /* disable all MTRRs */

    /* do actual mtrr change */
    switch (mtrr_command->action) {
    case RANGE_SET:
        res = set_mtrr(mtrr_command->base, mtrr_command->len,
            mtrr_command->mem_type, 0x200, num_mtrrs);
        break;
    case RANGE_CLEAR:
        clear_mtrr(mtrr_command->base, 0x200, num_mtrrs);
        res = B_OK;
        break;
    }

    write_msr(0x2FF, default_mtrr_type | (1 << 11));
                             /* enable variable-range MTRRs */

    asm { wbinvd }           /* flush and invalidate cache */
    set_cr3(get_cr3());      /* flush TLBs */
    set_cr0(cr0_old);        /* restore caching */

    return res;
}

/******** fastvid.c **********/

#include <OS.h>
#include <PCI.h>
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>    /* ioctl() */
#include <fcntl.h>     /* open()/close() */
#include <errno.h>
#include <string.h>    /* strerror(), memset() */
#include "mtrr_drv.h"

void print_usage(void)
{
    printf("fastvid enables/disables write combining for a "
        "graphic frame buffer.\n");
    printf("Usage:\n");
    printf(" fastvid set|clear\n");
    printf(" fastvid set mem_base mem_len mem_type\n");
    printf(" fastvid clear mem_base\n");
    exit(0);
}

void find_frame_buffer(mtrr_drv_command_t *info)
{
    int i;
    pci_info h;
    long n;

    for (n = 0; get_nth_pci_info(n, &h) == B_OK; n++) {
        if (h.class_base != PCI_display &&
            !(h.class_base == PCI_early &&
              h.class_sub == PCI_early_vga))
            continue;

        for (i = 0; i < 6; i++) {
            if (h.u.h0.base_register_flags[i] & PCI_address_space)
                continue;
            if (h.u.h0.base_register_sizes[i] < (1024 * 1024))
                continue;

            info->len = h.u.h0.base_register_sizes[i];
            info->base = h.u.h0.base_registers[i];
            if (info->len > 8 * 1024 * 1024)
                /* limit the size of the guessed frame buffer */
                info->len = 8 * 1024 * 1024;
            return;    /* B_OK; */
        }
    }

    printf("Can not find frame buffer\n");
    exit(2);
    /* info->len = info->base = 0; */
    /* return B_ERROR; */
}

int main(int argc, char *argv[])
{
    mtrr_drv_command_t command;
    int fd;
    status_t ret;

    /* the driver validates mem_type even for "clear", so do not
       pass uninitialized fields */
    memset(&command, 0, sizeof(command));

    if (argc < 2)
        print_usage();

    if (strcmp(argv[1], "set") == 0) {
        command.action = RANGE_SET;
        switch (argc) {
        case 2:
            find_frame_buffer(&command);
            command.mem_type = WC_MEM;
            break;
        case 5:
            command.base = strtoull(argv[2], NULL, 0);
            command.len = strtoull(argv[3], NULL, 0);
            command.mem_type = (MEM_TYPE)strtoul(argv[4], NULL, 0);
            break;
        default:
            print_usage();
        }
    } else if (strcmp(argv[1], "clear") == 0) {
        command.action = RANGE_CLEAR;
        switch (argc) {
        case 2:
            find_frame_buffer(&command);
            break;
        case 3:
            command.base = strtoull(argv[2], NULL, 0);
            break;
        default:
            print_usage();
        }
    } else {
        print_usage();
    }

    /* printf("%d %d %Lx %Lx\n", command.action, command.mem_type,
        command.base, command.len); */

    errno = 0;
    fd = open("/dev/cpu/mtrr", O_RDWR);
    if (fd == -1) {
        printf("Can't open /dev/cpu/mtrr, errno=%d (%s)\n",
            errno, strerror(errno));
        exit(1);
    }

    ret = ioctl(fd, MTTR_DRV_COMMAND, &command, sizeof(command));
    if (ret == B_OK) {
        printf("Success!\n");
    } else {
        printf("Error: ioctl returned 0x%x, errno=%d (%s)\n",
            ret, errno, strerror(errno));
        exit(1);
    }

    return 0;
}
Good parameters for the fastvid app:
fastvid set (enable write-combining for the frame buffer)
fastvid clear (disable write-combining for the frame buffer)
fastvid set 0 RAM_size 0 (disable caching for the whole RAM)
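If you pass an explicit base and length, it helps to know what a single variable-range MTRR can describe. The sketch below restates the driver's rules (power-of-2 length of at least 4 KByte, base aligned on the length) and the base/mask register encoding as small host-runnable functions. The function and macro names are invented here. Unlike this check, the driver rounds the length up to a power of 2 rather than rejecting it.

```c
#include <stdint.h>

#define MTRR_ADDR_BITS 0x0000000FFFFFF000ULL  /* 36-bit address, 4 KB units */
#define MTRR_ENABLE    (1ULL << 11)           /* enable bit in the mask reg */

/* Returns 1 if (base, len) is a range one variable MTRR can describe. */
static int mtrr_range_ok(uint64_t base, uint64_t len)
{
    if (len < 4096 || (len & (len - 1)) != 0)  /* power of 2, >= 4 KB */
        return 0;
    return (base & (len - 1)) == 0;            /* base aligned on len */
}

/* Mask register value: address bits above the range, plus the enable bit. */
static uint64_t mtrr_mask_value(uint64_t len)
{
    return (~(len - 1) & MTRR_ADDR_BITS) | MTRR_ENABLE;
}

/* Base register value: page-aligned base plus the memory type code. */
static uint64_t mtrr_base_value(uint64_t base, unsigned mem_type)
{
    return (base & MTRR_ADDR_BITS) | mem_type;
}
```

For an 8 MB frame buffer at 0xE0000000 (an example address, not one taken from any particular board) with write combining (type 1), the base register gets 0xE0000001 and the mask register gets 0xFFF800800.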
The results: On my dual Pentium II 300 MHz 440FX system with a PCI Matrox Millennium 1, the Life demo gets 23 megacells/second without write-combining and 44 megacells/second with it. Life is a good app to demonstrate the effects of this driver, because it is both CPU and PCI bandwidth-limited. The results for your favorite app may vary.
I'm glad you asked. Since informal surveys indicate that *maybe* only one out of four or five people ever looks at a user guide, and most of them do it for the satisfaction of finding typos, why have user docs?
In the case of the BeOS, there are user docs in anticipation of the day when there will be non-geek users. Geeks are assumed to know all there is to know about a user interface, and not to care anyway since anyone who doesn't work from the command line probably shares a place in the food chain next to krill.
Non-geeks, on the other hand, may welcome step-by-step explanations of how to configure network preferences—never a particularly intuitive process. Or how to make a prehistoric one-button mouse behave like a three-button mouse, and why you'd want to do that. And what is this thing called Tracker, and how does it work?
If you're coming to the BeOS from Mac or Windows, some operations, features, and settings will be familiar, but others will not. The Desktop has a certain quaint inevitability about it, which is either comforting or boring, depending on your point of view. The undeniable advantage, though, is that you don't need a map and compass (or even a user guide) to figure it out. You can jump right in, start opening windows, moving them around, and doing things in them.
But what if you want to learn about some of the refinements, like how to navigate between windows from the window itself? What you need to know is in the BeOS User's Guide, in "Navigating in Tracker Windows": http://www.be.com/documentation/user_docs/01_beos_basics.html
This section explains the hierarchical organization of folders and how to move between them, opening or closing parent or child folders as you go. I also like this feature because it shows me at a glance where a folder lives, without the extra MacStep of doing a Find.
Another feature the BeOS User's Guide can tell you a thing or two about is the Workspaces app. Here's what Scott Patterson, BeOS Demo God, says about Workspaces: "I can organize my work into different workspaces, with productivity applications in one (e-mail, telnet, web browser), specialized apps in another (video/audio editing, page layout), and alternative environments in a third (the awesome SheepShaver Macintosh environment, for example). I can also preview graphics more easily than on any other platform in the world. I can edit a graphic image in millions of colors and just by dragging it across to an 8-bit workspace, I know what it's going to look like across the Internet on a machine that only supports 256 colors."
Want to know more about setting up Workspaces? RTFM:
http://www.be.com/documentation/user_docs/05_beos_customize.html
And then there are Replicants, a nifty concept in search of a raison d'etre. The BeOS User's Guide tells you what Replicants are and how to create them:
http://www.be.com/documentation/user_docs/01_beos_basics.html
Even more important, once you have a clock or NetPositive Replicant sitting on your desktop, how do you get rid of it? The same section that tells you how to create a Replicant tells you how to delete it.
User guides are often frustrating because they don't tell you what you need to know. (Software is often frustrating because it doesn't let you do what you want, or makes you go around the block three times first -- but that's another story.) We don't want to add to the global computer user frustration quotient by overlooking information you need to run the BeOS at a gallop, or by including it but in a form that's too opaque to understand.
If you read the BeOS User's Guide, and it lets you down, share your pain. We can always try to make it better the next time around: mailto:linda@be.com.
On my way home last night, a figure stepped out of the shadows and rhythmically rapped into my ear:
Question: Why is it that every time my device driver runs, my system hangs? Fuuuunnnk daaaat!
Development behind the 2 GB iron curtain is often difficult, so the kernel provides several functions to help you track down bugs in your device driver or kernel add-on. You can follow along at home in the "Exported Kernel Functions" section of the "Device Drivers" chapter in the online Be Book.
The most primitive debugging function is dprintf(), which squirts formatted text through the serial port, much like the SERIAL_PRINT() macro of Support Kit fame. Unlike SERIAL_PRINT(), dprintf() is accessible from kernel space. As always, serial communication occurs through /dev/serial1 on x86, /dev/serial4 on the BeBox, and /dev/modem on the Mac with data parameters 19200 N81.
No doubt you've been meaning to dust off your VT52 anyway.
If nothing comes out of the serial port, serial output is probably disabled. There are a few ways of turning it on, including:
Holding down the delete key on Macs or the F1 key on x86 or BeBox machines during bootup.
Calling set_dprintf_enabled(true), as detailed in the Be Book.
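A common way to keep serial tracing manageable is to route every message through one small wrapper that can be switched off and that stamps each line with its origin. Here is a minimal sketch of that idea. Everything in it is an assumption for illustration: the names trace(), TRACE, trace_sink, and trace_enabled are invented, and the default sink writes to stderr so that the code runs on any host. In a driver the sink would format into a buffer and hand the result to dprintf().

```c
#include <stdarg.h>
#include <stdio.h>

/* Where trace output goes; replaceable so a driver can supply its own. */
typedef int (*trace_sink_t)(const char *fmt, va_list args);

static int stderr_sink(const char *fmt, va_list args)
{
    return vfprintf(stderr, fmt, args);
}

static trace_sink_t trace_sink = stderr_sink;
static int trace_enabled = 1;    /* analogous to toggling serial output */

/* Returns the number of characters written, or 0 when tracing is off. */
static int trace(const char *fmt, ...)
{
    va_list args;
    int n;

    if (!trace_enabled)
        return 0;
    va_start(args, fmt);
    n = trace_sink(fmt, args);
    va_end(args);
    return n;
}

/* Prefix every message with the function name and line number. */
#define TRACE(...) \
    (trace("%s:%d: ", __func__, __LINE__), trace(__VA_ARGS__))
```

A call such as TRACE("base=%d\n", 5) then prints the calling function and line ahead of the message.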
Once you've established a serial debugging connection, you may find yourself a frequent guest of the kernel debugger. The kernel debugger is typically triggered by an exception in kernel space. You can also programmatically enter it with kernel_debugger().
The kernel debugger, for the most part, presents a read-only snapshot of the universe. This tool possesses limited poking abilities, but there is currently no way of modifying register values and breakpoints short of wading through stack frames and writing over code. Typing "help" will give you a list of the debugger's capabilities. Don't bother to RTFM; that's all the M that's available for now (this will be rectified in the future).
The debugger understands symbols found in xMAP files, which is helpful in deciphering stack traces. Every aspiring kernel driver writer should copy the kernel's xMAP from the installation CD to /system. The kernel can be instructed to load your driver's symbols with load_driver_symbols(). This function searches for a specific xMAP in the drivers, file_systems, pnp, and cam subdirectories of the kernel add-ons directories.
In large functions, it's often difficult to identify the precise source line triggering the exception. Fortunately, the -g and -machinecodelist mwcc options working in tandem can provide you with an enlightening interleaved assembly and source view of your code. Its contents will let you locate the crash in no time at all.
Look for kernel debugger improvements in R4, including breakpoints, tracing, and improved register shadowing (how many times have you wanted to do 'dis eip' or 'esp += 8'?). These commands and more will help ease the burden of locating bugs and can be used to avert impending crashes through judicious stack and pc manipulation.
In fine Hiroshi fashion, here are three insightful yet unrelated techniques for faster driver development:
ASSERT()
Although there is no predefined ASSERT() macro for kernel drivers, there's nothing to prevent you from selfishly cobbling one together for your personal use:

assert.h:

#ifndef DEBUG
#define ASSERT(c) 0
#else
int _assert_(char *, int, char *);
#define ASSERT(c) (!(c) ? _assert_(__FILE__, __LINE__, #c) : 0)
#endif

assert.c:

#ifdef DEBUG
int _assert_(char *a, int b, char *c)
{
    dprintf("tripped assertion in %s/%d (%s)\n", a, b, c);
    kernel_debugger("tripped assertion");
    return 0;
}
#endif
ioctl()
If the driver/add-on protocol includes an ioctl() facility, use it as a runtime debugging aid. For example, ioctls may be defined to print out important data structures or verify data integrity. Here's a short program you can modify to issue human-readable ioctl() commands to your driver.
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

struct cmds {
    char *string;
    int code;
} commands[] = {
    /* replace with your ioctls */
    { "dumpinfo", 10000 },
    { "verifyintegrity", 10001 },
    { "simulateerror", 10002 },
    { "reset", 10003 },
    { NULL, 0 }
};

static void print_help()
{
    int i;

    printf("usage: ioctl command files...\n");
    printf("commands: %s", commands[0].string);
    for (i = 1; commands[i].string != NULL; i++)
        printf(", %s", commands[i].string);
    printf("\n");
    exit(-1);
}

int main(int argc, char **argv)
{
    int fd, i, code;

    if (argc < 3)
        print_help();

    /* look the command up by name first */
    for (i = 0; commands[i].string; i++) {
        if (!strcasecmp(commands[i].string, argv[1])) {
            code = commands[i].code;
            break;
        }
    }

    /* not a known name: accept a plain decimal ioctl code */
    if (commands[i].string == NULL) {
        for (i = 0; argv[1][i]; i++)
            if ((argv[1][i] < '0') || (argv[1][i] > '9'))
                break;
        if (argv[1][i])
            print_help();
        code = atoi(argv[1]);
    }

    /* issue the ioctl to every file named on the command line */
    for (i = 2; i < argc; i++) {
        if ((fd = open(argv[i], O_RDWR)) < 0) {
            printf("error opening %s (%s)\n", argv[i], strerror(errno));
            continue;
        }
        ioctl(fd, code);
        close(fd);
    }

    return 0;
}
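On the driver side, the matching piece is a switch in the control hook that maps each debugging code to an action. The sketch below shows that dispatch as plain C you can run on a host. The command codes are the placeholder values from the tool above. The driver_state structure, the magic number, and debug_control() are hypothetical, and in a real driver the printf() call would be dprintf().

```c
#include <stdio.h>

/* Command codes matching the placeholder table in the tool above. */
enum {
    DBG_DUMPINFO        = 10000,
    DBG_VERIFYINTEGRITY = 10001,
    DBG_SIMULATEERROR   = 10002,
    DBG_RESET           = 10003
};

#define STATE_MAGIC 0x4D545252u    /* arbitrary marker for a valid state */

/* Hypothetical driver state, invented for this sketch. */
typedef struct {
    unsigned magic;          /* equals STATE_MAGIC while the state is sane */
    int      error_count;
} driver_state;

/* Returns 0 on success, -1 on a failed check or an unknown command. */
static int debug_control(driver_state *st, int op)
{
    switch (op) {
    case DBG_DUMPINFO:
        printf("magic=0x%x errors=%d\n", st->magic, st->error_count);
        return 0;
    case DBG_VERIFYINTEGRITY:
        return (st->magic == STATE_MAGIC) ? 0 : -1;
    case DBG_SIMULATEERROR:
        st->error_count++;
        return 0;
    case DBG_RESET:
        st->magic = STATE_MAGIC;
        st->error_count = 0;
        return 0;
    default:
        return -1;
    }
}
```

Keeping the debugging codes in one function makes it easy to compile them out of a release build.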
add_debugger_command()
ioctl() is nice and all, but you'll soon lust for some of that post-mortem loving. Fortunately, the kernel debugger allows you to hook in new commands with add_debugger_command(). This function registers a callback with the kernel debugger that is called with main()-style argc/argv arguments.
When writing a kernel debugger command, remember that it may be called while the kernel is in an unpredictable state. This means malloc() and friends are off limits, as they may induce swapping. dprintf() is also a no-no; use the stripped down equivalent kprintf() instead.
Since added kernel debugger commands normally lie in the driver's memory space, the kernel debugger runs into problems involving accessing unallocated memory when the driver is unloaded. Drivers should therefore remove added commands with remove_debugger_command() (new for R3.1) when they are unloaded.
Want code?
int do_echo(int argc, char **argv)
{
    int i;

    if (argc == 1) {
        kprintf("echo <args> - prints arguments\n");
        return 0;
    }

    for (i = 1; i < argc; i++)
        kprintf("%s\n", argv[i]);

    return 0;
}

status_t init_driver()
{
    add_debugger_command("echo", do_echo,
        "echo <args> - prints arguments");
    ...
}

status_t uninit_driver()
{
    remove_debugger_command("echo", do_echo);
    ...
}
It may not be as good an idea as showing off the most succulent applications BeOS developers are feverishly debugging, or writing, or thinking of. Or haven't thought of yet but will think of, write, debug, and demo by the time the show opens three weeks from now. But it's an idea nonetheless: an online guided tour of the Be community. If we get a good net connection, it might even be feasible. Let us pray to the gods of Unions and New York City Prices.
Why, you ask, should we take time away from A/V wizardry to click through a stack of Web pages? Especially since anybody can use a browser, but only a Demo God can demonstrate real-time WYSIWYG and multiple video capture windows. But a browser, commodity that it is, regulated or not, is uniquely capable of demonstrating the reality of a global Be community, bringing Be news to the masses as it happens.
For a preview of BeOS-ville as a PC Expo demo, you could start from any of several jumping off points. One is the Believe Finder, at http://www.napanet.net/~xredbear/finder.htm, a lively, personal, and fairly encyclopedic site, in the original meaning of encompassing all knowledge available. You'll find everything from FAQs on hardware requirements, to driver information, to humor, to Japan and Europe, to the whats and the whys of the BeOS.
I like the site's opinionated tone and I appreciate the tremendous effort involved in grouping and presenting so much information in such a useful way. But what does it have to do with PC Expo? For a start, Believe Finder treats visitors to a nice demo that shows off the best of BeOS audio and video. Only a minority of PC Expo visitors will have seen the BeOS before, or even heard about it. Believe Finder offers clear evidence that they might have been missing something.
But, after the glow of the demo wears off, questions arise: Who are these geeks, how real are they? That's why I think a tour of Be-dedicated Web sites at PC Expo might be a persuasive way to establish the existence of an active, diverse Be community worldwide.
And what about our own Web site? At http://www.be.com/usergroups/wwwlinks.html, you'll find a less encyclopedic but no less useful list of user group sites. Our tone is a bit more neutral, for obvious reasons, but you'll be vectored to a similarly lively hive of Be-related sites, often the same ones as in the Believe Finder. If you haven't done so already, take a quick tour of some of these sites. You'll see why I'd like PC Expo visitors to get at least a glimpse of our small world via the Web.