Pervasive Debugging
===================

Alex Ho (alex.ho at cl.cam.ac.uk)

Introduction
------------

The pervasive debugging project is leveraging Xen to debug
distributed systems.  We have added a gdb stub to Xen to allow
for remote debugging of both Xen and guest operating systems.
More information about the pervasive debugger is available at:
http://www.cl.cam.ac.uk/netos/pdb


Implementation
--------------

The gdb stub communicates with gdb over a serial line.  The main
entry point is pdb_handle_exception(), which is invoked from:

    pdb_key_pressed()     ('D' on the console)
    do_int3_exception()   (interrupt 3: breakpoint exception)
    do_debug()            (interrupt 1: debug exception)

The handler accepts characters from the serial port and passes gdb
commands to pdb_process_command(), which implements the gdb stub
interface.  The stub draws heavily from the kgdb project and the
sample gdbstub provided with gdb.

The stub can examine registers, single step and continue, and
read and write memory (in Xen, a domain, or a Linux process'
address space).  The debugger does not currently trace the
current process, so all bets are off if a context switch occurs
in the domain.


Setup
-----

  +-------+ telnet +-----------+ serial +-------+
  |  GDB  |--------|  nsplitd  |--------|  Xen  |
  +-------+        +-----------+        +-------+

To run pdb, Xen must be appropriately configured and a suitable
serial interface attached to the target machine.  GDB and nsplitd
can run on the same machine.
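
The GDB-to-Xen path in the diagram carries gdb's remote serial
protocol, the same framing that produces the $c#63 packet mentioned
under Usage.  As a rough illustration only (this is not the pdb
source, and the function names are hypothetical), a packet is the
payload wrapped in '$' and '#', followed by the modulo-256 sum of
the payload bytes as two hex digits.  Acknowledgements and escaping
are left out of this sketch.

```python
def checksum(payload: str) -> int:
    """Modulo-256 sum of the payload bytes, as used by gdb packets."""
    return sum(payload.encode("ascii")) % 256


def frame_packet(payload: str) -> str:
    """Wrap a payload as $<payload>#<two hex checksum digits>."""
    return "${}#{:02x}".format(payload, checksum(payload))


def parse_packet(packet: str) -> str:
    """Return the payload of a framed packet, verifying its checksum."""
    if len(packet) < 4 or packet[0] != "$" or packet[-3] != "#":
        raise ValueError("malformed packet: %r" % packet)
    payload = packet[1:-3]
    if int(packet[-2:], 16) != checksum(payload):
        raise ValueError("bad checksum in packet: %r" % packet)
    return payload


if __name__ == "__main__":
    # The "continue" command is the single character 'c' (0x63).
    print(frame_packet("c"))
    print(parse_packet("$c#63"))
```

The checksum lets either end reject a packet that was corrupted on
the serial line.
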

Xen Configuration

  Add the "pdb=xxx" option to your Xen boot command line,
  where xxx is one of the following values:

    com1    gdb stub should communicate on com1
    com1H   gdb stub should communicate on com1 (with high bit set)
    com2    gdb stub should communicate on com2
    com2H   gdb stub should communicate on com2 (with high bit set)

  Symbolic debugging information is quite helpful too:

    xeno.bk/xen/arch/i386/Rules.mk
      add -g to CFLAGS to compile Xen with symbols
    xeno.bk/xenolinux-2.4.24-sparse/arch/xen/Makefile
      add -g to CFLAGS to compile Linux with symbols

  You may also want to consider dedicating a register to the
  frame pointer (disable the -fomit-frame-pointer compile flag).

  When booting Xen and domain 0, look for the console text
  "Initializing pervasive debugger (PDB)" just before DOM0 starts up.

Serial Port Configuration

  pdb expects to communicate with gdb using the serial port.  Since
  this port is often shared with the machine's console output, pdb can
  discriminate its communication by setting the high bit of each byte.

  A new tool has been added to the source tree which splits
  the serial output from a remote machine into two streams:
  one stream (without the high bit) is the console and
  one stream (with the high bit stripped) is the pdb communication.

  See:  xeno.bk/tools/nsplitd

  nsplitd configuration
  ---------------------
  hostname$ more /etc/xinetd.d/nsplit
  service nsplit1
  {
          socket_type     = stream
          protocol        = tcp
          wait            = no
          user            = wanda
          server          = /usr/sbin/in.nsplitd
          server_args     = serial.cl.cam.ac.uk:wcons00
          disable         = no
          only_from       = 128.232.0.0/17 127.0.0.1
  }

  hostname$ egrep 'wcons00|nsplit1' /etc/services
  wcons00         9600/tcp        # Wanda remote console
  nsplit1         12010/tcp       # Nemesis console splitter ports.

  Note: nsplitd was originally written for the Nemesis project
  at Cambridge.

  After nsplitd accepts a connection on (12010 in the above
  example), it starts listening on port .  Characters sent to the
  will have the high bit set and vice versa for characters received.
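
  The high-bit scheme described above can be sketched in a few lines.
  This is only an illustration of the idea, not the nsplitd source:
  the function names are hypothetical, and it assumes console output
  is 7-bit ASCII so that the high bit is free to mark pdb traffic.

```python
from typing import Tuple


def tag_for_pdb(data: bytes) -> bytes:
    """Set the high bit on every byte, marking it as pdb traffic."""
    return bytes(b | 0x80 for b in data)


def split_serial_stream(data: bytes) -> Tuple[bytes, bytes]:
    """Split mixed serial output into (console, pdb) byte streams.

    Bytes with the high bit clear are console output; bytes with the
    high bit set are pdb traffic and have the bit stripped again.
    """
    console = bytearray()
    pdb = bytearray()
    for byte in data:
        if byte & 0x80:
            pdb.append(byte & 0x7F)
        else:
            console.append(byte)
    return bytes(console), bytes(pdb)


if __name__ == "__main__":
    # Console text and a tagged gdb packet interleaved on one line.
    mixed = b"(XEN) " + tag_for_pdb(b"$c#63") + b"boot\n"
    console, pdb = split_serial_stream(mixed)
    print(console)
    print(pdb)
```

  Because the two kinds of traffic are told apart byte by byte, they
  can be interleaved freely on the shared serial line.
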

  You can connect to the nsplitd using
  'tools/xenctl/lib/console_client.py '

GDB 6.0

  pdb has been tested with gdb 6.0.  It should also work with
  earlier versions.


Usage
-----

1. Boot Xen and Linux

2. Interrupt Xen by pressing 'D' at the console

   You should see the console message:

   (XEN) pdb_handle_exception [0x88][0x101000:0xfc5e72ac]

   At this point Xen is frozen and the pdb stub is waiting for gdb
   commands on the serial line.

3. Attach with gdb

   (gdb) file xeno.bk/xen/xen
   Reading symbols from xeno.bk/xen/xen...done.
   (gdb) target remote :                  /* contact nsplitd */
   Remote debugging using serial.srg:12131
   continue_cpu_idle_loop () at current.h:10
   warning: shared library handler failed to enable breakpoint
   (gdb) break __enter_scheduler
   Breakpoint 1 at 0xfc510a94: file schedule.c, line 330.
   (gdb) cont
   Continuing.

   Program received signal SIGTRAP, Trace/breakpoint trap.
   __enter_scheduler () at schedule.c:330
   (gdb) step
   (gdb) step
   (gdb) print next    /* the variable prev has been optimized away! */
   $1 = (struct task_struct *) 0x0
   (gdb) delete
   Delete all breakpoints? (y or n) y

4. You can add additional symbols to gdb

   (gdb) add-sym xenolinux-2.4.24/vmlinux
   add symbol table from file "xenolinux-2.4.24/vmlinux" at
   (y or n) y
   Reading symbols from xenolinux-2.4.24/vmlinux...done.
   (gdb) x/s cpu_vendor_names[0]
   0xc01530d2 :  "Intel"
   (gdb) break free_uid
   Breakpoint 2 at 0xc0012250
   (gdb) cont
   Continuing.                            /* run a command in domain 0 */

   Program received signal SIGTRAP, Trace/breakpoint trap.
   free_uid (up=0xbffff738) at user.c:77

   (gdb) print *up
   $2 = {__count = {counter = 0}, processes = {counter = 135190120}, files = {
       counter = 0}, next = 0x395, pprev = 0xbffff878, uid = 134701041}
   (gdb) finish
   Run till exit from #0  free_uid (up=0xbffff738) at user.c:77

   Program received signal SIGTRAP, Trace/breakpoint trap.
   release_task (p=0xc2da0000) at exit.c:51
   (gdb) print *p
   $3 = {state = 4, flags = 4, sigpending = 0, addr_limit = {seg = 3221225472},
     exec_domain = 0xc016a040, need_resched = 0, ptrace = 0, lock_depth = -1,
     counter = 1, nice = 0, policy = 0, mm = 0x0, processor = 0,
     cpus_runnable = 1, cpus_allowed = 4294967295, run_list = {next = 0x0,
       prev = 0x0}, sleep_time = 18995, next_task = 0xc017c000,
     prev_task = 0xc2f94000, active_mm = 0x0, local_pages = {next = 0xc2da0054,
       prev = 0xc2da0054}, allocation_order = 0, nr_local_pages = 0,
     ...

5. To resume Xen, enter the "continue" command to gdb.
   This sends the packet $c#63 along the serial channel.

   (gdb) cont
   Continuing.


Debugging Multiple Domains & Processes
--------------------------------------

pdb supports debugging multiple domains and processes.  You can
switch between different domains, and between processes within a
domain, and examine variables in each.

The pdb context identifies the current debug target.  It is stored
in the Xen variable pdb_ctx and defaults to Xen.

  target      pdb_ctx.domain    pdb_ctx.process
  ------      --------------    ---------------
  xen              -1                 -1
  guest os      0,1,2,...             -1
  process       0,1,2,...          0,1,2,...

Unfortunately, gdb doesn't understand debugging multiple processes
simultaneously (we're working on it), so at present you are
limited to just one set of symbols for symbolic debugging.
When debugging processes, pdb currently supports just Linux 2.4.

  define setup
    file xeno-clone/xeno.bk/xen/xen
    add-sym xeno-clone/xenolinux-2.4.25/vmlinux
    add-sym ~ach61/a.out
  end

1. Connect with gdb as before.  A couple of Linux-specific
   symbols need to be defined.

   (gdb) target remote :                  /* contact nsplitd */
   Remote debugging using serial.srg:12131
   continue_cpu_idle_loop () at current.h:10
   warning: shared library handler failed to enable breakpoint
   (gdb) set pdb_pidhash_addr = &pidhash
   (gdb) set pdb_init_task_union_addr = &init_task_union

2. The pdb context defaults to Xen and we can read Xen's memory.
   An attempt to access domain 0 memory fails.

   (gdb) print pdb_ctx
   $1 = {valid = 0, domain = -1, process = -1, ptbr = 1052672}
   (gdb) print hexchars
   $2 = "0123456789abcdef"
   (gdb) print cpu_vendor_names
   Cannot access memory at address 0xc0191f80

3. Now we change to domain 0.  In addition to changing pdb_ctx.domain,
   we need to change pdb_ctx.valid to signal pdb of the change.
   It is now possible to examine Xen and Linux memory.

   (gdb) set pdb_ctx.domain=0
   (gdb) set pdb_ctx.valid=1
   (gdb) print hexchars
   $3 = "0123456789abcdef"
   (gdb) print cpu_vendor_names
   $4 = {0xc0158b46 "Intel", 0xc0158c37 "Cyrix", 0xc0158b55 "AMD",
     0xc0158c3d "UMC", 0xc0158c41 "NexGen", 0xc0158c48 "Centaur",
     0xc0158c50 "Rise", 0xc0158c55 "Transmeta"}

4. Now change to a process within domain 0.  Again, we need to
   change pdb_ctx.valid in addition to pdb_ctx.process.

   (gdb) set pdb_ctx.process=962
   (gdb) set pdb_ctx.valid =1
   (gdb) print pdb_ctx
   $1 = {valid = 0, domain = 0, process = 962, ptbr = 52998144}
   (gdb) print aho_a
   $2 = 20

5. Now we can read the same variable from another process running
   the same executable in another domain.

   (gdb) set pdb_ctx.domain=1
   (gdb) set pdb_ctx.process=1210
   (gdb) set pdb_ctx.valid=1
   (gdb) print pdb_ctx
   $3 = {valid = 0, domain = 1, process = 1210, ptbr = 70574080}
   (gdb) print aho_a
   $4 = 27


Changes
-------

04.02.05 aho creation
04.03.31 aho add description on debugging multiple domains