   - Documentation for KMSGDUMP v0.1 - [Mon Jun 28 03:24:47 MET DST 1999] - Willy Tarreau

What is KMSGDUMP ?
~~~~~~~~~~~~~~~~~~
KMSGDUMP is an extension to SysRQ which allows the user on the console to dump
the last kernel messages onto a floppy diskette, so that you no longer need a
pen and paper to copy them when the system is stuck. Only 3"1/2, 1.44 MB
diskettes are supported. Higher capacities might work, but lower ones should
not, because the system dumps 32 sectors at a time. If you really have a need
for other capacities, please drop me a mail.

How does it work ?
~~~~~~~~~~~~~~~~~~
There are two ways of getting a dump :
  - by pressing SysRQ-D (RightAlt - PrintScrn - D together) ;
  - after a kernel panic has occurred, a dump may be automatically generated.

Before anything else, you MUST KNOW that, in order to maximize the chances of
completing the dump successfully, the CPU is rebooted in real mode and disk
accesses are made via the BIOS. This ensures that even if kernel memory is
badly corrupted, the dump still has a chance to work, but it also implies that
after a dump has occurred, it is IMPOSSIBLE TO CONTINUE TO WORK WITH THE
CURRENT KERNEL. You will have to REBOOT. So while your kernel still responds,
you had better get a similar dump by entering the following command :

    # dmesg > /dev/fd0

Description of the manual operation
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
When you hit SysRQ-D, the CPU immediately reboots in real mode, reinitializes
parts of the hardware (interrupt controller, ...) and begins to beep once a
second while waiting for you to press a key. This annoying beep is there so
that you can be sure the system is waiting for you. At the moment, 2 keys have
a special meaning, and all others generate a dump :

  - H (or h, case doesn't matter) : completely halts the system.
  - B (or b) : does a cold reboot of the system through the BIOS.

If any other key is pressed, a dump of the kernel messages is sent to the
diskette in the drive.
At the moment, the drive number and track number are fixed to 0 and 0 in
"arch/i386/kernel/process.c", which corresponds to the first track of drive A.
In a later version, this will be configurable at boot time, on the lilo
command line, and from /proc/sys/kernel.

Description of automated operation
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Automated operation is performed by the system when a kernel panic occurs. In
this case, the system gives you a few seconds in case you want to try SysRQ
(unmount filesystems, sync, ...), then forces a dump request instead of
hanging or rebooting. The time you have to try SysRQ is set by writing a
number of seconds to "/proc/sys/kernel/panic".

There are 5 automated operation modes :
  - don't dump, and halt
  - don't dump, and reboot (useful on a server)
  - dump, then halt
  - dump, then reboot (useful on a server, too)
  - enter manual operation, just as if you had pressed SysRQ-D.

Automatic dump is VERY DANGEROUS because if you left a diskette in the drive
and the system hangs, the contents of your diskette will be destroyed. But
under some circumstances (problem with interrupts, keyboard ...), this is the
ONLY way to get a dump after a kernel panic. I suggest using it on important
servers which don't keep a diskette in their drive all the time. On such
servers, the automatic reboot is useful because the dead service quickly comes
back. On a workstation, I suggest using manual operation because there are
often diskettes in the drives, and it's not that hard to hit a key and wait
for the system to dump and reboot.

Reading back the messages
~~~~~~~~~~~~~~~~~~~~~~~~~
If the kernel was compiled with the option "CONFIG_KMSG_DUMP_FAT", a small FAT
filesystem (1 track) will be formatted on the floppy, and the dump will reside
in its only file, named "MESSAGES.TXT".
Messages will be readable by :

  - mounting the floppy as an "msdos" filesystem :

    # mount -t msdos /dev/fd0 /mnt
    # cat /mnt/messages.txt | tr -d '\000'

  - using mtools :

    # mtype a:messages.txt | tr -d '\000'

I appended "tr -d '\000'" to each command because the file is padded with
zeroes : if the messages buffer isn't full, you would otherwise get a run of
useless zero bytes.

If the option "CONFIG_KMSG_DUMP_FAT" is disabled, the messages start at the
first sector of the diskette. You'll read them like this :

    # dd if=/dev/fd0 bs=512 count=32 | tr -d '\000'

For more information and/or suggestions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
For more information, you can email me at :

    willy@meta-x.org

(be patient, I read my mail when I can, and can't always reply. I'm used to
"tail -1000 $MAIL|more")

For suggestions, you can either email them to me, or share them with the Linux
Kernel Mailing List :

    linux-kernel@vger.rutgers.edu

Willy
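
Example : reading a raw dump from a device or an image file
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The raw read-back described under "Reading back the messages" can be wrapped
in a small shell function. This is only a sketch : the function name
"read_raw_dump" is hypothetical and is not part of KMSGDUMP. It assumes a dump
made without "CONFIG_KMSG_DUMP_FAT", that is, 32 sectors of 512 bytes starting
at the first sector and padded with zero bytes.

```shell
#!/bin/sh
# Hypothetical helper, not shipped with KMSGDUMP.
# Reads a raw dump (no CONFIG_KMSG_DUMP_FAT) from a device or an image file.
# Assumption: the dump occupies the first 32 sectors of 512 bytes each.
read_raw_dump() {
    # Default to the first floppy drive when no argument is given.
    src=${1:-/dev/fd0}
    # Copy the first 32 sectors and strip the zero padding.
    dd if="$src" bs=512 count=32 2>/dev/null | tr -d '\000'
}
```

Since floppies are slow and fragile, it may be more convenient to copy the
diskette to an image file once, for instance with
"dd if=/dev/fd0 of=floppy.img bs=512 count=32", and then to call
"read_raw_dump floppy.img" as many times as needed.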