How to communicate with the Linux kernel: "The power of assembly programming"

  1. Identity of the "bloated software"
  2. Independence from GLIBC
  3. Requirement of exit function
  4. How to communicate with system calls in assembly?
  5. Brief anatomy of Executable and Linking Format (ELF)
  6. Library independent "Hello World"

Last updated 2001-07-30 5:36 pm


How to communicate with system calls in assembly?

xexit.asm
     
     ;
     ;       N A M E : x e x i t . a s m
     ;
     ;       D E S C : exit() by assembly
     ;
     ;       A U T H : Wataru Nishida, M.D., Ph.D.
     ;                 wnishida@skyfree.org
     ;                 http://www.skyfree.org
     ;
     ;       M A K E : nasm -f elf xexit.asm
     ;
     ;       V E R S : 1.0
     ;
     ;       D A T E : Jan. 5, 2001
     ;

                     bits 32                   ; Use 32bit mode.

     ; [ NOTE ] Code starts from here.

                     section .text             ; Code must reside in the ".text" in GCC.

                     global xexit              ; Declare xexit() as a public function.

     ;
     ;       x e x i t
     ;
     ;       void xexit (int status)
     ;
     ;       Jan. 5, 2001

     xexit:
                     mov     ebp, esp          ; Now, [ ebp ] points return address.
                                               ; and [ ebp+4 ] points 1st argument.
                     mov     ebx, [ ebp + 4 ]  ; Read status code.
                     mov     eax, 1            ; Call sys_exit().
                     int     0x80              ; Never returns.

The listing above is the assembly source of a new exit function, xexit(). It is written in NASM syntax. NASM (Netwide Assembler) is a free assembler for Intel x86 CPUs that uses the traditional Intel mnemonics and syntax. There is another free assembler, gas (GNU Assembler), but it uses the notorious AT&T syntax and, as far as I know, does not support 16-bit mode. I had been using TASM on DOS and was familiar with Intel mnemonics, so I chose NASM without hesitation. You can assemble the source with "nasm -f elf xexit.asm", which produces xexit.o; the -f elf option selects the ELF object format. Now let's focus on the source.

  1. Declare 32-bit mode at the beginning (bits 32; for 16-bit mode, use bits 16).
  2. Use section .text for the code segment (I'm not sure whether "segment" is the right term in this context).
  3. Declare xexit as a global symbol. If you forget this, the linker cannot see xexit and you can't call it from a C source file.
  4. The code begins under the "xexit:" label.
  5. Read the passed argument into EBX. In C, the caller pushes the arguments and then the return address onto the stack (each slot is one 32-bit word). Because xexit copies ESP into EBP on entry, [ ebp ] holds the return address, [ ebp+4 ] the first argument, [ ebp+8 ] the second argument, and so on.
  6. Set EAX to 1. This is the number of the exit system call. The Linux kernel provides many system calls through which a process communicates with it. You can see all of the supported system calls in /usr/include/asm/unistd.h (listed below), or there is a nice page you should visit.
  7. Finally, execute the exit system call with software interrupt 128 (0x80).
  8. This is the essence of the system call formula in assembly. It requires only 4 instructions!
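
The same formula can also be tried from C, without a separate assembly file. What follows is only a minimal sketch and not part of the original sources: raw_exit() and exit_status_of_child() are hypothetical names. On 32-bit x86 it issues int 0x80 through GCC inline assembly, mirroring steps 5 to 7. On any other architecture it falls back to glibc's syscall() wrapper, because both the trap instruction and the call number differ there (exit is number 60 on x86-64, for example). The call runs in a forked child so that the parent can read the status back, which means this demonstration does depend on libc.

```c
#include <assert.h>        /* assert(): check the status fits in 8 bits */
#include <sys/syscall.h>   /* SYS_exit: the exit call number for this CPU */
#include <sys/types.h>     /* pid_t */
#include <sys/wait.h>      /* waitpid(), WIFEXITED(), WEXITSTATUS() */
#include <unistd.h>        /* fork(), syscall() */

/* Hypothetical helper: terminate the process with a raw exit system call. */
void raw_exit(int status)
{
#if defined(__i386__)
    /* EAX = 1 (exit), EBX = status, then software interrupt 0x80. */
    __asm__ volatile ("int $0x80" : : "a"(1), "b"(status) : "memory");
#else
    /* Assumption: not a 32-bit x86 build, so let libc issue the call. */
    syscall(SYS_exit, status);
#endif
    for (;;) { }           /* Not reached: exit never returns. */
}

/* Hypothetical helper: run raw_exit() in a child process and return the
   status the parent observes, or -1 if the child could not be created
   or did not exit normally. */
int exit_status_of_child(int status)
{
    int wstatus = 0;
    pid_t pid;

    /* A parent only ever sees the low 8 bits of an exit status. */
    assert(status >= 0 && status <= 255);

    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        raw_exit(status);  /* Child: execution ends here. */
    if (waitpid(pid, &wstatus, 0) != pid || !WIFEXITED(wstatus))
        return -1;
    return WEXITSTATUS(wstatus);
}
```

Calling exit_status_of_child(123) should return 123, the same value that "echo $?" reports for test2 later in this section.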

/usr/include/asm/unistd.h
     
     /*
      * This file contains the system call numbers.
      */

     #define __NR_exit                 1
     #define __NR_fork                 2
     #define __NR_read                 3
     #define __NR_write                4
     #define __NR_open                 5
     #define __NR_close                6
     #define __NR_waitpid              7
     #define __NR_creat                8
     #define __NR_link                 9
     #define __NR_unlink              10
     ...continue...
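
These numbers belong to the 32-bit x86 kernel interface; other architectures use different values. As a small illustration (a sketch only, with the hypothetical names call_entry, call_table and call_number()), a C program can pick up the numbers for whatever machine it is compiled on through the SYS_* macros of <sys/syscall.h>, which are defined in terms of the __NR_* constants. Only calls that I expect to exist on every Linux architecture are listed.

```c
#include <assert.h>        /* assert(): guard against a NULL name */
#include <stddef.h>        /* NULL, size_t */
#include <string.h>        /* strcmp() */
#include <sys/syscall.h>   /* SYS_exit, SYS_read, ... for this architecture */

/* Hypothetical table: a few call names paired with their numbers. */
struct call_entry {
    const char *name;
    long number;
};

static const struct call_entry call_table[] = {
    { "exit",  SYS_exit  },
    { "read",  SYS_read  },
    { "write", SYS_write },
    { "close", SYS_close },
};

/* Hypothetical helper: return the number of the named system call on
   the architecture this file was compiled for, or -1 if the name is
   not in the table. */
long call_number(const char *name)
{
    size_t i;

    assert(name != NULL);
    for (i = 0; i < sizeof call_table / sizeof call_table[0]; i++) {
        if (strcmp(call_table[i].name, name) == 0)
            return call_table[i].number;
    }
    return -1;
}
```

On 32-bit x86, call_number("exit") gives 1, matching the listing above; on x86-64 the same name maps to 60, so the value loaded into EAX is tied to the architecture.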

To use the new xexit() function, we have to rewrite the C source as follows.

test2.c
     
     extern void xexit(int);

     main()
     {
       xexit(123);
     }

The function name is changed from exit() to xexit(), and a prototype declaration for it is added. Then compile and link it.

     
     $ gcc -c test2.c
     $ ld -e main -o test2 test2.o xexit.o
     $ ls -l test2*
     -rwxr-xr-x   1 root     src          1089 Jan  6 21:32 test2
     -rw-------   1 root     src            51 Jan  5 19:10 test2.c
     -rw-r--r--   1 root     src           816 Jan  6 21:32 test2.o
     $ ldd test2
             statically linked (ELF)
     $ nm test2
     080490ac A __bss_start
     080490ac A _edata
     080490ac A _end
     08048080 t gcc2_compiled.
     08048080 T main
     080480a0 T xexit
     $ ./test2 ; echo $?
     123

Wow, the new code was built successfully and it is completely free of libraries. We made it, and it works! However, the code size is still larger than I expected. As mentioned above, main() and xexit() contain at most 10 statements, so the object code should require less than 100 bytes. There must be some flab left. In the next section, we investigate the object code itself.