How to communicate with Linux kernel
" The power of assembly programming "
- Identity of the "bloated software"
- Independence from GLIBC
- Requirement of exit function
- How to communicate with system calls in assembly?
- Brief anatomy of Executable and Linking Format (ELF)
- Library independent "Hello World"
Last updated 2001-07-30 5:36 pm
How to communicate with system calls in assembly?
xexit.asm |
;
; N A M E : x e x i t . a s m
;
; D E S C : exit() by assembly
;
; A U T H : Wataru Nishida, M.D., Ph.D.
; wnishida@skyfree.org
; http://www.skyfree.org
;
; M A K E : nasm -f elf xexit.asm
;
; V E R S : 1.0
;
; D A T E : Jan. 5, 2001
;
bits 32 ; Use 32bit mode.
; [ NOTE ] Code starts from here.
section .text ; Code must reside in the ".text" section for GCC.
global xexit ; Declare xexit() as a public function.
;
; x e x i t
;
; void xexit (int status)
;
; Jan. 5, 2001
xexit:
mov ebp, esp ; Now, [ ebp ] points to the return address
; and [ ebp+4 ] points to the 1st argument.
mov ebx, [ ebp + 4 ] ; Read status code.
mov eax, 1 ; Call sys_exit().
int 0x80
; Never returns.
|
The listing above is the assembly source of a new exit function, xexit(). xexit
is written in NASM assembly format. NASM
(Netwide Assembler) is a free assembler for Intel x86 CPUs, and it
uses traditional Intel mnemonics and syntax. There is another free
assembler, gas (GNU Assembler), but it uses the notorious AT&T mnemonics and
does not support 16bit mode. I had been using TASM on DOS and was familiar with
Intel mnemonics, so I chose NASM without hesitation. You can easily assemble
the source with "nasm -f elf xexit.asm".
The -f elf option tells NASM to create an ELF compatible object file (xexit.o).
Now, let's focus on the source.
- Declare 32bit mode at the beginning (bits 32; in case of 16bit mode,
use bits 16).
- Use section .text for the code segment (I'm not sure whether
"segment" is the right term in this context.).
- Declare xexit as a global function name. If you forget it, you can't
call xexit from a C source file.
- The code begins under "xexit:".
- Read the passed argument into EBX. In C, the caller
pushes the arguments and then the return address onto the stack (usually,
each item occupies one 32bit word, that is, 4 bytes). After "mov ebp, esp",
[ ebp ] points to the return address, [ ebp+4 ] points to the first argument,
[ ebp+8 ] points to the second argument, and so on.
- Set EAX to one. This is the number of the exit system
call. The kernel expects the system call number in EAX and the first
argument of the call in EBX. The kernel supports many system calls for
communicating with processes. You can see all of the supported system calls
in /usr/include/asm/unistd.h (listed below), or there is a nice page
you should visit.
- Finally, execute the exit system call with software interrupt 128 (0x80).
- This is the essence of the system call formula in assembly. It requires only
4 instructions!
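For comparison, the same steps can be sketched in C. This is only an illustration under stated assumptions: it goes through glibc's syscall() wrapper, so unlike xexit.asm it is not library independent. SYS_exit comes from <sys/syscall.h> and holds the exit number for whatever architecture you compile for (1 on 32bit x86, as in xexit.asm). The names raw_exit() and exit_status_of_child() are hypothetical helpers made up for this sketch.

```c
#define _GNU_SOURCE          /* Make sure <unistd.h> declares syscall(). */
#include <sys/syscall.h>     /* SYS_exit: number of the exit system call. */
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: terminate through the raw exit system call,
 * in the same spirit as xexit(). */
void raw_exit(int status)
{
    syscall(SYS_exit, status);   /* Number and argument go to the kernel. */
    for (;;) { }                 /* Not reached: exit never returns. */
}

/* Hypothetical helper: call raw_exit() in a child process and return the
 * status the parent observes, which is what "echo $?" prints in a shell.
 * Returns -1 if the child could not be created or did not exit normally. */
int exit_status_of_child(int status)
{
    int wstatus = 0;
    pid_t pid = fork();

    if (pid < 0)
        return -1;               /* fork() failed. */
    if (pid == 0)
        raw_exit(status);        /* Child: leave via the system call. */
    if (waitpid(pid, &wstatus, 0) < 0)
        return -1;
    return WIFEXITED(wstatus) ? WEXITSTATUS(wstatus) : -1;
}
```

Calling exit_status_of_child(123) from main() should hand back the status that was given to the system call. Only the low 8 bits of the status reach the parent.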
/usr/include/asm/unistd.h |
/*
* This file contains the system call numbers.
*/
#define __NR_exit 1
#define __NR_fork 2
#define __NR_read 3
#define __NR_write 4
#define __NR_open 5
#define __NR_close 6
#define __NR_waitpid 7
#define __NR_creat 8
#define __NR_link 9
#define __NR_unlink 10
...continue...
|
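Every entry in this list is invoked in the same way; only the number and the arguments change. As a rough sketch, here is the write system call (number 4 in the list above) called from C. It assumes glibc's syscall() wrapper and the SYS_write constant from <sys/syscall.h>, which equals __NR_write for the architecture you compile for (other architectures use other numbers). raw_write() is a hypothetical name made up for this sketch.

```c
#define _GNU_SOURCE          /* Make sure <unistd.h> declares syscall(). */
#include <stddef.h>          /* size_t */
#include <sys/syscall.h>     /* SYS_write: number of the write system call. */
#include <unistd.h>

/* Hypothetical helper: write len bytes from buf to the file descriptor fd
 * through the raw write system call. Returns the number of bytes written,
 * or -1 on error (errno is set by the syscall() wrapper). */
long raw_write(int fd, const void *buf, size_t len)
{
    return syscall(SYS_write, fd, buf, len);
}
```

For example, raw_write(1, "hello\n", 6) sends six bytes to standard output (file descriptor 1).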
To use the new xexit() function, we have to rewrite the C source as follows.
test2.c |
extern void xexit(int); /* Implemented in xexit.asm. */
int main(void) {
xexit(123); /* Never returns. */
}
|
The function name is changed from exit() to xexit(), and its prototype
declaration is added. Then, compile and link it.
$ gcc -c test2.c
$ ld -e main -o test2 test2.o xexit.o
$ ls -l test2*
-rwxr-xr-x 1 root src 1089 Jan 6 21:32 test2
-rw------- 1 root src 51 Jan 5 19:10 test2.c
-rw-r--r-- 1 root src 816 Jan 6 21:32 test2.o
$ ldd test2
statically linked (ELF)
$ nm test2
080490ac A __bss_start
080490ac A _edata
080490ac A _end
08048080 t gcc2_compiled.
08048080 T main
080480a0 T xexit
$ ./test2 ; echo $?
123
|
Wow, the new code was created successfully and it is completely free of libraries.
We made it and it works! However, I think the code size (1089 bytes) is still
larger than I expected. As mentioned above, main() and xexit() contain at most 10
statements, so their object code should require less than 100 bytes. There must be
more flab somewhere. In the next section, we investigate the object code itself.