Assembly Programming -- section 4

How to communicate with Linux kernel "The power of assembly programming"

Identity of the "bloated software"
Independence from GLIBC
Requirement of exit function
How to communicate with system calls in assembly?
Brief anatomy of Executable and Linking Format (ELF)
Library independent "Hello World"

Last updated 2001-07-30 5:36 pm

How to communicate with system calls in assembly?

xexit.asm

     
;
; N A M E : x e x i t . a s m
;
; D E S C : exit() by assembly
;
; A U T H : Wataru Nishida, M.D., Ph.D.
;           wnishida@skyfree.org
;           http://www.skyfree.org
;
; M A K E : nasm -f elf xexit.asm
;
; V E R S : 1.0
;
; D A T E : Jan. 5, 2001
;

bits    32              ; Use 32bit mode.

; [ NOTE ] Code starts from here.

section .text           ; Code must reside in ".text", as GCC expects.
global  xexit           ; Declare xexit() as a public function.

;
;   x e x i t
;
;       void xexit (int status)
;
;                                               Jan. 5, 2001

xexit:
        mov     ebp, esp                ; Now, [ ebp ] points to the return address,
                                        ; and [ ebp+4 ] points to the 1st argument.

        mov     ebx, [ ebp + 4 ]        ; Read status code.
        mov     eax, 1                  ; 1 = sys_exit (system call number).
        int     0x80
                                        ; Never returns.

The listing above is the assembly source of a new exit function, xexit(). It is written in NASM format. NASM (Netwide Assembler) is a free assembler for Intel x86 CPUs that uses the traditional Intel mnemonics and syntax. There is another free assembler, gas (GNU Assembler), but it uses the notorious AT&T syntax and has only limited support for 16-bit code. I had been using TASM on DOS and was familiar with Intel mnemonics, so I chose NASM without hesitation. You can assemble the source with "nasm -f elf xexit.asm"; the -f elf option tells NASM to produce an ELF object file (xexit.o). Now, let's look at the source.

Declare 32-bit mode at the beginning with "bits 32" (for 16-bit mode, use "bits 16").
Use "section .text" for the code (I am not sure whether "segment" is the right term in this context).
Declare xexit as a global symbol. If you forget this, you cannot call xexit from a C source file.
The code begins at the label "xexit:".
Read the passed argument into EBX. In C, the caller pushes the arguments onto the stack, last argument first, and the call instruction then pushes the return address (each item usually occupies one 32-bit word). Because xexit copies ESP into EBP right at entry, [ ebp ] holds the return address, [ ebp+4 ] the first argument, [ ebp+8 ] the second argument, and so on.
Set EAX to 1. This is the number of the exit system call. The kernel provides many system calls through which a process communicates with it. You can see all of the supported system calls in /usr/include/asm/unistd.h (listed below), and there is also a nice page you should visit.
Finally, execute the exit system call with software interrupt 128 (0x80).
This is the essence of the system call formula in assembly. It requires only 4 instructions!
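
The argument lookup described in the steps above can be modelled in plain C. The sketch below is only a toy model, not real stack access: it fakes the stack with an array, pushes two arguments last-first as a C caller does on 32-bit x86, pushes a return address as the call instruction would, and then reads a word at a given offset from the top. All names (fake_stack, push32, peek32, model_call) are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model of a 32-bit x86 stack. Every name here is hypothetical. */
#define STACK_WORDS 16

static uint32_t fake_stack[STACK_WORDS];
static size_t   sp_index = STACK_WORDS;  /* index of the top word; grows downward */

/* Push one 32-bit word, like the "push" instruction. */
void push32(uint32_t value)
{
    assert(sp_index > 0);
    fake_stack[--sp_index] = value;
}

/* Read the 32-bit word at [esp + byte_offset]. */
uint32_t peek32(size_t byte_offset)
{
    size_t index = sp_index + byte_offset / 4;
    assert(byte_offset % 4 == 0 && index < STACK_WORDS);
    return fake_stack[index];
}

/* Model a call "f(arg1, arg2)" and return what the callee would
   find at [esp + byte_offset] on entry. */
uint32_t model_call(uint32_t arg1, uint32_t arg2,
                    uint32_t return_address, size_t byte_offset)
{
    sp_index = STACK_WORDS;      /* start from an empty stack         */
    push32(arg2);                /* caller pushes the last argument   */
    push32(arg1);                /* ... then the first argument       */
    push32(return_address);      /* "call" pushes the return address  */
    return peek32(byte_offset);
}
```

In this model, offset 0 yields the return address, offset 4 the first argument, and offset 8 the second, which is the layout xexit relies on when it reads [ ebp + 4 ].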

/usr/include/asm/unistd.h

     
/*
 * This file contains the system call numbers.
 */

#define __NR_exit                 1
#define __NR_fork                 2
#define __NR_read                 3
#define __NR_write                4
#define __NR_open                 5
#define __NR_close                6
#define __NR_waitpid              7
#define __NR_creat                8
#define __NR_link                 9
#define __NR_unlink              10
...continue...
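
These numbers are what ends up in EAX. As a rough illustration in C, glibc's syscall() wrapper takes the same kind of number followed by the arguments. Note that this is not the library-free approach of this section, since it depends on glibc, and that the SYS_* constants from <sys/syscall.h> hold the numbers for the machine you compile on (on x86-64, for example, exit is 60 rather than 1). The function names raw_getpid and raw_write are made up for this sketch.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/syscall.h>   /* SYS_getpid, SYS_write: system call numbers */
#include <sys/types.h>
#include <unistd.h>        /* syscall(), getpid() */

/* Ask the kernel for our process ID by system call number. */
long raw_getpid(void)
{
    return syscall(SYS_getpid);
}

/* Write len bytes from buf to file descriptor fd by system call number.
   Returns the number of bytes written, or -1 on error. */
long raw_write(int fd, const void *buf, size_t len)
{
    assert(buf != NULL);
    return syscall(SYS_write, fd, buf, len);
}
```

For example, raw_write(1, "Hello\n", 6) sends "Hello" to standard output, and raw_getpid() should agree with the library function getpid().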

To use the new xexit() function, we have to rewrite the C source as follows.

test2.c

     
extern void xexit(int);

main() {
  xexit(123);
 }

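The function name is changed from exit() to xexit(), and its prototype declaration is added. Then compile and link it. The -e main option makes main the entry point, because no C startup code is linked in.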

     
$ gcc -c test2.c
$ ld -e main -o test2 test2.o xexit.o
$ ls -l test2*
-rwxr-xr-x    1 root     src          1089 Jan  6 21:32 test2
-rw-------    1 root     src            51 Jan  5 19:10 test2.c
-rw-r--r--    1 root     src           816 Jan  6 21:32 test2.o

$ ldd test2
        statically linked (ELF)

$ nm test2
080490ac A __bss_start
080490ac A _edata
080490ac A _end
08048080 t gcc2_compiled.
08048080 T main
080480a0 T xexit

$ ./test2 ; echo $?
123

Wow, the new program was built successfully and it is completely free of libraries. We made it, and it works! However, the file is still larger than I expected. As mentioned above, main() and xexit() together contain at most 10 instructions, so their object code should need less than 100 bytes. There must be more flab somewhere. In the next section, we investigate the object code itself.