How to communicate with the Linux kernel   "The power of assembly programming"

  1. Identity of the "bloated software"
  2. Independence from GLIBC
  3. Requirement of exit function
  4. How to communicate with system calls in assembly?
  5. Brief anatomy of Executable and Linking Format (ELF)
  6. Library independent "Hello World"

Last updated 2001-07-30 5:34 pm


Identity of the "bloated software"

test0.c
     
int main() { return(123); }

As you can see, this program simply exits the "main()" function with a return code of 123. Let's compile the source and link it.

     
$ gcc -o test0 test0.c
$ ls -l test0
-rwxr-xr-x 1 root src 4636 Jan 6 20:38 test0
$ ./test0 ; echo $?
123

The executable is 4636 bytes (so huge!) and the process successfully returns 123 to the shell (the shell's special parameter $? holds the exit status of the last command). Next, examine the library dependencies (ldd) and the symbol definitions (nm) of the executable.


$ ldd test0
libc.so.6 => /lib/libc.so.6 (0x40017000)
/lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x40000000)
$ nm test0
08048314 t Letext
0804945c ? _DYNAMIC
08049440 ? _GLOBAL_OFFSET_TABLE_
0804841c R _IO_stdin_used
08049434 ? __CTOR_END__
08049430 ? __CTOR_LIST__
0804943c ? __DTOR_END__
08049438 ? __DTOR_LIST__
0804942c ? __EH_FRAME_BEGIN__
0804942c ? __FRAME_END__
080494fc A __bss_start
08049420 D __data_start
         w __deregister_frame_info@@GLIBC_2.0
080483d0 t __do_global_ctors_aux
08048320 t __do_global_dtors_aux
         w __gmon_start__
         U __libc_start_main@@GLIBC_2.0
         w __register_frame_info@@GLIBC_2.0
080494fc A _edata
08049514 A _end
080483fc ? _fini
         U _fp_hw
08048274 ? _init
080482f0 T _start
08049428 d completed.4
08049420 W data_start
08048370 t fini_dummy
0804942c d force_to_data
0804942c d force_to_data
08048378 t frame_dummy
08048314 t gcc2_compiled.
08048320 t gcc2_compiled.
080483d0 t gcc2_compiled.
080483fc t gcc2_compiled.
080483b0 t gcc2_compiled.
0804839c t init_dummy
080483f4 t init_dummy
080483b0 T main
080494fc b object.11
08049424 d p.3

ldd shows that test0 depends on libc.so.6 (GNU libc 6), and nm prints many unfamiliar symbols. We are not concerned with the precise meaning of each symbol, but note "T main" near the bottom. This is the familiar name we defined in the source. By the way, sharp readers may have noticed that the large file size is related to these unfamiliar symbols. Yes, that's right!

     
$ gcc -v -o test0 test0.c
Reading specs from /usr/lib/gcc-lib/i386-linux/2.95.2/specs
gcc version 2.95.2 20000220 (Debian GNU/Linux)
/usr/lib/gcc-lib/i386-linux/2.95.2/cpp -lang-c -v -D__GNUC__=2 -D__GNUC_MINOR__=95 -D__ELF__ -Dunix -D__i386__ -Dlinux -D__ELF__ -D__unix__ -D__i386__ -D__linux__ -D__unix -D__linux -Asystem(posix) -Acpu(i386) -Amachine(i386) -Di386 -D__i386 -D__i386__ test0.c /tmp/cc6P8NZ9.i
GNU CPP version 2.95.2 20000220 (Debian GNU/Linux) (i386 Linux/ELF)
#include "..." search starts here:
#include <...> search starts here:
/usr/local/include
/usr/lib/gcc-lib/i386-linux/2.95.2/include
/usr/include
End of search list.
The following default directories have been omitted from the search path:
/usr/lib/gcc-lib/i386-linux/2.95.2/../../../../include/g++-3
/usr/lib/gcc-lib/i386-linux/2.95.2/../../../../i386-linux/include
End of omitted list.
/usr/lib/gcc-lib/i386-linux/2.95.2/cc1 /tmp/cc6P8NZ9.i -quiet -dumpbase test0.c -version -o /tmp/ccgz1veb.s
GNU C version 2.95.2 20000220 (Debian GNU/Linux) (i386-linux) compiled by GNU C version 2.95.2 20000220 (Debian GNU/Linux).
as -V -Qy -o /tmp/ccgJ2zye.o /tmp/ccgz1veb.s
GNU assembler version 2.9.5 (i386-linux) using BFD version 2.9.5.0.37
/usr/lib/gcc-lib/i386-linux/2.95.2/collect2 -m elf_i386 -dynamic-linker /lib/ld-linux.so.2 -o test0 /usr/lib/crt1.o /usr/lib/crti.o /usr/lib/gcc-lib/i386-linux/2.95.2/crtbegin.o -L/usr/lib/gcc-lib/i386-linux/2.95.2 /tmp/ccgJ2zye.o -lgcc -lc -lgcc /usr/lib/gcc-lib/i386-linux/2.95.2/crtend.o /usr/lib/crtn.o

GCC automatically runs several tools during compilation, although we usually do not notice. To see what happens behind the scenes, specify the -v option, which prints each command as it is executed. GCC runs the following programs.

1) "cpp ... test0.c /tmp/cc6P8NZ9.i" preprocesses the test0.c source and writes the result to /tmp/cc6P8NZ9.i.

2) "cc1 /tmp/cc6P8NZ9.i ... /tmp/ccgz1veb.s" analyzes and compiles the preprocessed /tmp/cc6P8NZ9.i and outputs the assembly source /tmp/ccgz1veb.s.

3) "as ... -o /tmp/ccgJ2zye.o /tmp/ccgz1veb.s" assembles /tmp/ccgz1veb.s and outputs the object file /tmp/ccgJ2zye.o.

4) Finally "collect2 -m elf_i386 -dynamic-linker /lib/ld-linux.so.2 -o test0 ... crt1.o crti.o crtbegin.o /tmp/ccgJ2zye.o -lgcc -lc crtend.o crtn.o" links the object file with five predefined objects (crtxxx.o; listed below). The linker also refers to the GCC and C libraries (-lgcc and -lc), and creates the executable file test0 (elf_i386 format).

     
-rw-r--r-- 1 root root 1188 May 2 2000 /usr/lib/crt1.o
-rw-r--r-- 1 root root 1096 May 2 2000 /usr/lib/crti.o
-rw-r--r-- 1 root root 827 May 2 2000 /usr/lib/crtn.o
-rw-r--r-- 1 root root 1900 Jun 20 2000 /usr/lib/gcc-lib/i386-linux/2.95.2/crtbegin.o
-rw-r--r-- 1 root root 1408 Jun 20 2000 /usr/lib/gcc-lib/i386-linux/2.95.2/crtend.o

Now we have found the identity of the "bloated software". The overhead comes from the many startup files and the library dependency. In the next section, we try to free ourselves from GNU libc.