How to communicate with Linux kernel
" The power of assembly programming "
- Identity of the "bloated software"
- Independence from GLIBC
- Requirement of exit function
- How to communicate with system calls in assembly?
- Brief anatomy of Executable and Linking Format (ELF)
- Library independent "Hello World"
Last updated 2001-07-30 5:34 pm
Identity of the "bloated software"
test0.c |
int main() {
return(123);
}
|
As you can see, this program simply exits the "main()" function with
a return code of 123. Let's compile the source and link it.
$ gcc -o test0 test0.c
$ ls -l test0
-rwxr-xr-x 1 root src 4636 Jan 6 20:38 test0
$ ./test0 ; echo $?
123
|
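The path that carries 123 from main() to the shell can be sketched in C. This is a minimal illustration, not the shell's or glibc's actual code: the names program_main and run_and_collect_status are hypothetical, and the child process merely stands in for ./test0. The child hands the return value of its main-like function to exit(), which is what the C startup code does once main() returns, and the parent collects the status with waitpid(), which is how a shell obtains the value it reports as $?.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for main() in test0.c. */
static int program_main(void)
{
    return 123;
}

/* Hypothetical helper: run program_main() in a child process and
 * return the exit status the parent observes, or -1 on failure. */
int run_and_collect_status(void)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;                  /* fork failed */

    if (pid == 0)
        exit(program_main());       /* child: what startup code does after main() */

    if (waitpid(pid, &status, 0) < 0)
        return -1;                  /* wait failed */

    /* WEXITSTATUS extracts the low 8 bits of the value passed to exit(). */
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Called from a small test main(), run_and_collect_status() should return 123, the same value that "echo $?" printed in the transcript.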
The executable is 4636 bytes (huge for a program that only returns a number!), and the process successfully
returns 123 to the shell (the special shell parameter $? holds the exit status of the last command).
Next, examine the library dependencies (ldd)
and the symbol definitions (nm)
of the executable.
$ ldd test0
libc.so.6 => /lib/libc.so.6 (0x40017000)
/lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x40000000)
$ nm test0
08048314 t Letext
0804945c ? _DYNAMIC
08049440 ? _GLOBAL_OFFSET_TABLE_
0804841c R _IO_stdin_used
08049434 ? __CTOR_END__
08049430 ? __CTOR_LIST__
0804943c ? __DTOR_END__
08049438 ? __DTOR_LIST__
0804942c ? __EH_FRAME_BEGIN__
0804942c ? __FRAME_END__
080494fc A __bss_start
08049420 D __data_start
w __deregister_frame_info@@GLIBC_2.0
080483d0 t __do_global_ctors_aux
08048320 t __do_global_dtors_aux
w __gmon_start__
U __libc_start_main@@GLIBC_2.0
w __register_frame_info@@GLIBC_2.0
080494fc A _edata
08049514 A _end
080483fc ? _fini
U _fp_hw
08048274 ? _init
080482f0 T _start
08049428 d completed.4
08049420 W data_start
08048370 t fini_dummy
0804942c d force_to_data
0804942c d force_to_data
08048378 t frame_dummy
08048314 t gcc2_compiled.
08048320 t gcc2_compiled.
080483d0 t gcc2_compiled.
080483fc t gcc2_compiled.
080483b0 t gcc2_compiled.
0804839c t init_dummy
080483f4 t init_dummy
080483b0 T main
080494fc b object.11
08049424 d p.3
|
ldd shows that test0 depends on libc.so.6 (GNU libc6), and nm prints
many unfamiliar symbols. We are not concerned with the precise meaning of each symbol,
but note "T main" near the bottom. This is the familiar
name we defined in the source. By the way, sharp readers may have guessed that the large
code size is related to these unknown symbols. Yes, that's right!
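Not every symbol in the listing comes from a library or a startup object. Entries such as __bss_start, _edata and _end are defined by the linker itself to mark boundaries in the program image. The following sketch reads two of them from C. It assumes a GNU-compatible linker on Linux, and the helper name bss_region_size is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Symbols provided by the GNU linker, not by any C source file.
 * Only their addresses are meaningful, so they are declared as arrays. */
extern char __bss_start[];
extern char _end[];

/* Number of bytes between __bss_start and _end in the running program. */
size_t bss_region_size(void)
{
    return (size_t)((uintptr_t)_end - (uintptr_t)__bss_start);
}
```

For the test0 binary in the nm listing, the same subtraction gives 0x08049514 - 0x080494fc = 0x18, that is, 24 bytes.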
$ gcc -v -o test0 test0.c
Reading specs from /usr/lib/gcc-lib/i386-linux/2.95.2/specs
gcc version 2.95.2 20000220 (Debian GNU/Linux)
/usr/lib/gcc-lib/i386-linux/2.95.2/cpp -lang-c -v -D__GNUC__=2 -D__GNUC_MINOR__=95 -D__ELF__ -Dunix -D__i386__ -Dlinux
-D__ELF__ -D__unix__ -D__i386__ -D__linux__ -D__unix -D__linux -Asystem(posix) -Acpu(i386) -Amachine(i386) -Di386 -D__i386
-D__i386__ test0.c /tmp/cc6P8NZ9.i
GNU CPP version 2.95.2 20000220 (Debian GNU/Linux) (i386 Linux/ELF)
#include "..." search starts here:
#include <...> search starts here:
/usr/local/include
/usr/lib/gcc-lib/i386-linux/2.95.2/include
/usr/include
End of search list.
The following default directories have been omitted from the search path:
/usr/lib/gcc-lib/i386-linux/2.95.2/../../../../include/g++-3
/usr/lib/gcc-lib/i386-linux/2.95.2/../../../../i386-linux/include
End of omitted list.
/usr/lib/gcc-lib/i386-linux/2.95.2/cc1 /tmp/cc6P8NZ9.i -quiet -dumpbase test0.c -version -o /tmp/ccgz1veb.s
GNU C version 2.95.2 20000220 (Debian GNU/Linux) (i386-linux) compiled by GNU C version 2.95.2 20000220 (Debian GNU/Linux).
as -V -Qy -o /tmp/ccgJ2zye.o /tmp/ccgz1veb.s
GNU assembler version 2.9.5 (i386-linux) using BFD version 2.9.5.0.37
/usr/lib/gcc-lib/i386-linux/2.95.2/collect2 -m elf_i386 -dynamic-linker /lib/ld-linux.so.2 -o test0 /usr/lib/crt1.o
/usr/lib/crti.o /usr/lib/gcc-lib/i386-linux/2.95.2/crtbegin.o -L/usr/lib/gcc-lib/i386-linux/2.95.2 /tmp/ccgJ2zye.o
-lgcc -lc -lgcc /usr/lib/gcc-lib/i386-linux/2.95.2/crtend.o /usr/lib/crtn.o
|
GCC automatically runs several tools during compilation, but we do not
usually notice them. To see what happens behind the scenes, specify the -v
option, which prints each command as it is executed, as in the listing above. GCC runs
the following programs.
1) "cpp ... test0.c /tmp/cc6P8NZ9.i" preprocesses the test0.c
source and writes the result to /tmp/cc6P8NZ9.i.
2) "cc1 /tmp/cc6P8NZ9.i ... /tmp/ccgz1veb.s" parses and compiles
the preprocessed /tmp/cc6P8NZ9.i and outputs the assembly source /tmp/ccgz1veb.s.
3) "as ... -o /tmp/ccgJ2zye.o /tmp/ccgz1veb.s"
assembles /tmp/ccgz1veb.s and outputs the
object file /tmp/ccgJ2zye.o.
4) Finally, "collect2 -m elf_i386 -dynamic-linker /lib/ld-linux.so.2
-o test0 ... crt1.o crti.o crtbegin.o /tmp/ccgJ2zye.o -lgcc -lc crtend.o crtn.o"
links the object file with five predefined startup objects (crtxxx.o;
listed below). The linker also pulls in the GCC and C libraries, and creates the
executable file test0 (elf_i386 format).
-rw-r--r-- 1 root root 1188 May 2 2000 /usr/lib/crt1.o
-rw-r--r-- 1 root root 1096 May 2 2000 /usr/lib/crti.o
-rw-r--r-- 1 root root 827 May 2 2000 /usr/lib/crtn.o
-rw-r--r-- 1 root root 1900 Jun 20 2000 /usr/lib/gcc-lib/i386-linux/2.95.2/crtbegin.o
-rw-r--r-- 1 root root 1408 Jun 20 2000 /usr/lib/gcc-lib/i386-linux/2.95.2/crtend.o
|
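Part of the job of these startup objects is to run code before main() is entered. In the toolchain shown here, crtbegin.o and crtend.o supply the __CTOR_LIST__/__CTOR_END__ table and the __do_global_ctors_aux routine that appear in the nm output; newer toolchains use a different mechanism with the same effect. The sketch below makes that behaviour visible. It relies on the GCC-specific constructor attribute (an assumption: it needs GCC or a compatible compiler), and the names before_main and constructor_ran are hypothetical.

```c
/* Counts how many times the constructor below has run. */
static int ctor_calls = 0;

/* GCC extension: functions marked this way are registered in the
 * constructor table and called by the startup code before main(). */
__attribute__((constructor))
static void before_main(void)
{
    ctor_calls++;
}

/* Returns 1 if before_main() has run exactly once, which is the
 * expected state by the time main() starts executing. */
int constructor_ran(void)
{
    return ctor_calls == 1;
}
```

Calling constructor_ran() as the first statement of main() should return 1, even though main() never calls before_main() itself.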
Now we have found the identity of the "bloated software". The overhead
comes from the startup files and the library dependency. In the next
section, we try to free ourselves from GNU libc.