How to communicate with the Linux kernel: "The power of assembly programming"

  1. Identity of the "bloated software"
  2. Independence from GLIBC
  3. Requirement of exit function
  4. How to communicate with system calls in assembly?
  5. Brief anatomy of Executable and Linking Format (ELF)
  6. Library independent "Hello World"

Last updated 2001-07-30 5:36 pm


Brief anatomy of Executable and Linking Format (ELF)

ELF is a widely used object format in the modern UNIX world. On Linux, recent distributions compile all packages in this format. It is flexible, but its structure is quite difficult to understand. My own utility, elfdump, analyzes an ELF file and prints a summary of it.
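As a rough illustration of what a tool like elfdump has to do first, here is a minimal Python sketch that decodes the 52-byte ELF32 header. It is not the author's elfdump; the names `parse_elf32_header` and `sample_header` are made up for this example, and the sample header is built in memory from the values elfdump reports for test2, so no real file is needed.

```python
import struct

# ELF32 header layout (52 bytes, little endian): e_ident[16], then
# e_type, e_machine, e_version, e_entry, e_phoff, e_shoff, e_flags,
# e_ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum, e_shstrndx.
ELF32_HEADER = struct.Struct("<16sHHIIIIIHHHHHH")

FIELDS = ("type", "machine", "version", "entry", "phoff", "shoff", "flags",
          "ehsize", "phentsize", "phnum", "shentsize", "shnum", "shstrndx")


def parse_elf32_header(data):
    """Decode the first 52 bytes of a 32-bit little-endian ELF file."""
    ident, *rest = ELF32_HEADER.unpack_from(data)
    if ident[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if ident[4] != 1 or ident[5] != 1:
        raise ValueError("only 32-bit little-endian ELF is handled here")
    return dict(zip(FIELDS, rest))


def sample_header():
    """Hypothetical helper: a header carrying the values reported for test2."""
    ident = b"\x7fELF" + bytes([1, 1, 1]) + bytes(9)
    # type 2 = executable, machine 3 = 80386, version 1 = current
    return ELF32_HEADER.pack(ident, 2, 3, 1, 0x8048080, 52, 332, 0,
                             52, 32, 2, 40, 10, 7)


if __name__ == "__main__":
    hdr = parse_elf32_header(sample_header())
    print("Entry address: %X" % hdr["entry"])
    print("Section header table offset:", hdr["shoff"])
    print("Number of section headers:", hdr["shnum"])
```

To try it on a real 32-bit executable, pass the file contents instead, for example `parse_elf32_header(open("test2", "rb").read())`.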

     
$ elfdump test2
[ ELF header ]
Header ID:        0x7F E L F
ELF version:      Current
ELF header size:  52
File class:       32-bit objects
Data encoding:    Little endian
Pad position:     0
File type:        Executable
Target CPU:       80386
File version:     Current
Entry address:    8048080
Processor flags:  0
Name table index: 7
* Program header table
  Offset:            52
  Header size:       32
  Number of headers: 2
  Total header size: 64
* Section header table
  Offset:            332
  Header size:       40
  Number of headers: 10
  Total header size: 400

[ Program headers ]
Idx Type Offset VMA     PMA     FSIZ MSIZ Flag Algn
-----------------------------------------------------------------
 0) LOAD 0      8048000 8048000 172  172  R X  4096
 1) LOAD 172    80490AC 80490AC 0    0    RW   4096

[ Section headers ]
Idx Name      Type Flag VMA     Offset Size Lnk Inf Algn Tbl
-------------------------------------------------------------------------
 0)           NULL      0       0      0    0   0   0    0
 1) .text     PROG XA   8048080 128    44   0   0   16   0
 2) .data     PROG AW   80490AC 172    0    0   0   4    0
 3) .sbss     PROG W    80490AC 172    0    0   0   1    0
 4) .bss      NOSP AW   80490AC 172    0    0   0   4    0
 5) .comment  PROG      0       172    75   0   0   1    0
 6) .note     NOTE      0       247    20   0   0   1    0
 7) .shstrtab STRT      0       267    65   0   0   1    0
 8) .symtab   SYMT      0       732    288  9   13  4    16
 9) .strtab   STRT      0       1020   69   0   0   1    0

This is the content of test2. Don't worry about the details. The important point is that test2 has 9 named sections (the table has 10 entries because index 0 is an empty NULL entry). They are .text, .data, .sbss, .bss, .comment, .note, .shstrtab, .symtab, and .strtab. As you know, the .text section holds the actual object code to execute, and it is only 44 bytes. What, then, are the .symtab and .strtab sections, which occupy most of the space in test2 (288+69=357 bytes)? Let's inspect them with my dump utility.

     
$ dump test2
... skipped ...
01008 003F0 | A0 80 04 08 00 00 00 00 10 00 01 00 00 74 65 73 | .............tes
01024 00400 | 74 32 2E 63 00 67 63 63 32 5F 63 6F 6D 70 69 6C | t2.c.gcc2_compil
01040 00410 | 65 64 2E 00 78 65 78 69 74 2E 61 73 6D 00 5F 5F | ed..xexit.asm.__
01056 00420 | 62 73 73 5F 73 74 61 72 74 00 6D 61 69 6E 00 5F | bss_start.main._
01072 00430 | 65 64 61 74 61 00 5F 65 6E 64 00 78 65 78 69 74 | edata._end.xexit

The familiar names main and xexit appear here. .symtab and .strtab are the sections that hold the symbol table and the symbol names, and they are not required to execute the process.
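The layout of .strtab is easy to see in code: it is a run of NUL-terminated names. The sketch below copies the 69 bytes from the dump above (file offsets 1020 to 1088) and splits them. The function name `strtab_names` is made up for this example, and the final NUL at offset 1088 is an assumption, since the dump stops one byte earlier.

```python
# The 69 bytes of .strtab as they appear in the dump (offsets 1020 to 1088).
# Assumption: the last byte, not shown in the dump, is the closing NUL.
STRTAB = (b"\x00test2.c\x00gcc2_compiled.\x00xexit.asm\x00"
          b"__bss_start\x00main\x00_edata\x00_end\x00xexit\x00")


def strtab_names(table):
    """Return the non-empty NUL-terminated strings stored in a string table."""
    return [name.decode("ascii") for name in table.split(b"\x00") if name]


if __name__ == "__main__":
    print(len(STRTAB), "bytes")
    for name in strtab_names(STRTAB):
        print(name)
```

None of these strings is referenced by the code in .text, which is consistent with the table being safe to drop.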

     
$ strip test2
$ l test2
-rwxr-xr-x 1 root src 652 Jan 6 21:36 test2
$ elfdump test2
... skipped...
[ Section headers ]
Idx Name      Type Flag VMA     Offset Size Lnk Inf Algn Tbl
-------------------------------------------------------------------------
 0)           NULL      0       0      0    0   0   0    0
 1) .text     PROG XA   8048080 128    44   0   0   16   0
 2) .data     PROG AW   80490AC 172    0    0   0   4    0
 3) .sbss     PROG W    80490AC 172    0    0   0   1    0
 4) .bss      NOSP AW   80490AC 172    0    0   0   4    0
 5) .comment  PROG      0       172    75   0   0   1    0
 6) .note     NOTE      0       247    20   0   0   1    0
 7) .shstrtab STRT      0       267    65   0   0   1    0

'strip' removes the .symtab and .strtab sections from the specified ELF file. As a result, the file size dropped from 1089 to 652 bytes. That is a reduction of 437 bytes: the 357 bytes of section data plus the two 40-byte section headers that described them! The next targets are .comment and .note. What are these sections?

     
$ dump test2
... skipped ...
00160 000A0 | 89 E5 8B 5D 04 B8 01 00 00 00 CD 80 00 47 43 43 | ...].........GCC
00176 000B0 | 3A 20 28 47 4E 55 29 20 32 2E 39 35 2E 32 20 32 | : (GNU) 2.95.2 2
00192 000C0 | 30 30 30 30 32 32 30 20 28 44 65 62 69 61 6E 20 | 0000220 (Debian
00208 000D0 | 47 4E 55 2F 4C 69 6E 75 78 29 00 00 54 68 65 20 | GNU/Linux)..The
00224 000E0 | 4E 65 74 77 69 64 65 20 41 73 73 65 6D 62 6C 65 | Netwide Assemble
00240 000F0 | 72 20 30 2E 39 38 00 08 00 00 00 00 00 00 00 01 | r 0.98..........

These sections hold the version strings of GCC and NASM. Delete them as follows.

     
$ strip --remove-section=.comment --remove-section=.note test2
$ ls -l test2
-rwxr-xr-x 1 root src 464 Jan 6 21:46 test2
$ elfdump test2
[ ELF header ]
Header ID:        0x7F E L F
ELF version:      Current
ELF header size:  52
File class:       32-bit objects
Data encoding:    Little endian
Pad position:     0
File type:        Executable
Target CPU:       80386
File version:     Current
Entry address:    8048080
Processor flags:  0
Name table index: 2
* Program header table
  Offset:            52
  Header size:       32
  Number of headers: 1
  Total header size: 32
* Section header table
  Offset:            208
  Header size:       40
  Number of headers: 3
  Total header size: 120

[ Program headers ]
Idx Type Offset VMA     PMA     FSIZ MSIZ Flag Algn
-----------------------------------------------------------------
 0) LOAD 0      8048000 8048000 172  172  R X  4096

[ Section headers ]
Idx Name      Type Flag VMA     Offset Size Lnk Inf Algn Tbl
-------------------------------------------------------------------------
 0)           NULL      0       0      0    0   0   0    0
 1) .text     PROG XA   8048080 128    44   0   0   16   0
 2) .data     PROG AW   80490AC 172    0    0   0   4    0
 3) .sbss     PROG W    80490AC 172    0    0   0   1    0
 4) .bss      NOSP AW   80490AC 172    0    0   0   4    0
 5) .shstrtab STRT      0       172    50   0   0   1    0

strip's --remove-section= option removes the named section from an ELF file. The file size is now down to 464 bytes. That is one tenth of test0! Is there more flab? Yes, there is.

     
$ strip --remove-section=.data --remove-section=.sbss --remove-section=.bss test2
$ ls -l test2
-rwxr-xr-x 1 root src 328 Jan 11 18:55 a.out
$ elfdump test2
... skipped...
[ Section headers ]
Idx Name      Type Flag VMA     Offset Size Lnk Inf Algn Tbl
-------------------------------------------------------------------------
 0)           NULL      0       0      0    0   0   0    0
 1) .text     PROG XA   8048080 128    44   0   0   16   0
 2) .shstrtab STRT      0       172    33   0   0   1    0
$ ./test2 ; echo $?
123

Finally, we removed the .data, .sbss, and .bss sections, and the file size is now 328 bytes. These sections are for data storage, so normally we have to keep them. This example is a special case: all three sections are empty (size 0) in test2, so the program still works fine without them.
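The three file sizes seen so far (652, 464, and 328 bytes) can be reproduced from the elfdump listings with a little arithmetic. The sketch below assumes that, in a stripped file, the section header table is the last thing in the file and starts right after .shstrtab, rounded up to a 4-byte boundary. Only the final offset of 208 is actually confirmed by the elfdump output; the rest is an inference, and the function names are made up for this example.

```python
SHDR_SIZE = 40  # size of one section header entry, as reported by elfdump


def align_up(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def stripped_size(shstrtab_offset, shstrtab_size, section_count):
    """Estimate the file size when the section header table ends the file.

    Assumption: the table starts right after .shstrtab, rounded up to a
    4-byte boundary.
    """
    table_offset = align_up(shstrtab_offset + shstrtab_size, 4)
    return table_offset + section_count * SHDR_SIZE


if __name__ == "__main__":
    print(stripped_size(267, 65, 8))  # after plain strip
    print(stripped_size(172, 50, 6))  # after removing .comment and .note
    print(stripped_size(172, 33, 3))  # after removing .data, .sbss, .bss
```

Under that assumption, every removed section saves its own bytes plus 40 bytes of header, and a shorter name in .shstrtab on top of that.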

$ dump test2
... skipped ...
00160 000A0 | 89 E5 8B 5D 04 B8 01 00 00 00 CD 80 00 2E 73 79 | ...]..........sy
00176 000B0 | 6D 74 61 62 00 2E 73 74 72 74 61 62 00 2E 73 68 | mtab..strtab..sh
00192 000C0 | 73 74 72 74 61 62 00 2E 74 65 78 74 00 00 00 00 | strtab..text....
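The bytes from offset 172 onward in this dump are the 33-byte .shstrtab section. In ELF, each section header stores its name as a byte offset into this table rather than as a string, which can be sketched as follows. The function name `name_at` is made up for this example, and the offsets 1, 9, 17, and 27 are simply where each string starts; which of them test2's headers actually use is not shown in the dump.

```python
# The 33 bytes of .shstrtab from the dump (file offsets 172 to 204).
SHSTRTAB = b"\x00.symtab\x00.strtab\x00.shstrtab\x00.text\x00"


def name_at(table, offset):
    """Return the NUL-terminated string that starts at offset in table."""
    end = table.index(b"\x00", offset)
    return table[offset:end].decode("ascii")


if __name__ == "__main__":
    for offset in (1, 9, 17, 27):
        print(offset, name_at(SHSTRTAB, offset))
```

Note that the strings .symtab and .strtab are still present in the table even though the sections themselves are gone.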

The last section, .shstrtab, is an essential one: it holds the section names, so we must leave it. If you insist on saving even more bytes, there is a definitive reference. The article presents a method for writing an ELF executable file by hand in assembly, after which test3 would be less than 50 bytes! However, the program then no longer has any linkage to the C language, and you have to code every part in assembly. I do not think this is a practical approach.

Lastly, in the next section I'll present a "Hello World" program that is free of GLIBC.