How to communicate with Linux kernel
"The power of assembly programming"
- Identity of the "bloated software"
- Independence from GLIBC
- Requirement of exit function
- How to communicate with system calls in assembly?
- Brief anatomy of Executable and Linking Format (ELF)
- Library independent "Hello World"
Last updated 2001-07-30 5:36 pm
Brief anatomy of Executable and Linking Format (ELF)
ELF is a widely used object format in the modern UNIX world. On Linux, recent
distributions compile all packages in this format. It is flexible, but its
structure is quite difficult to understand. My own utility, elfdump,
analyzes an ELF file and prints a summary of it.
$ elfdump test2
[ ELF header ]
Header ID: 0x7F E L F
ELF version: Current
ELF header size: 52
File class: 32-bit objects
Data encoding: Little endian
Pad position: 0
File type: Executable
Target CPU: 80386
File version: Current
Entry address: 8048080
Processor flags: 0
Name table index: 7
* Program header table
Offset: 52
Header size: 32
Number of headers: 2
Total header size: 64
* Section header table
Offset: 332
Header size: 40
Number of headers: 10
Total header size: 400
[ Program headers ]
Idx Type Offset     VMA     PMA FSIZ MSIZ Flag Algn
---------------------------------------------------
0)  LOAD      0 8048000 8048000  172  172 R X  4096
1)  LOAD    172 80490AC 80490AC    0    0 RW   4096
[ Section headers ]
Idx Name      Type Flag     VMA Offset Size Lnk Inf Algn Tbl
------------------------------------------------------------
0)            NULL            0      0    0   0   0    0   0
1)  .text     PROG XA   8048080    128   44   0   0   16   0
2)  .data     PROG AW   80490AC    172    0   0   0    4   0
3)  .sbss     PROG W    80490AC    172    0   0   0    1   0
4)  .bss      NOSP AW   80490AC    172    0   0   0    4   0
5)  .comment  PROG            0    172   75   0   0    1   0
6)  .note     NOTE            0    247   20   0   0    1   0
7)  .shstrtab STRT            0    267   65   0   0    1   0
8)  .symtab   SYMT            0    732  288   9  13    4  16
9)  .strtab   STRT            0   1020   69   0   0    1   0
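The header fields in this summary can also be read directly from the first 52 bytes of the file. The following is a minimal Python sketch, not the source of elfdump. It assumes the standard 32-bit little-endian ELF header layout, and the mapping from elfdump's labels to field names (for example "Name table index" to e_shstrndx) is an assumption based on the output. Since test2 itself is not reproduced here, sample_header() is a hypothetical helper that rebuilds a header from the printed values.

```python
import struct

# ELF32 header: 16-byte e_ident followed by 13 fixed-width fields.
# "<" selects little endian without padding, so the size is 52 bytes.
ELF32_HEADER = struct.Struct("<16sHHIIIIIHHHHHH")

FIELDS = ("e_type", "e_machine", "e_version", "e_entry", "e_phoff",
          "e_shoff", "e_flags", "e_ehsize", "e_phentsize", "e_phnum",
          "e_shentsize", "e_shnum", "e_shstrndx")


def parse_elf32_header(data):
    """Decode the header of a 32-bit little-endian ELF image."""
    ident, *values = ELF32_HEADER.unpack_from(data)
    if ident[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if ident[4] != 1 or ident[5] != 1:
        raise ValueError("only 32-bit little-endian files are handled")
    return dict(zip(FIELDS, values))


def sample_header():
    """Hypothetical stand-in for test2: a header rebuilt from the
    values printed by elfdump, not bytes read from the real file."""
    ident = b"\x7fELF" + bytes([1, 1, 1]) + bytes(9)
    # type=2 (executable), machine=3 (80386), version=1 (current)
    return ELF32_HEADER.pack(ident, 2, 3, 1, 0x8048080, 52, 332, 0,
                             52, 32, 2, 40, 10, 7)


if __name__ == "__main__":
    hdr = parse_elf32_header(sample_header())
    print("Entry address: %X" % hdr["e_entry"])
    print("Section header table: %d headers of %d bytes at offset %d"
          % (hdr["e_shnum"], hdr["e_shentsize"], hdr["e_shoff"]))
```

For a real file, pass the result of open("test2", "rb").read() to parse_elf32_header instead of sample_header().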
The elfdump output above shows the contents of test2. Do not worry about the
details. The important point is that test2 has 9 sections: .text, .data,
.sbss, .bss, .comment, .note, .shstrtab, .symtab, and .strtab. As you know,
the .text section contains the actual object code to execute, and its size is
only 44 bytes. What are the .symtab and .strtab sections, which occupy the
most space in test2 (288+69=357 bytes)? Let's inspect them with my dump
utility.
$ dump test2
... skipped ...
01008 003F0 | A0 80 04 08 00 00 00 00 10 00 01 00 00 74 65 73 | .............tes
01024 00400 | 74 32 2E 63 00 67 63 63 32 5F 63 6F 6D 70 69 6C | t2.c.gcc2_compil
01040 00410 | 65 64 2E 00 78 65 78 69 74 2E 61 73 6D 00 5F 5F | ed..xexit.asm.__
01056 00420 | 62 73 73 5F 73 74 61 72 74 00 6D 61 69 6E 00 5F | bss_start.main._
01072 00430 | 65 64 61 74 61 00 5F 65 6E 64 00 78 65 78 69 74 | edata._end.xexit
We find familiar names here: main and xexit. .symtab and .strtab are
sections for symbol names, and they are not required for process
execution.
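How the names are stored in .strtab can be sketched in a few lines of Python. This is an illustration only, not part of the dump utility: it takes the hex bytes of the five dump rows shown above, skips the 12 bytes that precede offset 1020 (where elfdump places .strtab), and splits the rest at NUL bytes. DUMP_ROWS, DUMP_START, and split_names are names introduced here for the example.

```python
# Hex columns of the five dump rows; the first row starts at offset 1008.
DUMP_ROWS = [
    "A0 80 04 08 00 00 00 00 10 00 01 00 00 74 65 73",
    "74 32 2E 63 00 67 63 63 32 5F 63 6F 6D 70 69 6C",
    "65 64 2E 00 78 65 78 69 74 2E 61 73 6D 00 5F 5F",
    "62 73 73 5F 73 74 61 72 74 00 6D 61 69 6E 00 5F",
    "65 64 61 74 61 00 5F 65 6E 64 00 78 65 78 69 74",
]
DUMP_START = 1008      # file offset of the first dumped byte
STRTAB_OFFSET = 1020   # offset of .strtab according to elfdump


def split_names(table):
    """Split a string table into its NUL-terminated names."""
    return [part.decode("ascii") for part in table.split(b"\0") if part]


data = bytes.fromhex(" ".join(DUMP_ROWS))
strtab = data[STRTAB_OFFSET - DUMP_START:]
names = split_names(strtab)

if __name__ == "__main__":
    for name in names:
        print(name)
```

The bytes before offset 1020 in the first row belong to the end of .symtab, so they are skipped rather than decoded as text.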
$ strip test2
$ ls -l test2
-rwxr-xr-x 1 root src 652 Jan 6 21:36 test2
$ elfdump test2
... skipped...
[ Section headers ]
Idx Name      Type Flag     VMA Offset Size Lnk Inf Algn Tbl
------------------------------------------------------------
0)            NULL            0      0    0   0   0    0   0
1)  .text     PROG XA   8048080    128   44   0   0   16   0
2)  .data     PROG AW   80490AC    172    0   0   0    4   0
3)  .sbss     PROG W    80490AC    172    0   0   0    1   0
4)  .bss      NOSP AW   80490AC    172    0   0   0    4   0
5)  .comment  PROG            0    172   75   0   0    1   0
6)  .note     NOTE            0    247   20   0   0    1   0
7)  .shstrtab STRT            0    267   65   0   0    1   0
'strip' removes the .symtab and .strtab sections from the specified ELF file.
As a result, the file size decreased from 1089 to 652 bytes, a reduction of
437 bytes! Furthermore, we find our next targets: .comment and .note.
What are these sections?
$ dump test2
... skipped ...
00160 000A0 | 89 E5 8B 5D 04 B8 01 00 00 00 CD 80 00 47 43 43 | ...].........GCC
00176 000B0 | 3A 20 28 47 4E 55 29 20 32 2E 39 35 2E 32 20 32 | : (GNU) 2.95.2 2
00192 000C0 | 30 30 30 30 32 32 30 20 28 44 65 62 69 61 6E 20 | 0000220 (Debian
00208 000D0 | 47 4E 55 2F 4C 69 6E 75 78 29 00 00 54 68 65 20 | GNU/Linux)..The
00224 000E0 | 4E 65 74 77 69 64 65 20 41 73 73 65 6D 62 6C 65 | Netwide Assemble
00240 000F0 | 72 20 30 2E 39 38 00 08 00 00 00 00 00 00 00 01 | r 0.98..........
These sections hold the version strings of GCC and NASM. Delete them as follows.
$ strip --remove-section=.comment --remove-section=.note test2
$ ls -l test2
-rwxr-xr-x 1 root src 464 Jan 6 21:46 test2
$ elfdump test2
[ ELF header ]
Header ID: 0x7F E L F
ELF version: Current
ELF header size: 52
File class: 32-bit objects
Data encoding: Little endian
Pad position: 0
File type: Executable
Target CPU: 80386
File version: Current
Entry address: 8048080
Processor flags: 0
Name table index: 2
* Program header table
Offset: 52
Header size: 32
Number of headers: 1
Total header size: 32
* Section header table
Offset: 208
Header size: 40
Number of headers: 3
Total header size: 120
[ Program headers ]
Idx Type Offset     VMA     PMA FSIZ MSIZ Flag Algn
---------------------------------------------------
0)  LOAD      0 8048000 8048000  172  172 R X  4096
[ Section headers ]
Idx Name      Type Flag     VMA Offset Size Lnk Inf Algn Tbl
------------------------------------------------------------
0)            NULL            0      0    0   0   0    0   0
1)  .text     PROG XA   8048080    128   44   0   0   16   0
2)  .data     PROG AW   80490AC    172    0   0   0    4   0
3)  .sbss     PROG W    80490AC    172    0   0   0    1   0
4)  .bss      NOSP AW   80490AC    172    0   0   0    4   0
5)  .shstrtab STRT            0    172   50   0   0    1   0
strip's --remove-section= option removes the specified section from an
ELF file. The file size is now 464 bytes. It is one tenth
of test0! Is there more flab? Yes, there is.
$ strip --remove-section=.data --remove-section=.sbss --remove-section=.bss test2
$ ls -l test2
-rwxr-xr-x 1 root src 328 Jan 11 18:55 a.out
$ elfdump test2
... skipped...
[ Section headers ]
Idx Name      Type Flag     VMA Offset Size Lnk Inf Algn Tbl
------------------------------------------------------------
0)            NULL            0      0    0   0   0    0   0
1)  .text     PROG XA   8048080    128   44   0   0   16   0
2)  .shstrtab STRT            0    172   33   0   0    1   0
$ ./test2 ; echo $?
123
Finally, we removed the .data, .sbss, and .bss sections, and the file size is
now 328 bytes. These sections are for data storage, so we usually have
to keep them. The example presented here is a special case, but it works fine.
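The file sizes reported by ls at each step can be cross-checked against the section layout printed by elfdump. The Python sketch below is a worked calculation under one assumption: in the stripped files the section header table starts at the next 4-byte boundary after .shstrtab, which agrees with the table offsets 332 and 208 printed above. The helper names align_up and stripped_size are introduced here for illustration.

```python
SHDR_SIZE = 40  # bytes per section header ("Header size" in elfdump)


def align_up(value, boundary=4):
    """Round value up to the next multiple of boundary."""
    return (value + boundary - 1) // boundary * boundary


def stripped_size(shstrtab_offset, shstrtab_size, section_count):
    """File size when the section header table follows .shstrtab
    and nothing is stored after the table (assumed layout)."""
    table_offset = align_up(shstrtab_offset + shstrtab_size)
    return table_offset + section_count * SHDR_SIZE


# Before strip, .strtab is the last thing in the file: offset + size.
original = 1020 + 69
# After plain strip: 8 section headers, .shstrtab still 65 bytes.
after_strip = stripped_size(267, 65, 8)
# After removing .comment and .note: 6 section headers.
after_comment_note = stripped_size(172, 50, 6)
# After removing .data, .sbss, and .bss: 3 section headers.
after_data_bss = stripped_size(172, 33, 3)

if __name__ == "__main__":
    print(original, after_strip, after_comment_note, after_data_bss)
    print("saved by plain strip:", original - after_strip)
```

Under this assumption the four results agree with the sizes 1089, 652, 464, and 328 quoted in the text.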
$ dump test2
... skipped ...
00160 000A0 | 89 E5 8B 5D 04 B8 01 00 00 00 CD 80 00 2E 73 79 | ...]..........sy
00176 000B0 | 6D 74 61 62 00 2E 73 74 72 74 61 62 00 2E 73 68 | mtab..strtab..sh
00192 000C0 | 73 74 72 74 61 62 00 2E 74 65 78 74 00 00 00 00 | strtab..text....
The final section, .shstrtab, is an essential one that holds the section
names. We must leave it. If you insist on saving even more bytes, there is a
definitive reference. The article presents a method for writing an ELF
executable file by hand in assembly, with which test3 would be less than 50
bytes! However, the program no longer has any linkage to the C language, and
you have to code every part in assembly. I think this is not a practical
approach.
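Returning to .shstrtab for a moment: a section header does not store its name directly but a byte offset into this table, so a tool such as elfdump needs the table to print section names. The Python sketch below is illustrative only. It uses the bytes from the last dump above, and the offsets passed to name_at are derived from those bytes, since the raw section headers of test2 are not shown.

```python
# Hex columns of the three dump rows; the first row starts at offset 160.
DUMP_ROWS = [
    "89 E5 8B 5D 04 B8 01 00 00 00 CD 80 00 2E 73 79",
    "6D 74 61 62 00 2E 73 74 72 74 61 62 00 2E 73 68",
    "73 74 72 74 61 62 00 2E 74 65 78 74 00 00 00 00",
]
DUMP_START = 160
SHSTRTAB_OFFSET = 172  # from elfdump
SHSTRTAB_SIZE = 33     # from elfdump


def name_at(table, offset):
    """Return the NUL-terminated name that starts at offset."""
    end = table.index(b"\0", offset)
    return table[offset:end].decode("ascii")


data = bytes.fromhex(" ".join(DUMP_ROWS))
start = SHSTRTAB_OFFSET - DUMP_START
shstrtab = data[start:start + SHSTRTAB_SIZE]

if __name__ == "__main__":
    # Offsets inferred from the dump, not read from the section headers.
    for offset in (1, 9, 17, 27):
        print(offset, name_at(shstrtab, offset))
```

The names .symtab and .strtab are still present in the table even though those sections were removed, which accounts for 16 of its 33 bytes.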
Lastly, I'll present a "Hello World" program that is free of GLIBC
in the next section.