Author: Mugabi Siro

Category: ELF Support

Summary:

This entry highlights the relationship between the sections and segments of an ELF executable, and the Virtual Memory Areas (VMAs) that result in a running instance of the program. The ELF structures and constants listed follow the definitions in the elf(5) manpage. The development platform used was Ubuntu 12.04 AMD64.

Tags: gnu/linux toolchain gnu linux s/w development elf support

Background

ELF Sections and Segments

An application consists of an executable file and zero or more shared object files. Executables and shared objects contain segments, which are groupings of one or more sections. The loadable segments contribute to the program's process image and, thus, provide an execution view of the object file. Sections, on the other hand, hold the bulk of object file information for the linking [1] view: instructions, data, symbol tables, relocation information, etc.

ELF: Linking View Vs. Execution View

See ELF Object File Format for a more detailed description of sections and segments.

Linux Virtual Memory Areas (VMAs)

In Linux, a process' (sparsely populated) linear address space is organised in sets of Virtual Memory Areas (VMAs). Each VMA is a contiguous chunk of related and allocated pages. An object file's loadable segment corresponds to at least one VMA mapping in the address space of its process image. In addition, the runtime heap and user stack are also distinct VMAs. For instance:

$ cat /proc/self/maps 
00400000-0040b000 r-xp 00000000 08:08 884809                             /bin/cat
0060a000-0060b000 r--p 0000a000 08:08 884809                             /bin/cat
0060b000-0060c000 rw-p 0000b000 08:08 884809                             /bin/cat
010a0000-010c1000 rw-p 00000000 00:00 0                                  [heap]
7fbf48526000-7fbf48c09000 r--p 00000000 08:08 1040385                    /usr/lib/locale/locale-archive
7fbf48c09000-7fbf48dbe000 r-xp 00000000 08:08 1321013                    /lib/x86_64-linux-gnu/libc-2.15.so
7fbf48dbe000-7fbf48fbe000 ---p 001b5000 08:08 1321013                    /lib/x86_64-linux-gnu/libc-2.15.so
[...]           [...]           [...]           [...]                                   [...]
7fbf491eb000-7fbf491ec000 r--p 00022000 08:08 1321061                    /lib/x86_64-linux-gnu/ld-2.15.so
7fbf491ec000-7fbf491ee000 rw-p 00023000 08:08 1321061                    /lib/x86_64-linux-gnu/ld-2.15.so
7fffc3456000-7fffc3477000 rw-p 00000000 00:00 0                          [stack]
7fffc35ff000-7fffc3600000 r-xp 00000000 00:00 0                          [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]

Basically, each line in the command's output corresponds to a VMA. The fields in each line are:

start-end permissions offset major:minor inode image

where:

  • start-end: start and end addresses of the VMA.

  • permissions: r (read), w (write), and x (execute). The p (private) and s (shared) flags indicate the type of memory mapping.

  • offset: the offset into the underlying object where the VMA mapping begins.

  • major:minor: the major and minor number pairs of the device holding the file that has been mapped.

  • inode: the inode number of the mapped file.

  • image: the name of the mapped file.

Alternatively, the pmap(1) utility may be used to obtain similar output.

Each VMA has its own properties (e.g. permissions) and its own set of kernel operations. In effect, the page-fault handlers treat a fault according to the rules of the VMA in which the faulting address lies.

Like ELF object file sections, VMAs do not overlap: just as no byte in an object file resides in more than one section, no allocated (i.e. cached or uncached) page in a process' address space resides in more than one VMA. Each allocated page in the address space of a process is contained in some VMA. An attempt by a process to reference a page that is not contained in any VMA within its address space results in a segmentation fault (SIGSEGV) which, by default, terminates the process.

Note that dynamically linked programs also include additional VMAs corresponding to the memory mappings of the segments of shared libraries as well as the segments of the program interpreter (i.e. the dynamic linker, e.g. ld-linux.so, ld-uClibc.so, etc.). Also note that if a program performs an mmap(2) of a device's memory mapped I/O (MMIO) region or Direct Memory Access (DMA) memory, then this will result in the creation of a new VMA in the address space of the process [2].

Section-Segment-VMA Mappings

ELF executables - and the resulting processes - on a UNIX/Linux system have a uniform or similar memory layout. The link editor (a.k.a. static linker), ld(1), ensures that an executable's loadable code segment always starts at a certain virtual address. The value is architecture dependent: for x86, it is 0x08048000 for 32-bit address spaces and 0x400000 for 64-bit address spaces. Recall that the virtual addresses of symbols (procedures/functions, objects/variables) in an ELF executable object file are also - generally - their runtime addresses in the process' address space. The loadable code segment is immediately followed by the loadable data segment. At runtime, the stack occupies the highest portion of the process' virtual address space and grows downward. The heap and the memory mappings of objects such as shared library segments, the dynamic linker segments, and mapped device MMIO regions are located somewhere in between the data segment and the stack.

Consider the following simple program:

$ cat sec_seg_vma.c

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <stdint.h>

static uint32_t bss[1024];                     /* include 4KB in ".bss" */
static uint32_t data[1024] = {1024};           /* include 4KB in ".data" */
static const int8_t rodata[8192] = {"rodata"}; /* include 8KB in ".rodata" */

int main(void)
{
    char cmd[64];
    printf("0x%lx\n", (unsigned long)malloc(8192));
    sprintf(cmd, "cat /proc/%d/maps", getpid());
    system(cmd);
    exit(EXIT_SUCCESS);
}

To simplify illustration, 32-bit code will be considered. In addition, no optimization level is specified on the gcc(1) command line since, with optimization enabled, the compiler may - justifiably - decide to compile out the unused global variables:

$ gcc -m32 sec_seg_vma.c

Here is the program header information for the loadable segments in the ELF executable:

$ readelf -l a.out

Program Headers:
    Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
[...]
    LOAD           0x000000 0x08048000 0x08048000 0x02730 0x02730 R E 0x1000
    LOAD           0x002f14 0x0804bf14 0x0804bf14 0x0112c 0x0214c RW  0x1000
[...]
 Section to Segment mapping:
    Segment Sections...
[...]
     02     .interp .note.ABI-tag .note.gnu.build-id .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rel.dyn .rel.plt .init .plt .text .fini .rodata .eh_frame_hdr .eh_frame 
     03     .ctors .dtors .jcr .dynamic .got .got.plt .data .bss

Relevant information about the bss, data and rodata object symbols (defined in sec_seg_vma.c) can be obtained via:

$ objdump -t a.out | sort | awk '$6 ~ /bss|data/ && $3 == "O" { print $0 }'
08048620 l     O .rodata    00002000              rodata
0804c040 l     O .data      00001000              data
0804d060 l     O .bss       00001000              bss

where:

  • 1st field gives the symbol's runtime virtual address.

  • 2nd and 3rd fields indicate the symbol binding (l for local) and type (O for object), respectively.

  • 4th and 5th fields indicate the section in which the symbol resides and symbol size (hexadecimal), respectively.

  • 6th field is the symbol's name.

Shown below is the output of a running instance of this program. The VMAs for the C library segments, dynamic linker segments, stack, etc. are excluded:

$ ./a.out 
0x8ffa008
08048000-0804b000 r-xp 00000000 08:08 299509        /tmp/a.out
0804b000-0804c000 r--p 00002000 08:08 299509        /tmp/a.out
0804c000-0804e000 rw-p 00003000 08:08 299509        /tmp/a.out
0804e000-0804f000 rw-p 00000000 00:00 0 
08ffa000-0901d000 rw-p 00000000 00:00 0             [heap]
[...]

Now, the information gathered from the output of the readelf(1) and objdump(1) commands, along with the output of the program instance, reveals several interesting things:

  • The loadable code segment contains sections holding machine instructions (e.g. .text, Procedure Linkage Table (PLT) in .plt, and C run-time code in .init/.fini) as well as sections holding certain read-only data, including relocation information (.rel.dyn, .rel.plt), dynamic linking information (.interp, .gnu.hash, .dynsym, .dynstr, etc.) and string constants in .rodata. This segment is mapped to the 08048000-0804b000 VMA, which is 12KB in size courtesy of the 8KB rodata array defined in sec_seg_vma.c and the fact that the kernel deals in page-size granularity.

  • The loadable data segment is responsible for the three consecutive VMAs that immediately follow the code segment VMA:

    • Dynamic linking info in .dynamic, relocated pointer values to external symbols in .got, the first few special slots in .got.plt, the C run-time data (e.g. .ctors, .dtors) and a few other items get mapped to a separate read-only VMA at 0804b000-0804c000 corresponding to the special PT_GNU_RELRO segment [3]:

      $ readelf -l a.out
      
      Program Headers:
          Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
      ...
          GNU_RELRO      0x002f14 0x0804bf14 0x0804bf14 0x000ec 0x000ec R   0x1
      ...
       Section to Segment mapping:
        Segment Sections...
      ...
           08     .ctors .dtors .jcr .dynamic .got
      
      $ objdump -t a.out | sort | awk '$1 ~ /^0804b/{ print $0 }'
      0804bf14 l       .ctors 00000000              __init_array_end
      0804bf14 l       .ctors 00000000              __init_array_start
      0804bf14 l    d  .ctors 00000000              .ctors
      0804bf14 l     O .ctors 00000000              __CTOR_LIST__
      0804bf18 l     O .ctors 00000000              __CTOR_END__
      0804bf1c l    d  .dtors 00000000              .dtors
      0804bf1c l     O .dtors 00000000              __DTOR_LIST__
      0804bf20 g     O .dtors 00000000              .hidden __DTOR_END__
      0804bf24 l    d  .jcr   00000000              .jcr
      0804bf24 l     O .jcr   00000000              __JCR_END__
      0804bf24 l     O .jcr   00000000              __JCR_LIST__
      0804bf28 l    d  .dynamic   00000000              .dynamic
      0804bf28 l     O .dynamic   00000000              _DYNAMIC
      0804bff0 l    d  .got   00000000              .got
      0804bff4 l    d  .got.plt   00000000              .got.plt
      0804bff4 l     O .got.plt   00000000              _GLOBAL_OFFSET_TABLE_
      

      It is worth noting how the link editor performs some ABI trickery with respect to the .got.plt section. Entries/slots in this table are associated with PLT function references. Symbols in .got, by contrast, are by default associated with global data references and are relocated at load-time. As a security measure, .got resides in the special PT_GNU_RELRO segment, which is marked read-only as soon as the data symbol relocations complete. On the other hand (on most archs), relocations for PLT function references are deferred until the point in time when the symbols are actually referenced. This is an optimization, since PLT function symbol relocations are relatively costly. With .got.plt, the first three slots (12 bytes, illustrated below at addresses 0x0804bff4 - 0x0804bfff) are special and reside in PT_GNU_RELRO. Some of these special entries are populated at load-time (by the dynamic linker) and will include the entry point to the dynamic linker itself. The rest of the slots (0x0804c000 - 0x0804c01f) reside in the read-write part of the data segment to enable run-time relocation updates for function references via .plt:

      Disassembly of section .got.plt:
      
      0804bff4 <_GLOBAL_OFFSET_TABLE_>:
       804bff4:       28 bf 04 08 00 00       sub    %bh,0x804(%edi)
       804bffa:       00 00                   add    %al,(%eax)
       804bffc:       00 00                   add    %al,(%eax)
       804bffe:       00 00                   add    %al,(%eax)
       804c000:       96                      xchg   %eax,%esi
       804c001:       83 04 08 a6             addl   $0xffffffa6,(%eax,%ecx,1)
      [...]       
       804c01d:       84 04 08                test   %al,(%eax,%ecx,1)
      

      Generation of PT_GNU_RELRO is courtesy of the -z relro static linker option (default setting with the Ubuntu 12.04 AMD64 toolchain). If the -z now option is also specified or, alternatively, if the LD_BIND_NOW dynamic linker environment variable is set, then all symbol relocations, for both global variables and PLT function references, will be performed at load-time. Generally, with -z now, the entire GOT will be placed in PT_GNU_RELRO [4].

    • The read-write part of .got.plt and the .data section form the read-write VMA located at 0804c000-0804e000. In other words, in addition to the partial .got.plt table, this VMA contains the following symbols:

      $ objdump -t a.out | sort | awk '$1 ~ /^0804c/{ print $0 }'
      0804c020 g       .data  00000000              __data_start
      0804c020 l    d  .data  00000000              .data
      0804c020  w      .data  00000000              data_start
      0804c024 g     O .data  00000000              .hidden __dso_handle
      0804c040 l     O .data  00001000              data
      

      Note that this memory area is 8KB in size courtesy of the 4KB data object symbol (defined in sec_seg_vma.c). In other words, since the space taken up by the symbol "spills" into the next virtual page, i.e. 0x0804c040 + 0x1000, and since the kernel deals in page-size granularity during VMA allocation, an 8KB read-write memory area results.

    • The anonymous page mapping at 0804e000-0804f000 is created for .bss. The size of this VMA was determined by the size of the bss object symbol (defined in sec_seg_vma.c). Since this VMA is not backed by any file, the major:minor is 00:00 with a corresponding zero inode. This is a private, copy-on-write (COW) mapping whose pages read as binary zeros, as expected by .bss (conforming to the C standard). The first time the process writes into a page in this area, a page fault is triggered. The kernel then finds a page frame in physical memory (evicting a victim page, and swapping it out if it is dirty, when no free frame is available), fills the frame with binary zeros, and updates the page table entries; the CPU restarts the faulting instruction upon return from the exception handler. Since no page is swapped in from disk, pages in anonymous mappings are known as demand-zero pages.

      Since .bss (and hence any symbol defined relative to it, including the bss object) occupies no space in the object file, the values of p_filesz and p_memsz (i.e. FileSiz and MemSiz in readelf -l a.out) of the loadable data segment differ. Also note that the size of the bss object symbol as reported by the objdump -t a.out command indicates its memory size (and this value is used by the operating system for allocation during load time) rather than its size in the object file.

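      Also notice that the bss object starts at 0x0804d060 according to objdump(1), i.e. inside the preceding file-backed 8KB VMA (0804c000-0804e000) rather than at the start of the anonymous VMA at 0x0804e000. The object is not relocated: the file-backed part of the data segment ends at 0x0804bf14 + 0x112c = 0x0804d040, the remainder of that page (up to 0x0804e000) is zero-filled at load time, and the anonymous VMA only covers the part of .bss that extends beyond the page boundary (up to 0x0804bf14 + 0x214c = 0x0804e060, rounded up to 0x0804f000).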

  • Finally, the malloc operation resulted in an 8KB dynamic memory allocation starting at 0x8ffa008, which resides in the heap region at 08ffa000-0901d000. This demand-zero heap VMA is allocated dynamically during program runtime and is not related to any of the loadable segments in the ELF object file.

The following figure attempts to summarize section-segment-VMA mappings (not drawn to scale):

ELF: Section-Segment-VMA mapping

Resources and Further Reading

  • Linux Kernel Development, 3rd Edition, Robert Love, 2010. The book is thorough.

  • Understanding The Linux Kernel, 3rd Edition, Daniel P. Bovet, Marco Cesati, 2005, O'Reilly. Somewhat dated but still very useful.

  • How To Write Shared Libraries, Ulrich Drepper (Available Online)

Footnotes

1. Both the static linking phase with ld(1) and the dynamic linking phase with the dynamic linker (e.g. ld-linux.so, ld-uClibc.so, etc).

2. See mmaping device MMIO and DMA regions for an example of such an instance.

3. ld(1)'s -z relro switch causes it to move sections which are only modified by relocations onto a separate page in the data segment and emit a new program header entry, PT_GNU_RELRO. This header points the dynamic linker to this special page which will have its write access attribute removed after relocations have been performed.

4. For example, with the Ubuntu 12.04 AMD64 toolchain, the -z now linker option results in all GOT entries (for both data and function relocations) being located in the .got section in PT_GNU_RELRO i.e. no .got.plt.