GCC Linker
GCC Linker details
Some details on how C code variables are managed by the compiler and where they end up in memory - specifically on embedded systems.
example C code
    static int gz = 0;
    static int gy = 2;
    static char *gstr = "Hello";
    extern int eint;

    void main(void)
    {
        char *lstr = "World";
        int lx=1;
        eint = 2;
    }
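First of all we compile this to a relocatable object file without linking: gcc -c main.c
The dumps below come from the arm-none-eabi cross toolchain, so main.o was presumably built with that toolchain's gcc driver rather than the host gcc.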
readelf -s (annotated)
    arm-none-eabi-readelf -s main.o
       Num:    Value  Size Type    Bind   Vis      Ndx Name
         2: 00000000     0 SECTION LOCAL  DEFAULT    1
         3: 00000000     0 SECTION LOCAL  DEFAULT    3
         4: 00000000     0 SECTION LOCAL  DEFAULT    5
         9: 00000000     0 SECTION LOCAL  DEFAULT    6
        14: 00000000     0 SECTION LOCAL  DEFAULT    7
        15: 00000000     0 SECTION LOCAL  DEFAULT    8
         5: 00000000     0 NOTYPE  LOCAL  DEFAULT    5 $d    # [.bss]
         6: 00000000     0 NOTYPE  LOCAL  DEFAULT    5 gz    # [.bss] static int gz = 0;
         7: 00000000     0 NOTYPE  LOCAL  DEFAULT    3 $d    # [.data] offset 0
        10: 00000000     0 NOTYPE  LOCAL  DEFAULT    6 $d    # [.rodata] offset 0
        12: 00000000     0 NOTYPE  LOCAL  DEFAULT    1 $t    # [.text] offset 0, start of Thumb code
        13: 00000018     0 NOTYPE  LOCAL  DEFAULT    1 $d    # [.text] offset 0x18 (hex), start of literal data
        17: 00000000     0 NOTYPE  GLOBAL DEFAULT  UND eint  # [] extern int eint;
         8: 00000000     4 OBJECT  LOCAL  DEFAULT    3 gy    # [.data] static int gy = 2;
        11: 00000004     4 OBJECT  LOCAL  DEFAULT    3 gstr  # [.data] static char *gstr = "Hello";
        16: 00000001    28 FUNC    GLOBAL DEFAULT    1 main  # [.text] bit 0 of Value set: Thumb function
$t and $d are ARM mapping symbols: $t marks the start of a run of Thumb instructions and $d the start of a run of data. The Value column is hexadecimal.
The symbol table contains many different types of item, including the source filename.
readelf -S (section headers)
    arm-none-eabi-readelf -S main.o
    There are 12 section headers, starting at offset 0x164:

    Section Headers:
      [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
      [ 0]                   NULL            00000000 000000 000000 00      0   0  0
      [ 1] .text             PROGBITS        00000000 000034 00001c 00  AX  0   0  4
      [ 2] .rel.text         REL             00000000 000474 000008 08     10   1  4
      [ 3] .data             PROGBITS        00000000 000050 000008 00  WA  0   0  4
      [ 4] .rel.data         REL             00000000 00047c 000008 08     10   3  4
      [ 5] .bss              NOBITS          00000000 000058 000004 00  WA  0   0  4
      [ 6] .rodata           PROGBITS        00000000 000058 000010 00   A  0   0  4
      [ 7] .comment          PROGBITS        00000000 000068 000071 01  MS  0   0  1
      [ 8] .ARM.attributes   ARM_ATTRIBUTES  00000000 0000d9 000033 00      0   0  1
      [ 9] .shstrtab         STRTAB          00000000 00010c 000055 00      0   0  1
      [10] .symtab           SYMTAB          00000000 000344 000110 10     11  16  4
      [11] .strtab           STRTAB          00000000 000454 00001e 00      0   0  1
    Key to Flags:
      W (write), A (alloc), X (execute), M (merge), S (strings)
      I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown)
      O (extra OS processing required) o (OS specific), p (processor specific)
man elf provides most of the info
Nr Name Type Addr Off Size ES Flg Lk Inf Al
- Nr - section number (index into the section header table)
- Name - section name
- Type - section type
- Addr - section address: load address of the section if it occupies memory during process execution (eg ALLOC)
- Off - section offset: byte offset of the section in the file
- Size - section size in bytes (a NOBITS section has a size but occupies no space in the file)
- ES - entry size, for sections that contain a table of fixed-size entries
- Flg - section flags
- Lk - link: a section header table index (section specific meaning)
- Inf - info: section dependent information (section specific meaning)
- Al - address alignment constraint in bytes
Flags
Flag meanings:
- WRITE - data needs to be writeable during program execution
- ALLOC - section occupies space during program execution
- EXECUTABLE - section contains executable instructions (ie code).
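- MERGE - the linker may merge identical entries in the section to remove duplicates
- STRINGS - section consists of null terminated strings (the character size is given by ES). A STRTAB section, by contrast, is the name table that section headers and symbols index into.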
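QUESTIONS:
- Why are all ALLOC sections at Addr = 0? main.o is a relocatable object file: addresses are only assigned when the linker places the sections, so every section starts at 0 here.
- Meaning of Info and Lnk? Both are section specific; they are described per section below.
- Why is .comment PROGBITS and not STRTAB? STRTAB is for string tables that other ELF structures index into (.strtab, .shstrtab). .comment is plain data defined by the toolchain, marked with the MERGE and STRINGS flags.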
File Sections
The main.o file contains 12 sections. These are shown below (not in order).
Section 0 - NULL
Index 0 is always a NULL section:
    Nr Name Type Addr     Off    Size   ES Flg Lk Inf Al
     0      NULL 00000000 000000 000000 00      0   0  0
Section 9 - Section Names String Table
Contains the section name strings:
    Nr Name      Type   Addr     Off    Size   ES Flg Lk Inf Al
     9 .shstrtab STRTAB 00000000 00010c 000055 00      0   0  1
readelf -x9
    arm-none-eabi-readelf -x9 main.o
    Hex dump of section '.shstrtab':
      0x00000000 002e7379 6d746162 002e7374 72746162 ..symtab..strtab
      0x00000010 002e7368 73747274 6162002e 72656c2e ..shstrtab..rel.
      0x00000020 74657874 002e7265 6c2e6461 7461002e text..rel.data..
      0x00000030 62737300 2e726f64 61746100 2e636f6d bss..rodata..com
      0x00000040 6d656e74 002e4152 4d2e6174 74726962 ment..ARM.attrib
      0x00000050 75746573 00                         utes.
readelf -p9
    arm-none-eabi-readelf -p9 main.o
    String dump of section '.shstrtab':
      [     1]  .symtab
      [     9]  .strtab
      [    11]  .shstrtab
      [    1b]  .rel.text
      [    25]  .rel.data
      [    2f]  .bss
      [    34]  .rodata
      [    3c]  .comment
      [    45]  .ARM.attributes
Note: strings are zero terminated and indexed by their byte offset (shown in hex). Offset 0 is a null byte by definition, so index 0 is used for items that have no name.
Sections 1,2 Code
Program code (ALLOC, EXECUTE):
    Nr Name      Type     Addr     Off    Size   ES Flg Lk Inf Al
     1 .text     PROGBITS 00000000 000034 00001c 00  AX  0   0  4
     2 .rel.text REL      00000000 000474 000008 08     10   1  4
Link: section header index of associated symbol table
Info: section header index of the section to be relocated
readelf -x2
    arm-none-eabi-readelf -x2 main.o
    Hex dump of section '.rel.text':
      0x00000000 20000000 02090000 24000000 02110000  .......$.......
readelf -r
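    arm-none-eabi-readelf -r main.o
    Relocation section '.rel.text' at offset 0x494 contains 2 entries:
     Offset     Info    Type            Sym.Value  Sym. Name
    00000020  00000902 R_ARM_ABS32       00000000   .rodata   # symbol 9 (the SECTION symbol for .rodata), patch at .text offset 0x20
    00000024  00001102 R_ARM_ABS32       00000000   eint      # symbol 0x11 = 17 (eint), patch at .text offset 0x24
Info: ssssss tt
- ssssss - symbol table index (upper 24 bits). This is an index into .symtab, not a section number.
- tt - relocation type (low 8 bits)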
readelf -x2
    arm-none-eabi-readelf -x2 main.o
    Hex dump of section '.rel.text':
      0x00000000 18000000 02090000                    ........
Sections 3,4 - Data
Initialised data (WRITE, ALLOC):
    Nr Name      Type     Addr     Off    Size   ES Flg Lk Inf Al
     3 .data     PROGBITS 00000000 000050 000008 00  WA  0   0  4
     4 .rel.data REL      00000000 00047c 000008 08     10   3  4
Link: section header index of associated symbol table
Info: section header index of the section to be relocated
readelf -x3
    arm-none-eabi-readelf -x3 main.o
    Hex dump of section '.data':
     NOTE: This section has relocations against it, but these have NOT been applied to this dump.
      0x00000000 02000000 00000000                    ........
readelf -xr
    arm-none-eabi-readelf -x4 main.o
    Hex dump of section '.rel.data':
      0x00000000 04000000 02090000                    ........

    Relocation section '.rel.data' at offset 0x4a4 contains 1 entries:
     Offset     Info    Type            Sym.Value  Sym. Name
    00000004  00000902 R_ARM_ABS32       00000000   .rodata   # symbol 9 (the SECTION symbol for .rodata), patch at .data offset 4 (gstr)
Section 5 - BSS
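Zero-initialised and uninitialised data (WRITE, ALLOC). NOBITS is the same as PROGBITS but uses no space in the file, which is why .bss and .rodata share file offset 0x58. Note that gz lands here even though it has an explicit initialiser, because that initialiser is 0.
    Nr Name Type   Addr     Off    Size   ES Flg Lk Inf Al
     5 .bss NOBITS 00000000 000058 000004 00  WA  0   0  4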
Section 6 - Read-only Data
Read-only data (ALLOC):
    Nr Name    Type     Addr     Off    Size   ES Flg Lk Inf Al
     6 .rodata PROGBITS 00000000 000058 000010 00   A  0   0  4
readelf -x6
    arm-none-eabi-readelf -x6 main.o
    Hex dump of section '.rodata':
      0x00000000 48656c6c 6f000000 576f726c 64000000 Hello...World...
Section 7 - Comments
The .comment section contains toolchain identification strings rather than program data (MERGE, STRINGS):
    Nr Name     Type     Addr     Off    Size   ES Flg Lk Inf Al
     7 .comment PROGBITS 00000000 000068 000071 01  MS  0   0  1
    arm-none-eabi-readelf -p7 main.o
    String dump of section '.comment':
      [     1]  GCC: (GNU Tools for ARM Embedded Processors) 4.8.4 20140526 (release) ARM/embedded-4_8-branch revision 211358
Section 10 - Symbol Table
Section 10 contains the symbol table:
    Nr Name    Type   Addr     Off    Size   ES Flg Lk Inf Al
    10 .symtab SYMTAB 00000000 000344 000110 10     11  16  4
Link: section header index of the associated string table (11 = .strtab)
Info: one greater than the symbol table index of the last local (STB_LOCAL) symbol, i.e. the index of the first non-local symbol (16 = main).
readelf -s
    arm-none-eabi-readelf -s main.o
    Symbol table '.symtab' contains 17 entries:
       Num:    Value  Size Type    Bind   Vis      Ndx Name
         0: 00000000     0 NOTYPE  LOCAL  DEFAULT  UND
         1: 00000000     0 FILE    LOCAL  DEFAULT  ABS main.c
         2: 00000000     0 SECTION LOCAL  DEFAULT    1
         3: 00000000     0 SECTION LOCAL  DEFAULT    3
         4: 00000000     0 SECTION LOCAL  DEFAULT    5
         5: 00000000     0 NOTYPE  LOCAL  DEFAULT    5 $d
         6: 00000000     0 NOTYPE  LOCAL  DEFAULT    5 gz
         7: 00000000     0 NOTYPE  LOCAL  DEFAULT    3 $d
         8: 00000000     4 OBJECT  LOCAL  DEFAULT    3 gy
         9: 00000000     0 SECTION LOCAL  DEFAULT    6
        10: 00000000     0 NOTYPE  LOCAL  DEFAULT    6 $d
        11: 00000004     4 OBJECT  LOCAL  DEFAULT    3 gstr
        12: 00000000     0 NOTYPE  LOCAL  DEFAULT    1 $t
        13: 00000018     0 NOTYPE  LOCAL  DEFAULT    1 $d
        14: 00000000     0 SECTION LOCAL  DEFAULT    7
        15: 00000000     0 SECTION LOCAL  DEFAULT    8
        16: 00000001    28 FUNC    GLOBAL DEFAULT    1 main
        17: 00000000     0 NOTYPE  GLOBAL DEFAULT  UND eint
Like section table index 0, symbol table Index 0 is undefined/reserved.
- Num - index
- Value - symbol value (meaning is symbol type specific)
  - In relocatable files, the value of a symbol with a section index in Ndx is its byte offset within that section.
  - In executables and shared objects, the value holds the virtual address (the section number Ndx is not needed to interpret it).
- Size - symbol size (bytes)
- Type
  - NOTYPE - unspecified
  - OBJECT - data object (variable, array etc)
  - FUNC - function or similar
  - SECTION - related to a section, typically used for relocation management
  - FILE - name of source file
  - LOPROC to HIPROC - reserved for processor specific information
- Bind - symbol binding (linkage scope)
  - LOCAL - visibility restricted to the compilation unit
  - GLOBAL - visible to all compilation units (multiple definitions forbidden)
  - WEAK - like GLOBAL, but does not cause an error if a global symbol with the same name exists: the weak definition is ignored in favour of the global one. Unresolved weak symbols have a value of zero.
  - LOPROC to HIPROC - reserved for processor specific information
- Vis - symbol visibility
  - DEFAULT - specified by binding semantics
- Ndx - index of the section that the value refers to. Special section indices:
  - UND - symbol is undefined in this compilation unit
  - COMMON - symbol labels an unallocated common block. Value is the alignment constraint, size is the block size.
  - ABS - symbol value is absolute
- Name - symbol name (internally an index into the string table)
Section 11 - String Table
    Nr Name    Type   Addr     Off    Size   ES Flg Lk Inf Al
    11 .strtab STRTAB 00000000 000454 00001e 00      0   0  1
readelf -p11
    arm-none-eabi-readelf -p11 main.o
    String dump of section '.strtab':
      [     1]  main.c
      [     8]  $d
      [     b]  gz
      [     e]  gy
      [    11]  gstr
      [    16]  $t
      [    19]  main
Section 8 - ARM Attributes
    Nr Name            Type           Addr     Off    Size   ES Flg Lk Inf Al
     8 .ARM.attributes ARM_ATTRIBUTES 00000000 0000d9 000033 00      0   0  1
readelf -x8
    arm-none-eabi-readelf -x8 main.o
    Hex dump of section '.ARM.attributes':
      0x00000000 41320000 00616561 62690001 28000000 A2...aeabi..(...
      0x00000010 05436f72 7465782d 4d330006 0a074d09 .Cortex-M3....M.
      0x00000020 02120414 01150117 03180119 011a011e ................
      0x00000030 062201                              .".