[Interest] building Qt 5.9 on Linux - clang or GCC?
Thiago Macieira
thiago.macieira at intel.com
Mon Dec 18 22:12:51 CET 2017
On Monday, 18 December 2017 11:55:42 PST René J. V. Bertin wrote:
> Thiago Macieira wrote:
> > It doesn't, because the debug information is not loaded in the first
> > place.
> > When using readelf, note how the "A" flag is missing for those sections.
>
> So it has to skip certain, possibly considerable parts of the file while
> loading it, rather than simply doing some efficient operation to copy the
> whole file into memory. That should affect load times somewhat, no?
No, that's not how ELF works.
First of all, the dynamic linker doesn't actually read the section table. It
reads the segment table, found in the ELF program headers (readelf -l):
$ readelf -l /lib/libm.so.6

Elf file type is DYN (Shared object file)
Entry point 0x6200
There are 7 program headers, starting at offset 52

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  LOAD           0x000000 0x00000000 0x00000000 0xf9264 0xf9264 R E 0x1000
  LOAD           0x0f9eb4 0x000faeb4 0x000faeb4 0x003cc 0x003d4 RW  0x1000
  DYNAMIC        0x0f9ebc 0x000faebc 0x000faebc 0x00118 0x00118 RW  0x4
  NOTE           0x000114 0x00000114 0x00000114 0x00044 0x00044 R   0x4
  GNU_EH_FRAME   0x0dda54 0x000dda54 0x000dda54 0x016bc 0x016bc R   0x4
  GNU_STACK      0x000000 0x00000000 0x00000000 0x00000 0x00000 RW  0x10
  GNU_RELRO      0x0f9eb4 0x000faeb4 0x000faeb4 0x0014c 0x0014c R   0x1

 Section to Segment mapping:
  Segment Sections...
   00     .note.gnu.build-id .note.ABI-tag .gnu.hash .dynsym .dynstr
          .gnu.version .gnu.version_d .gnu.version_r .rel.dyn .rel.plt .init
          .plt .plt.got .text .fini .rodata .eh_frame_hdr .eh_frame .hash
   01     .init_array .fini_array .dynamic .got .got.plt .data .bss
   02     .dynamic
   03     .note.gnu.build-id .note.ABI-tag
   04     .eh_frame_hdr
   05
   06     .init_array .fini_array .dynamic .got
(I've pasted libm only for column width; try it yourself on a Qt library
built with debugging information.)
Note the LOAD entries. Those are what matter to the dynamic linker and what
it will load. Note also how the debug sections do not appear in the first or
second entries of the section-to-segment mapping list, which are the two LOAD
segments. That means the debugging sections lie outside the load regions and
won't be present in memory.
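
The containment check described above can be sketched in a few lines of
Python. This is only an illustration of the idea, not how the dynamic linker
is implemented. The two LOAD ranges and the GNU_EH_FRAME range are copied from
the libm output above; the .debug_info offset and size are invented for the
example, and is_loaded() is a hypothetical helper name.

```python
# (file offset, file size) of the two LOAD segments, copied from the
# readelf -l output for libm above.
LOAD_SEGMENTS = [
    (0x000000, 0xF9264),
    (0x0F9EB4, 0x003CC),
]


def is_loaded(offset, size, segments=LOAD_SEGMENTS):
    """Return True if [offset, offset + size) lies entirely inside one LOAD segment."""
    end = offset + size
    return any(
        seg_off <= offset and end <= seg_off + seg_size
        for seg_off, seg_size in segments
    )


# GNU_EH_FRAME (.eh_frame_hdr) from the same output: inside the first LOAD.
eh_frame_loaded = is_loaded(0x0DDA54, 0x016BC)

# Hypothetical .debug_info placement after the last LOAD segment.
# These two numbers are invented for illustration, not taken from a real file.
debug_info_loaded = is_loaded(0x0FA300, 0x40000)

print(eh_frame_loaded, debug_info_loaded)
```

A section whose file range falls in neither LOAD segment is never part of what
the dynamic linker maps, whatever its size.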
Second, the binary file is loaded via mmap(), which means the actual file
contents aren't faulted into memory unless needed or unless there's an
madvise() system call telling the kernel to load them. So even if the debug
sections were included in the LOAD regions, they wouldn't occupy core memory,
nor would they affect the load time, unless something actually tried to
access them.
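
As a rough illustration of the mmap() point, the Python sketch below maps a
whole file but reads only its first four bytes. The file is a made-up stand-in
for a library (a 4-byte header followed by a large untouched tail), and
make_demo_file() and touch_first_page() are hypothetical helper names. The
script does not measure page residency; it only shows that mapping a large
file and touching a small part of it are separate steps.

```python
import mmap
import os
import tempfile

PAGE = mmap.PAGESIZE


def make_demo_file(pages=1024):
    """Create a stand-in for a library: a short header plus a large untouched tail."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(b"\x7fELF")
        f.truncate(pages * PAGE)  # extend the file; sparse where the filesystem allows
    return path


def touch_first_page(path):
    """Map the whole file read-only, but read only from its first page."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            # Creating the mapping sets up address space for the whole file.
            # The kernel typically brings pages in only when they are accessed.
            return len(m), m[:4]


path = make_demo_file()
mapped_len, head = touch_first_page(path)
os.remove(path)
print(mapped_len // PAGE, head)
```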
> > One more reason to use GCC. It only builds once, even under LTO, unless you
> > specifically ask for the fat LTO objects.
>
> Yet even with GCC the build times and memory requirements are larger with
> LTO than without. How can it not do certain things twice?
The build time has nothing to do with doing things twice. It has to do with
the amount of work.
Even with LTO, the compiler must start and process each translation unit. The
difference between LTO and a normal build is that in the former, it needs to
do less work since it doesn't actually run the optimiser. It just needs to
dump some intermediary information.
The difference is with the linker. In a regular build, even with -Wl,-O1, the
linker does very little and its job is to basically concatenate sections of
each input file. In an LTO build, the linker calls the compiler again and that
will need to reload all the intermediary information and perform the
optimisation, now with a much larger dataset.
In my experience, a thin LTO build (one without fat LTO objects) is actually
faster to build (and produces better code) than an equivalent non-LTO build,
but that doesn't apply to all cases.
Regular, optimised (-O3 -g1) build of qmake:
Time to build: 268,00s user 11,28s system 368% cpu 1:15,87 total
Total object sizes (kB): 69596
Binary size (after stripping):
   text    data     bss     dec     hex filename
3008485    2080    6361 3016926  2e08de ../bin/qmake

Simple LTO build (-O3 -g1 -flto -fno-fat-lto-objects, linking* -flto=4):
Time to build: 208,01s user 10,36s system 365% cpu 59,731 total
Total object sizes (kB): 32476
Binary size (after stripping):
   text    data     bss     dec     hex filename
2427597    1972    6217 2435786  252aca ../bin/qmake

Fat LTO build (-O3 -g1 -flto -ffat-lto-objects, linking* -flto=4):
Time to build: 371,19s user 13,49s system 369% cpu 1:44,11 total
Total object sizes (kB): 101928
Binary size (after stripping):
   text    data     bss     dec     hex filename
2427597    1972    6217 2435786  252aca ../bin/qmake

*: Don't forget to pass -O3 -g1 to the linker too, otherwise the LTO step
won't optimise!
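
To put the three measurements side by side, here is a small Python sketch that
expresses each LTO build as a fraction of the regular build. All figures are
copied from the numbers above (wall-clock times converted to seconds); the
dictionary keys and the relative_to_regular() helper are names chosen for this
example only. It describes this one measurement of qmake, nothing more.

```python
# Figures copied from the three qmake builds above.
# user_s / wall_s are seconds, objects_kb is the total object size,
# text is the "text" column of the stripped binary.
builds = {
    "regular":    {"user_s": 268.00, "wall_s": 75.87,  "objects_kb": 69596,  "text": 3008485},
    "simple_lto": {"user_s": 208.01, "wall_s": 59.731, "objects_kb": 32476,  "text": 2427597},
    "fat_lto":    {"user_s": 371.19, "wall_s": 104.11, "objects_kb": 101928, "text": 2427597},
}


def relative_to_regular(name, key):
    """Return builds[name][key] as a fraction of the regular build's value."""
    return builds[name][key] / builds["regular"][key]


for name in ("simple_lto", "fat_lto"):
    for key in ("user_s", "wall_s", "objects_kb", "text"):
        print(f"{name:10s} {key:10s} {relative_to_regular(name, key):.2f}")
```

In this run the build without fat objects used about 22% less CPU time than
the regular build and produced about 19% less text, while the fat build cost
roughly 39% more CPU time for the same final binary.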
--
Thiago Macieira - thiago.macieira (AT) intel.com
Software Architect - Intel Open Source Technology Center