[Interest] building Qt 4.8.7 with gcc 5 and link-time optimisation on Linux

Thu Jul 23 10:31:01 CEST 2015

Thiago Macieira wrote:

> That won't work if you already have a build. The propagation only works if
> there are no subdirectory Makefiles yet.
> 
> If you need to regenerate all the Makefiles, add -r (recursive):
> 
> qmake -r -config ltcg $srcdir

I did use -r, in fact, because without it nothing happened.
I'm now also experimenting with a project that builds only qtbase (5.4.2), and 
there I seem to need -r even immediately after running configure.
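
A rough way to check the condition Thiago describes (no subdirectory Makefiles 
yet) before deciding whether -r is needed. This is only a sketch: the helper 
name has_subdir_makefiles and the builddir variable are made up for 
illustration, and it assumes a find that supports -mindepth (GNU and BSD find 
both do).

```shell
# Succeeds (exit status 0) if any Makefile exists below the top level of $1.
has_subdir_makefiles() {
    [ -n "$(find "$1" -mindepth 2 -name Makefile 2>/dev/null | head -n 1)" ]
}

# Assumption: the current directory is the build tree unless builddir is set.
builddir=${builddir:-.}

if has_subdir_makefiles "$builddir"; then
    echo "subdirectory Makefiles present: rerun qmake with -r"
else
    echo "no subdirectory Makefiles yet"
fi
```

As noted, in my qtbase test I seem to need -r in either case, so take the 
second branch with a grain of salt.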

> That's also why Qt 4's configure had the -fast option. Which wasn't the
> default. I don't know why anyone would intentionally choose a slow build with
> no further benefits...

Really? If -fast just made configure skip the recursive qmake call, what 
difference did it make? In the end you'd be doing all those qmake calls anyway...

On a related note: why is qmake itself not built in parallel? I've been 
patching the configure script to "fix" that, and it works fine.

> That's qhash.cpp (the only place where we use _mm_crc32_xxx()).
> 
> This sounds like qhash.cpp was compiled with -march=native but *linked*
> without. Can you confirm that you see the compiler options passed on the
> linker command-line (-O2 -march=native, etc.)?

Hmm, I'm checking that now, but it seems you're right: the packaging scripts I 
was using here apparently don't propagate the optimisation options set on the 
command line to the linker flags.
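
A quick way to run that check on a captured link line could look like the 
sketch below. The helper name missing_flags and the sample link line are 
invented for illustration; a real link line would have to be copied from the 
build log.

```shell
# Print each wanted flag that does not appear as a separate word in the
# command line given as the first argument.
missing_flags() {
    cmdline=$1
    shift
    for flag in "$@"; do
        case " $cmdline " in
            *" $flag "*) ;;                 # flag present, nothing to report
            *) printf '%s\n' "$flag" ;;     # flag absent from this command
        esac
    done
}

# Hypothetical link line, NOT taken from an actual Qt build log.
link_line='g++ -Wl,-O1 -shared -o libQtCore.so.4.8.7 qhash.o'

missing_flags "$link_line" -O2 -march=native -flto
```

Any flag it prints was used for compiling (presumably) but is missing at link 
time, which is exactly the mismatch Thiago suspects.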

> Note I have not successfully compiled with LTO with Clang for a while, because
> the linker plugin somehow doesn't get loaded or refuses to understand the LLVM
> bytestream. I only test GCC LTO.

My own experience with Clang LTO isn't exactly positive either. I thought I'd 
try the ltcg config after I read that it's supposed to take care of everything, 
which I may not have done properly myself. We'll see if and how the build I just 
launched completes.

> The compilation itself is faster because there's no code generation for GCC
> (due to -fno-fat-lto-objects). I don't remember whether Clang -flto produces

Hmm, I forced fat LTO objects in the Qt 4.8.7 build on Linux, just to be sure (I 
don't want to end up with binaries that force me to use LTO for everything all 
the time).
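
In terms of flags, forcing fat objects comes down to adding -ffat-lto-objects 
next to -flto, and passing the same set at link time. A minimal sketch, 
assuming the build takes its flags from CFLAGS, CXXFLAGS and LDFLAGS in the 
environment; Qt's configure does not necessarily honour those, so read the 
variable names as placeholders for wherever your packaging scripts keep the 
compiler and linker flags. The -O2 is just the level from Thiago's example.

```shell
# LTO with fat objects: each .o carries regular machine code as well as the
# LTO intermediate representation, so it can also be linked without LTO.
LTO_FLAGS="-flto -ffat-lto-objects"

# Assumption: the build system reads these variables from the environment.
CFLAGS="-O2 $LTO_FLAGS"
CXXFLAGS="$CFLAGS"
# Hand the same options to the link step so code generation at link time
# sees the same optimisation settings as the compile step.
LDFLAGS="$CFLAGS"
export CFLAGS CXXFLAGS LDFLAGS

echo "compile: $CXXFLAGS"
echo "link:    $LDFLAGS"
```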

> LLVM and code or just LLVM bytecode. I know ICC doesn't, but for me it takes
> 30 minutes to link QtCore and an infinite amount of time for QtGui (it gets
> OOM-killed due to swap exhaustion before it finishes).

With Qt 5 and ICC? My Linux netbook is now linking QtWebKit, and is only 
intermittently responsive because of that (but the process appears to use less 
than 1.5 GB, which means I'm hardly using any swap at all).

> Unless you're me, you don't want to do this.

Do I have any reason to be you? :)

> I only have evidence for other projects, where LTO did have noticeable runtime
> performance effect.

I certainly hope this is going to pay off on my slow(er) machines, as well as 
during periods of high CPU load. Any reduction in GUI (and middleware) overhead 
is a gain, IMHO, esp. if it can be obtained with build options alone.

BTW: OS X's -mdynamic-no-pic option used to give me an approx. 15% gain, which I 
think is impressive for a single option that doesn't even increase compile time 
(and I was equally impressed with the Shark tool that pointed out to me that I 
should be using it). That was on a 32-bit PPC machine; I haven't measured its 
impact on Intel architecture.
In 64-bit mode it can be used systematically, even for shared libraries. Does Qt 
use it?

> better able to eliminate dead code. That's often the result of the
> -fwhole-program part of -flto.

GCC's documentation (man page) isn't really clear on -fwhole-program: when it 
applies, or when it should be used. Should I understand that the linker plugin 
knows when to apply it?

Thanks again,
R.