[Interest] building Qt 4.8.7 with gcc 5 and link-time optimisation on Linux
René J. V. Bertin
rjvbertin at gmail.com
Sat Jul 25 09:50:06 CEST 2015
Thiago Macieira wrote:
>> > For Clang, QT_COMPILER_SUPPORTS_HERE(x) expands to a check defined(__x__)
>> > [in this case, if __SSE4_2__ is defined].
...
>
> __SSE4_2__ isn't defined anywhere you'll see. It's pre-defined by the
> compiler.
Well yes, of course, I know that. I meant the declaration that makes
QT_COMPILER_SUPPORTS_HERE(x) expand to a direct check of __SSE4_2__.
>> I'm actually a bit surprised that either the compiler finds nothing in the
>> Qt code to auto-vectorise with SSE4 instructions, or that that doesn't lead
>> to issues in the linker with LTO.
>
> The latter. As I said, my guess is this is a compiler bug because it obviously
> has SSE4.2 enabled.
Well, have you checked that auto-vectorisation indeed doesn't use SSE4
instructions? In my experience it is in fact not very common for it to do so: I
have rarely run into related issues with code built with -march=native on one of
the VMs I referred to, which don't support SSE4 despite virtualising a capable CPU.
> Note that GCC only auto-vectorises on -O3. I don't know about Clang.
I'm quite sure it's the same. Otherwise I'd have continued my habit of using the
equivalent of -ftree-vectorize :)
> Because it can't be disabled in the compiler. It *always* generates those
> instructions.
...
> That's incorrect. SSE4.2 is enabled in your compiler because you're using
> Apple's build of Clang.
I don't think it's as black-and-white as that ...
>> Also, note that code that has to run on VMs may need to deactivate SSE4
>> support. There is at least 1 virtualisation solution that does not expose
>> the instruction set.
>
> No VM will ever do that and run OS X code.
Try VirtualBox. Not long ago I still ran into issues that forced me to build
with -march=core2 instead of -march=native, on a host with a recent i5 CPU (and,
come to think of it, with Qt 5.4). I only *had* access to the VM, so I don't know
exactly which instruction sets the host supported, but I don't think this had
anything to do with more recent instruction sets.
BTW, trying your expression on OS X 10.9:
%> clang -dM -E -xc /dev/null | fgrep -i SSE
#define __SSE2_MATH__ 1
#define __SSE2__ 1
#define __SSE3__ 1
#define __SSE_MATH__ 1
#define __SSE__ 1
#define __SSSE3__ 1
%> clang -march=native -v -dM -E -xc /dev/null | egrep -i 'SSE|AVX|MMX|MUL'
Apple LLVM version 6.0 (clang-600.0.57) (based on LLVM 3.5svn)
Target: x86_64-apple-darwin13.4.0
Thread model: posix
"/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/clang"
-cc1 -triple x86_64-apple-macosx10.9.0 -E -disable-free -disable-llvm-verifier
-main-file-name null -mrelocation-model pic -pic-level 2 -mdisable-fp-elim
-masm-verbose -munwind-tables -target-cpu corei7-avx -target-linker-version 241.9
-v -resource-dir
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../lib/clang/6.0
-fdebug-compilation-dir /tmp -ferror-limit 19 -fmessage-length 117
-stack-protector 1 -mstackrealign -fblocks -fobjc-runtime=macosx-10.9.0
-fencode-extended-block-signature -fdiagnostics-show-option -fcolor-diagnostics
-vectorize-slp -dM -o - -x c /dev/null
clang -cc1 version 6.0 based upon LLVM 3.5svn default target x86_64-apple-darwin13.4.0
[...]
#define __AVX__ 1
#define __MMX__ 1
#define __PCLMUL__ 1
#define __SSE2_MATH__ 1
#define __SSE2__ 1
#define __SSE3__ 1
#define __SSE4_1__ 1
#define __SSE4_2__ 1
#define __SSE_MATH__ 1
#define __SSE__ 1
#define __SSSE3__ 1
%> clang -march=native -mno-sse4.1 -dM -E -xc /dev/null | fgrep -i SSE
#define __SSE2_MATH__ 1
#define __SSE2__ 1
#define __SSE3__ 1
#define __SSE_MATH__ 1
#define __SSE__ 1
#define __SSSE3__ 1
R.