[Development] QTBUG-30440: restricting the SIMD files
thiago.macieira at intel.com
Thu Aug 15 00:23:28 CEST 2013
I've been running through this problem for a while. I might have finally a
solution, but I'd like to see if there are other options and which of the
solutions we should go for.
== Quick medium-sized summary of the problem:
Qt has had a few source files that have been built with extra CPU requirements
for a few years now. The idea is that we can do better if we detect, at run-
time, that the CPU supports some extra features. So far, this has been
restricted to gui/painting/qdrawhelper* and gui/image/qimage* and the CPU
features we've used are ARM Neon, x86 SSE2, SSSE3, and AVX, MIPS DSP.
This has been working for a few years, but it turns out that it has always
been a ticking time-bomb.
Whenever any of those source files uses an inline function, the compiler must
choose between inlining or not. If it does *not* inline (such as in a debug
build), then the compiler emits the out-of-line copy of the full function and
the linker must merge all of those copies into one single copy. The linker
does that by choosing any of the copies. It may choose the wrong copy.
== Example from the bug report:
In the ARM build, qdrawhelper_neon.cpp makes use of qRound (an inline
function). That means qdrawhelper_neon.o in a debug build will have the
_Z6qRoundf symbol, but so will some 40 other .o files. All of them have the
same attributes: "hidden" visibility, "weak" binding.
When the linker sees multiple "weak" symbols, it chooses one of them. I
believe it chose the first one listed in the command-line, which happened to be
That means *all* calls to qRound(float) in QtGui would use the Neon version of
qRound, for which this particular compiler decided to use Neon instructions
(disassembly in ).
== Possible solutions:
1) Drop simd.prf and the runtime checking
This means dropping the special builds. We'd #include the special files if the
user is building Qt for a special target.
Most drastic solution. It solves the problem at the expense of having faster
code paths for when the CPUs have it. For embedded devices, it might be
acceptable to have specially built Qt versions, but I don't think this flies
anymore even for Android.
What's more, on x86, the default 32-bit build is just nonsense today. CPUs
from the past 10 years from both Intel and AMD have had support for SSE2.
2) Drop simd.prf, but keep specialised code and runtime checking
This is a variant of #1. For compilers that support it, we'd #include the
special files unconditionally and call the specialised code after runtime
detection is performed.
This assumes that the problem we have with the linker doesn't happen if the
special builds are present in the same translation unit.
This is for sure possible with ICC, MSVC and GCC 4.9. It is not possible with
current versions of Clang or GCC.
3) Restrict any CPU-specific code to assembly files
This is what the MIPS DSP solution is doing (MIPS isn't affected by this
* requires writing assembly by hand
* requires at least two copies of all routines (one in GNU as sources, one for
* cannot benefit from compiler optimisations analyses
* qmake isn't very smart about assembly sources
4) Restrict any CPU-specific code to C or C++ source files with limited #include
This is the solution I prefer (suggested by Shane on IRC). We'd keep the
special compilers, but we'd drop all the include paths for Qt headers. Those
sources would be restricted to system headers, which include the support for
For C++ sources, we need to ensure no Standard Library headers are included
(same problem as Qt headers). ISO C headers are fine, since C89 doesn't support
inlines, and C99 inlines are just plain weird. Most C headers use "static
inline" for inlines, which is fine.
For that reason, I think we should stick to C, unless there's a very good
reason why not (like GCC's dispatch feature, which is only supported in C++).
Thiago Macieira - thiago.macieira (AT) intel.com
Software Architect - Intel Open Source Technology Center
-------------- next part --------------
A non-text attachment was scrubbed...
Size: 190 bytes
Desc: This is a digitally signed message part.
More information about the Development