[Development] QTBUG-30440: restricting the SIMD files

Thiago Macieira thiago.macieira at intel.com
Thu Aug 15 00:23:28 CEST 2013


Hello all

I've been running through this problem for a while. I might have finally a 
solution, but I'd like to see if there are other options and which of the 
solutions we should go for.

== Quick medium-sized summary of the problem:

	https://bugreports.qt-project.org/browse/QTBUG-30440

Qt has had a few source files that have been built with extra CPU requirements 
for a few years now. The idea is that we can do better if we detect, at run-
time, that the CPU supports some extra features. So far, this has been 
restricted to gui/painting/qdrawhelper* and gui/image/qimage* and the CPU 
features we've used are ARM Neon, x86 SSE2, SSSE3, and AVX, MIPS DSP.

This has been working for a few years, but it turns out that it has always 
been a ticking time-bomb.

Whenever any of those source files uses an inline function, the compiler must 
choose between inlining or not. If it does *not* inline (such as in a debug 
build), then the compiler emits the out-of-line copy of the full function and 
the linker must merge all of those copies into one single copy. The linker 
does that by choosing any of the copies. It may choose the wrong copy.

== Example from the bug report:

In the ARM build, qdrawhelper_neon.cpp makes use of qRound (an inline 
function). That means qdrawhelper_neon.o in a debug build will have the 
_Z6qRoundf symbol, but so will some 40 other .o files. All of them have the 
same attributes: "hidden" visibility, "weak" binding.

When the linker sees multiple "weak" symbols, it chooses one of them. I 
believe it chose the first one listed in the command-line, which happened to be 
qdrawhelper_neon.o.

That means *all* calls to qRound(float) in QtGui would use the Neon version of 
qRound, for which this particular compiler decided to use Neon instructions 
(disassembly in [1]).

== Possible solutions:

1) Drop simd.prf and the runtime checking

This means dropping the special builds. We'd #include the special files if the 
user is building Qt for a special target.

Most drastic solution. It solves the problem at the expense of having faster 
code paths for when the CPUs have it. For embedded devices, it might be 
acceptable to have specially built Qt versions, but I don't think this flies 
anymore even for Android.

What's more, on x86, the default 32-bit build is just nonsense today. CPUs 
from the past 10 years from both Intel and AMD have had support for SSE2.

2) Drop simd.prf, but keep specialised code and runtime checking

This is a variant of #1. For compilers that support it, we'd #include the 
special files unconditionally and call the specialised code after runtime 
detection is performed.

This assumes that the problem we have with the linker doesn't happen if the 
special builds are present in the same translation unit.

This is for sure possible with ICC, MSVC and GCC 4.9. It is not possible with 
current versions of Clang or GCC.

3) Restrict any CPU-specific code to assembly files

This is what the MIPS DSP solution is doing (MIPS isn't affected by this 
problem).

Drawbacks:
* requires writing assembly by hand
* requires at least two copies of all routines (one in GNU as sources, one for 
  MS assembler)
* cannot benefit from compiler optimisations analyses
* qmake isn't very smart about assembly sources

4) Restrict any CPU-specific code to C or C++ source files with limited #include

This is the solution I prefer (suggested by Shane on IRC). We'd keep the 
special compilers, but we'd drop all the include paths for Qt headers. Those 
sources would be restricted to system headers, which include the support for 
intrinsics.

For C++ sources, we need to ensure no Standard Library headers are included 
(same problem as Qt headers). ISO C headers are fine, since C89 doesn't support 
inlines, and C99 inlines are just plain weird. Most C headers use "static 
inline" for inlines, which is fine.

For that reason, I think we should stick to C, unless there's a very good 
reason why not (like GCC's dispatch feature, which is only supported in C++).


Suggestions?


[1] https://bugreports.qt-project.org/browse/QTBUG-30440?focusedCommentId=200686&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-200686

-- 
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel Open Source Technology Center
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 190 bytes
Desc: This is a digitally signed message part.
URL: <http://lists.qt-project.org/pipermail/development/attachments/20130814/fe4752f5/attachment.sig>


More information about the Development mailing list