[Development] Update: State of x86 SIMD in Qt
Thiago Macieira
thiago.macieira at intel.com
Mon Jan 2 14:10:57 CET 2012
Here's how my work currently stands:
On Thursday, 29 December 2011 17.00.09, Thiago Macieira wrote:
> 1) Drop the MMX code and the 3dNow! extensions now
Done. I also dropped the detection mechanism in qsimd.cpp, configure and
configure.exe, and updated the macros to be simpler to use. For example, for
SSE2 support, we should now have:
QT_COMPILER_SUPPORTS_SSE2 - the compiler does support the feature, but it
doesn't mean the feature is enabled
__SSE2__ - the feature is enabled by the compiler
Since MSVC doesn't define the latter macro on its own, I made sure it gets
defined for MSVC builds too. Testing is pending.
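A minimal sketch of how code could consume the two macros, assuming QT_COMPILER_SUPPORTS_SSE2 is defined by the build system and __SSE2__ by the compiler. The names add4, Sse2Availability and sse2Availability are hypothetical, and the MSVC block is only one possible way of getting __SSE2__ defined there, not necessarily what the patch does:

```cpp
#include <cstddef>

// Assumption: one possible way to give MSVC the GCC-style macro. _M_X64 and
// _M_IX86_FP are real MSVC predefined macros; the mapping is illustrative.
#if defined(_MSC_VER) && !defined(__SSE2__) \
    && (defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2))
#  define __SSE2__ 1
#endif

#if defined(__SSE2__)
#  include <emmintrin.h>
#endif

// Hypothetical classification of the three states the two macros describe.
enum Sse2Availability {
    Sse2NotSupported,   // neither macro: the compiler cannot emit SSE2 at all
    Sse2SeparateBuild,  // QT_COMPILER_SUPPORTS_SSE2 only: needs its own file and flags
    Sse2AlwaysOn        // __SSE2__: every file may use SSE2 directly
};

inline Sse2Availability sse2Availability()
{
#if defined(__SSE2__)
    return Sse2AlwaysOn;
#elif defined(QT_COMPILER_SUPPORTS_SSE2)
    return Sse2SeparateBuild;
#else
    return Sse2NotSupported;
#endif
}

// Adds four ints element-wise. Uses SSE2 intrinsics only when the feature is
// enabled for this translation unit, otherwise falls back to plain C++.
inline void add4(const int *a, const int *b, int *out)
{
#if defined(__SSE2__)
    const __m128i va = _mm_loadu_si128(reinterpret_cast<const __m128i *>(a));
    const __m128i vb = _mm_loadu_si128(reinterpret_cast<const __m128i *>(b));
    _mm_storeu_si128(reinterpret_cast<__m128i *>(out), _mm_add_epi32(va, vb));
#else
    for (std::size_t i = 0; i < 4; ++i)
        out[i] = a[i] + b[i];
#endif
}
```

In the middle state, the intrinsics cannot be used in the current file; they have to live in a file that is compiled with the extra flag and selected at runtime.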
I also made qsimd.cpp record which compiler settings were enabled at Qt build
time and abort with qFatal if the CPU is missing them. This sounds bad, but it
actually isn't: if Qt was compiled with SSE4.1 support enabled, then
qstring.cpp would use it and your application would crash very, very early on
anyway with SIGILL. This just makes the application quit with a nicer error
message.
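The check described above could look roughly like the following sketch. It is not the qsimd.cpp code: the enum, compiledFeatures(), missingFeatures() and describe() are hypothetical names, and the detected mask is assumed to come from a CPUID-based probe that is not shown here.

```cpp
#include <cstdint>
#include <string>

// Hypothetical feature bits, one per instruction set extension.
enum CpuFeature : std::uint32_t {
    FeatureSSE2   = 1u << 0,
    FeatureSSE3   = 1u << 1,
    FeatureSSSE3  = 1u << 2,
    FeatureSSE4_1 = 1u << 3,
    FeatureSSE4_2 = 1u << 4,
    FeatureAVX    = 1u << 5
};

// Records which features the compiler had enabled when this file was built.
constexpr std::uint32_t compiledFeatures()
{
    return 0u
#ifdef __SSE2__
        | FeatureSSE2
#endif
#ifdef __SSE3__
        | FeatureSSE3
#endif
#ifdef __SSSE3__
        | FeatureSSSE3
#endif
#ifdef __SSE4_1__
        | FeatureSSE4_1
#endif
#ifdef __SSE4_2__
        | FeatureSSE4_2
#endif
#ifdef __AVX__
        | FeatureAVX
#endif
        ;
}

// Returns the features that the build requires but the CPU does not report.
inline std::uint32_t missingFeatures(std::uint32_t required, std::uint32_t detected)
{
    return required & ~detected;
}

// Turns a feature mask into a space-separated list for the error message.
inline std::string describe(std::uint32_t mask)
{
    static const struct { std::uint32_t bit; const char *name; } table[] = {
        { FeatureSSE2, "sse2" },     { FeatureSSE3, "sse3" },
        { FeatureSSSE3, "ssse3" },   { FeatureSSE4_1, "sse4.1" },
        { FeatureSSE4_2, "sse4.2" }, { FeatureAVX, "avx" }
    };
    std::string out;
    for (const auto &entry : table) {
        if (mask & entry.bit) {
            if (!out.empty())
                out += ' ';
            out += entry.name;
        }
    }
    return out;
}
```

At startup, a non-zero result from missingFeatures(compiledFeatures(), detected) would be passed through describe() and reported with qFatal, instead of letting the process die later with SIGILL.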
Configure prints on my machine now (gcc -march=core2 -mtune=corei7-avx):
SSE2/SSE3/SSSE3......... yes/yes/yes
SSE4.1/SSE4.2........... yes/yes
AVX/AVX2................ yes/no
Default CPU features.... cx16 mmx sse sse2 sse3 ssse3
ICC makes the AVX2 setting go to "yes", as will the update to GCC 4.7. The ARM
build says:
SSE2/SSE3/SSSE3......... no/no/no
SSE4.1/SSE4.2........... no/no
AVX/AVX2................ no/no
iWMMXt support ......... no
NEON support ........... yes
Default CPU features.... neon
> 2) Compile qdrawhelper.cpp once, normally, no change to compiler flags
Done. If __SSE2__ is set, then qdrawhelper_plain.cpp will *not* compile the
helpers. It will simply call qInitDrawhelperSse2().
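To illustrate the idea, here is a small sketch of the selection logic, not the actual drawhelper code. FillFunc, fillGeneric, fillSse2 and selectFill are hypothetical names; the real entry point is qInitDrawhelperSse2(), whose signature is not shown in this mail. When __SSE2__ is defined, the generic helper is not compiled at all and the SSE2 one is used unconditionally:

```cpp
#include <cstddef>
#include <cstdint>

#if defined(__SSE2__)
#  include <emmintrin.h>
#endif

// Hypothetical helper signature: fill `count` pixels with one 32-bit value.
typedef void (*FillFunc)(std::uint32_t *dest, std::size_t count, std::uint32_t value);

#if defined(__SSE2__)
// SSE2 is part of the baseline: only this version is compiled.
static void fillSse2(std::uint32_t *dest, std::size_t count, std::uint32_t value)
{
    const __m128i v = _mm_set1_epi32(static_cast<int>(value));
    std::size_t i = 0;
    for (; i + 4 <= count; i += 4)  // four pixels per unaligned store
        _mm_storeu_si128(reinterpret_cast<__m128i *>(dest + i), v);
    for (; i < count; ++i)          // remaining 0 to 3 pixels
        dest[i] = value;
}
#else
// No SSE2 in the baseline: the plain version is compiled instead.
static void fillGeneric(std::uint32_t *dest, std::size_t count, std::uint32_t value)
{
    for (std::size_t i = 0; i < count; ++i)
        dest[i] = value;
}
#endif

// Picks the helper. With __SSE2__ the choice is made at compile time and the
// runtime flag is ignored. Without it, a real build would return an SSE2
// version from a second file compiled with -msse2 when cpuHasSse2 is true;
// that file is left out of this sketch, so the generic one is returned.
FillFunc selectFill(bool cpuHasSse2)
{
    (void)cpuHasSse2;
#if defined(__SSE2__)
    return fillSse2;
#else
    return fillGeneric;
#endif
}
```

The point of the arrangement is that a build whose baseline already includes SSE2 carries only one copy of each helper.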
> 3) If the compiler flags from the user do not already include -msse2,
> compile it *again* with -msse2; the same applies for -mfpu=neon on ARM.
Done.
I also made it *not* add -msse2 if SSE2 was already enabled by the user.
That's the case for everyone on x86-64, but it matters more when the compiler
flags enable SSE3, SSSE3 (that was the case for MeeGo) or higher. By adding
nothing, we compile the helpers with the user's settings. If we passed -msse2,
we would in fact *downgrade* the support.
> 4) If the compiler flags don't already include -mavx, do it *again* with
> -mavx.
Done.
> 5) Select a few operations that might benefit from SSE3 or SSSE3
> implementations on top of the SSE2 ones (my guess is it's only
> blend_argb32_on_argb32)
Done. If qdrawhelper_sse2.cpp is compiled with __SSSE3__ defined, it
unconditionally uses qdrawhelper_ssse3.cpp's functions and will not compile
the SSE2 ones.
That also helps in AVX mode.
Overall:
The code compiles fine with GCC on x86 and x86-64 and with ICC on x86-64. I
have a linker error on ARM with NEON.
Next, I need to run the regression tests and investigate further improvements.
I'm thinking of merging in qmemfunctions.cpp and qblendfunctions.cpp.
I haven't done any benchmarks yet. In terms of library size, I expect all
libraries on x86 or x86-64 to increase in size, unless you disable some of the
builds. For ARM, if your compiler flags *already* have -mfpu=neon, then there
should be neither a change in size nor big performance gains, except those due
to better compiler optimisation.
If your flags don't have -mfpu=neon, then the library size will increase unless
you also specify -no-neon.
--
Thiago Macieira - thiago.macieira (AT) intel.com
Software Architect - Intel Open Source Technology Center
Intel Sweden AB - Registration Number: 556189-6027
Knarrarnäsgatan 15, 164 40 Kista, Stockholm, Sweden