[Development] Updating x86 SIMD support in Qt

Thiago Macieira thiago.macieira at intel.com
Wed Jan 19 04:01:06 CET 2022


For Qt 6.4, I'd like to propose we change the way we detect and enable SIMD 
support. TL;DR:

* Assume all compilers support 5-year-old stuff
* Up the minimum CPU for Linux, Windows and macOS/x86
* Fix macOS Universal builds to use the minimum
* Add an option to cmake to choose a minimum matching one of the Linux x86-64 
   ABI revisions
   * Make it easy to build QtCore, QtGui and Qt3D multi-arch on Linux

Long version:

1) assume all compilers support what we need

Our current tests for compiler support go all the way back to SSE2, which is 
mandatory on x86-64. While testing some changes, I've confirmed that all 
compilers in the CI support x86 CPU features matching the Intel Cannon Lake 
architecture, which is more than we need, except for the QCC compiler missing 
one intrinsic that we can work around.

I've also found that macOS universal builds, WASM, Android and maybe some more 
are improperly detecting support. Specifically for universal builds, what we 
detect depends on the order in which you specify the architectures. This is 
surprising at best, buggy at worst.

I propose we remove the tests for the intrinsics of each individual CPU 
feature. Instead, let's just assume they all have everything up to 2016. This 
will shorten cmake time a little and fix the macOS universal builds. It'll 
also change how 32-bit non-SSE2 builds are selected (see below).

The change https://codereview.qt-project.org/c/qt/qtbase/+/386738 is going in 
this direction but retains a test (all or nothing). I'm proposing now we 
remove the test completely and just assume.

Question:
- the QT_COMPILER_SUPPORTS_xxx macros are in qconfig.h (public config). Do we 
  keep compatibility? We can easily just move them to qprocessordetection.

2) add options to select the target architecture revision

Linux established 3 new revisions of the architecture:
* x86-64 v1 (baseline): SSE2 support
* x86-64 v2: baseline + SSE3, SSSE3, SSE 4
* x86-64 v3: v2 + AVX + AVX2 + FMA + BMI + F16C
* x86-64 v4: v3 + AVX512F + BW + CD + DQ + VL

For i386, we can consider a "v0" of the non-SSE2 original baseline from the 
1980s.
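
To make the cumulative nature of the levels concrete, here is a minimal 
sketch that classifies a set of CPU features into the highest level it 
satisfies. The feature names and the function name are my own choices for 
illustration, and the sets are only a subset of what the psABI requires for 
each revision.

```python
# Illustrative subset of each revision's required features. The real psABI
# lists contain more entries than shown here; treat this as a sketch.
LEVEL_FEATURES = [
    ("v1", {"sse2"}),
    ("v2", {"sse3", "ssse3", "sse4.1", "sse4.2"}),
    ("v3", {"avx", "avx2", "fma", "bmi1", "bmi2", "f16c"}),
    ("v4", {"avx512f", "avx512bw", "avx512dq", "avx512vl"}),
]


def highest_level(cpu_features):
    """Return the highest revision whose cumulative requirements are met."""
    available = set(cpu_features)
    level = "v0"  # name used in this thread for the pre-SSE2 i386 baseline
    for name, needed in LEVEL_FEATURES:
        if not needed <= available:
            break  # levels are cumulative, so stop at the first gap
        level = name
    return level
```

Because each level builds on the previous one, a CPU that reports AVX2 but 
lacks part of SSE4 would still be classified as v1 here.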

I propose adding a CMake option to make it easy to opt in to one of those. 
Yes, you can just set CMAKE_C(XX)_FLAGS_{RELEASE,DEBUG,RELWITHDEBINFO}, so this 
part would be convenience.
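
As a rough sketch of what such a convenience option would translate to, the 
helper below maps a level to a -march value. As far as I know, the 
x86-64-v2/v3/v4 names are accepted by GCC 11 and Clang 12 or later; the 
function name is hypothetical, and MSVC and the i386 "v0" case are not 
covered.

```python
def march_flag(level):
    """Return a GCC/Clang -march option for an x86-64 revision like 'v3'."""
    if level == "v1":
        return "-march=x86-64"  # the original baseline
    if level in ("v2", "v3", "v4"):
        return f"-march=x86-64-{level}"
    raise ValueError(f"unsupported level: {level}")
```

The result would be appended to the existing compiler flags rather than 
replacing them, so that the optimisation options of each build type stay in 
place.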

For the default, see #4.

3) add a way to have multi-arch glibc-based Linux builds

The revisions also match subdirectory searches by the Linux dynamic linker. 
The subdirectories "x86-64-v2", "x86-64-v3" and "x86-64-v4" are new in glibc 
2.33, but glibc has supported "haswell" (for v3) and "avx512_1" (for v4) for a 
number of years prior to that.

The proposal is to allow the user to specify more than one architecture in the 
list above. We can query the dynamic linker to find out if it supports the new 
names and, if not, use the old ones.
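
One possible way to do that query is to parse the output of the dynamic 
linker's --help option. The sketch below assumes the output format I recall 
from glibc 2.33 and later, where a "Subdirectories of glibc-hwcaps 
directories" heading is followed by an indented list of names; both function 
names are hypothetical and the format should be checked against the glibc 
versions we care about.

```python
def hwcaps_dir_names(ld_so_help_text):
    """Extract glibc-hwcaps subdirectory names from `ld.so --help` output."""
    names = []
    in_section = False
    for line in ld_so_help_text.splitlines():
        if line.startswith("Subdirectories of glibc-hwcaps directories"):
            in_section = True
            continue
        if in_section:
            parts = line.split()
            # The list is indented; a blank or unindented line ends it.
            if not line.startswith(" ") or not parts:
                break
            names.append(parts[0])
    return names


def linker_knows_new_names(ld_so_help_text):
    """True if the linker lists any of the x86-64-vN directory names."""
    return any(n.startswith("x86-64-v") for n in hwcaps_dir_names(ld_so_help_text))
```

An older linker that prints no such section yields an empty list, in which 
case the build would fall back to the "haswell" and "avx512_1" names.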

For example, if I specified QT_X86_SUBARCH="v2;v3;v4", it would compile QtCore 
three times. The build products would be:
  lib/libQt6Core.so.6.4.0
  lib/haswell/libQt6Core.so.6.4.0
      OR lib/glibc-hwcaps/x86-64-v3/libQt6Core.so.6.4.0
  lib/haswell/avx512_1/libQt6Core.so.6.4.0
      OR lib/glibc-hwcaps/x86-64-v4/libQt6Core.so.6.4.0
with their matching symlinks.
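
The directory selection above could be sketched as follows. This is only an 
illustration of the layout: the function name and its parameters are 
hypothetical, the lowest level in the list is treated as the plain build in 
lib/, and only the legacy names mentioned in this email are handled.

```python
# Legacy subdirectory names that older glibc versions search.
LEGACY_DIRS = {"v3": "haswell", "v4": "haswell/avx512_1"}


def install_dirs(subarchs, linker_has_hwcaps, libdir="lib"):
    """Map a QT_X86_SUBARCH-style list to one install directory per build."""
    levels = sorted(subarchs.split(";"), key=lambda v: int(v[1:]))
    result = {levels[0]: libdir}  # lowest level is the regular build
    for level in levels[1:]:
        if linker_has_hwcaps:
            result[level] = f"{libdir}/glibc-hwcaps/x86-64-{level}"
        elif level in LEGACY_DIRS:
            result[level] = f"{libdir}/{LEGACY_DIRS[level]}"
        else:
            raise ValueError(f"no legacy directory known for {level}")
    return result
```

For "v2;v3;v4" this produces the three locations listed above, using either 
the new or the old names depending on what the dynamic linker supports.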

This would apply to only a few select libraries. I'm thinking QtCore, QtGui, 
QtQml and some of the Qt3D libraries.

I don't currently see a need to do this for any plugins and there is no 
standardised way to name them anyway.

This would replace the current "-mno-sse2" option that is required to turn 
i386 32-bit builds from SSE2 support back to the original baseline. For a 
32-bit build, one would use QT_X86_SUBARCH="v0;v1" and get both baseline and 
the SSE2-optimised version.

4) up the defaults from where they are today

Today, your default Qt build will always target the x86-64 baseline[*], 
including for i386, even though, as I said, no CPU has failed to meet the next 
level for 9 years. I'd like to request we up that minimum.

By default, I'd like us to produce x86-64 v2 code, which is SSE4. There are a 
number of optimisations in QtCore and QtGui that get automatically enabled. In 
particular, qstring.cpp does not do runtime detection, so you've been leaving 
performance on the table on your computers, unless you build Qt from source 
yourself and set -march= to match your CPU.

I'm told that Red Hat Enterprise Linux 9 will increase its minimum to v2, 
which is why the architecture selection features now exist.

This would apply to source and binary builds from qt.io. Android and macOS 
would be unaffected because they already default to this level.

Question:
- iOS simulator builds are x86, but currently only SSE2. Does anyone know if 
  raising to SSE4, which *ALL* 64-bit Mac machines support, would be a problem?

5) for glibc-based Linux, add v3 sub-arch by default

I'd like to raise the default on Linux from baseline to v2 *and* add a v3 sub-
arch build, as described by point #3 above.

Device-specific Qt builds (Yocto Project, Boot2Qt) would need to turn this off 
and select a single architecture, if they don't want the extra files.

6) for macOS, raise the minimum to v3 (x86_64h)

macOS has supported an extra architecture called "x86_64h" for some time (the 
"h" stands for "haswell"). Apple ceased offering macOS updates to processors 
without AVX2 back with the Mojave release (10.14) in 2018. Since that's the 
minimum version we require for Qt, it means all Intel-based Macs Qt can run on 
also support this sub-arch.

I'd like to do this for all libraries and by default on binaries from qt.io. 
However, I understand the ARM translation application cannot deal with the AVX 
instructions, so it would fail to run our default binaries for the 
applications that couldn't rebuild as ARM. Is it acceptable to require those 
application developers to rebuild Qt from source?

If not, I'd like to ask we build the same libraries as enabled for Linux 
multi-arch with the additional "x86_64h" architecture (that is, triple 
universal build: "x86_64;x86_64h;arm64"). I'm assuming here that we can use 
CMake's built-in support for macOS universal builds.

-- 
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel DPG Cloud Engineering




