[Development] QTBUG-30440: restricting the SIMD files

Thu Aug 15 06:34:53 CEST 2013

On Thursday, 15 August 2013 14:20:31, Christian Gagneraud wrote:
> > 1) Drop simd.prf and the runtime checking
> > 
> > This means dropping the special builds. We'd #include the special files if
> > the user is building Qt for a special target.
> 
> Can't you get rid of these special files, and build the whole Qt with
> the same flags? Either "generic" flags or "optimised" flags.

Hello Christian

Yes, we could do that. In fact, that's how I build Qt for myself, with my own 
mkspec that has -march=corei7-avx. With a pending change that I never got to 
clean up and submit, it skips building the helpers for anything older than 
AVX. But as I said in the very next paragraph, it's not a generic solution:

> > Most drastic solution. It solves the problem at the expense of having
> > faster code paths for when the CPUs have it. For embedded devices, it
> > might be acceptable to have specially built Qt versions, but I don't
> > think this flies anymore even for Android.
> 
> Could you give a bit more details about the Android case? Is it related
> with the fact that Android apps have to be somehow SoC agnostic?

Yes, that's about it. And remember that this also applies to desktop: the 
standard 32-bit build on Windows and Linux still compiles code that runs on 
i386, and standard 64-bit builds don't use anything beyond SSE2. We'd like to 
use SSSE3, AVX and AVX2 if possible, and even SSE2 on 32-bit builds. But we 
can't: AVX has only been supported in CPUs for 2 years, and AMD CPUs have only 
supported SSSE3 since 2011.

> > What's more, on x86, the default 32-bit build is just nonsense today. CPUs
> > from the past 10 years from both Intel and AMD have had support for SSE2.
> 
> I've seen recently that now gcc have an option for this kind of
> auto-optimisation, can't find any source, but basically gcc
> automatically select CPU extension by looking at the CPU it is running
> on. I saw that in a x86/SSE* context.

GCC 4.8 has this support. It's called "function multiversioning" and it's 
supported only in C++. See http://gcc.gnu.org/gcc-4.8/changes.html.

But as I pointed out to the GCC devs, the feature is useless without the 
ability to unconditionally #include the intrinsics headers. That's coming in 
GCC 4.9.

Either way, it requires versions of GCC that are too new for us, and Clang 
does not support it.
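
For readers who have not seen this kind of runtime selection, here is a 
minimal hand-written sketch of what function multiversioning automates. It is 
not Qt's code and it does not use the multiversioning syntax itself. The names 
sum_generic, sum_avx2, pick_sum and sum are hypothetical. It assumes GCC 4.8 
or later (or Clang) on x86 for __builtin_cpu_supports and the target 
attribute, and falls back to the generic loop everywhere else.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumption: the GCC-style extensions below are only used on x86 with a
// GCC-compatible compiler; any other target always takes the generic path.
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#  define SKETCH_X86_DISPATCH 1
#else
#  define SKETCH_X86_DISPATCH 0
#endif

// Baseline version, compiled with the flags used for the whole file.
static std::uint32_t sum_generic(const std::uint32_t *data, std::size_t n)
{
    std::uint32_t total = 0;
    for (std::size_t i = 0; i < n; ++i)
        total += data[i];
    return total;
}

#if SKETCH_X86_DISPATCH
// Same loop, but the compiler may use AVX2 instructions in this one function.
static __attribute__((target("avx2")))
std::uint32_t sum_avx2(const std::uint32_t *data, std::size_t n)
{
    std::uint32_t total = 0;
    for (std::size_t i = 0; i < n; ++i)
        total += data[i];
    return total;
}
#endif

typedef std::uint32_t (*SumFn)(const std::uint32_t *, std::size_t);

// Ask the CPU we are running on, once, which version is safe to call.
static SumFn pick_sum()
{
#if SKETCH_X86_DISPATCH
    if (__builtin_cpu_supports("avx2"))
        return sum_avx2;
#endif
    return sum_generic;
}

// Public entry point: callers never see which version was chosen.
std::uint32_t sum(const std::uint32_t *data, std::size_t n)
{
    assert(n == 0 || data != nullptr);
    static const SumFn fn = pick_sum();
    return fn(data, n);
}
```

Both versions return the same result; only the instructions the compiler is 
allowed to emit differ, which is why the selection has to happen at run time 
on the machine that executes the code.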

> > 4) Restrict any CPU-specific code to C or C++ source files with limited
> > #include
> > 
> > This is the solution I prefer (suggested by Shane on IRC). We'd keep the
> > special compilers, but we'd drop all the include paths for Qt headers.
> > Those sources would be restricted to system headers, which include the
> > support for intrinsics.
> 
> Does that mean that the ticking time-bomb will still be there somehow?

The effect is much reduced for C code; see below for why.

> I didn't understand all the low-levels details, this is certainly why
> solution #1 sounds way more simple to me. The assembler code doesn't
> look weird for the electronic engineer I am, it is still legion in the
> wild, and the "technology" selection is made at run time, this is maybe
> what would be nice to have for Android (If I understood the android case
> correctly)

The problem is that the compiler is better at generating assembly code than we 
are. Computers are too complex today for us to write assembly code. The best 
we could do would be to write C code with intrinsics, compile it and 
disassemble it back.

And there are two more disadvantages with pure asm:
1) we need different source files for MSVC and for the GCC world
  (unless we want to use a major hack like OpenSSL's "perlsembly")
2) we don't get the optimisation benefits from improvements in the compiler, 
   including better instruction scheduling for different CPU families or
   revisions.

> > For C++ sources, we need to ensure no Standard Library headers are
> > included
> > (same problem as Qt headers). ISO C headers are fine, since C89 doesn't
> > support inlines, and C99 inlines are just plain weird. Most C headers use
> > "static inline" for inlines, which is fine.

That's why there's no ticking time bomb for C. System C headers only have 
"static inline" inline functions, which don't pose a problem for us because 
they're static.
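
To make the "static inline" point concrete, here is a sketch of what such a 
restricted source file might look like. The file name add_sse2.cpp and the 
functions add4 and add_u32 are made up for illustration. SSE2 is used only 
because it is the x86-64 baseline, and the scalar branch exists just to keep 
the sketch buildable on other targets. In a real setup the build system would 
compile this one file with the special flag.

```cpp
// add_sse2.cpp: hypothetical CPU-specific source file.
// Only ISO C headers and the intrinsics header are included:
// no Qt headers and no C++ Standard Library headers.
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#  include <emmintrin.h>
#endif

// "static" gives this helper internal linkage, so the copy compiled with the
// special flags stays private to this file. A plain "inline" function from a
// shared header has external linkage instead: the linker keeps a single copy
// for the whole program, and that copy could be the one built with the
// special flags.
static inline void add4(uint32_t *dst, const uint32_t *a, const uint32_t *b)
{
#if defined(__SSE2__)
    __m128i va = _mm_loadu_si128(reinterpret_cast<const __m128i *>(a));
    __m128i vb = _mm_loadu_si128(reinterpret_cast<const __m128i *>(b));
    _mm_storeu_si128(reinterpret_cast<__m128i *>(dst), _mm_add_epi32(va, vb));
#else
    for (int i = 0; i < 4; ++i)
        dst[i] = a[i] + b[i];
#endif
}

// The only symbol this file exports. The rest of the library declares it and
// calls it after checking the CPU at run time.
void add_u32(uint32_t *dst, const uint32_t *a, const uint32_t *b, size_t n)
{
    assert(n == 0 || (dst != NULL && a != NULL && b != NULL));
    size_t i = 0;
    for (; i + 4 <= n; i += 4)   // four elements per step
        add4(dst + i, a + i, b + i);
    for (; i < n; ++i)           // leftover elements
        dst[i] = a[i] + b[i];
}
```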

> Would it be possible to turn off any optimisation in the Qt build system
> and let the distro/tools people select the optimum (cross) gcc flags for
> their target (an ARM SoC in my case) *without* having to heavily patch Qt.

Yes. That's done. It's been there since Qt 4.2 or 4.3 or so. It's definitely 
the case for 5.x. Unless you pass -no-avx -no-ssse3 -no-sse2 / -no-neon to the 
build, however, Qt will build those other helpers and try to use them if the 
CPU supports them.

> Basically the build flags are controlled *only* by a specialised mkspec,
> and there's no *_neon.cpp stuff in Qt.

That's the -no-neon flag.

> I'm not saying Qt build system performs badly in this regard, I'm just
> saying that it is not unusual to "heavily" patch Qt to manage CPU/GPU
> optimisations and cross-compilation issues.

I'm saying that it should be unusual. We're not aware of any issues relating 
to cross-compilation or optimisations that require patching.

-- 
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel Open Source Technology Center