[Development] Using SSE/NEON in Qt 6

Lars Knoll lars.knoll at qt.io
Thu Feb 6 21:17:02 CET 2020


> On 6 Feb 2020, at 19:29, Allan Sandfeld Jensen <kde at carewolf.com> wrote:
> 
> On Thursday, 6 February 2020 12:45:51 CET Lars Knoll wrote:
>> One problem is, that we can only get full benefit out of those if we can
>> offer them inline. That would basically imply making our qsimd_p.h header
>> public and including that one from qvectornd.h and qmatrixnxn.h (so that we
>> can implement the operations using the SSE/NEON intrinsics). If we do that,
>> we could e.g. implement QVector4D holding a __m128 value (and the neon
>> equivalent on ARM).
> 
> One option is also to declare QVector4D as 16 byte aligned. Then it can still 
> be read from and written to fast by SSE code, even if it isn't declared as 
> holding a __m128 value. (Aligned loads aren't much faster than unaligned loads on 
> modern architectures, but aligned reads can also be arguments to other 
> instructions, saving many load instructions.)

We could, but unfortunately that only gives us half the benefit. If we declare it as a 16-byte vector type, it’ll get passed in a single register. That won’t happen if it’s an array of 4 floats.
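
A minimal sketch of the two layouts being compared, to make the difference concrete. This is not Qt code: `AlignedVec4`, `SimdVec4` and `add` are hypothetical names, and the SSE path assumes an x86 compiler that provides `emmintrin.h`. Whether a value really travels in a single register depends on the platform ABI and calling convention, so treat the comments about register passing as an illustration of the argument above rather than a guarantee.

```cpp
#include <cassert>

#if defined(__SSE2__) || defined(_M_X64)
#  include <emmintrin.h>
#  define SKETCH_HAS_SSE2 1
#endif

// Option A (hypothetical): four plain floats, only the alignment is raised.
// SSE code can use aligned loads/stores on it, but to the compiler it is
// still an aggregate of floats, not a vector type.
struct alignas(16) AlignedVec4 {
    float v[4];
};

#ifdef SKETCH_HAS_SSE2
// Option B (hypothetical): the class holds the vector type itself, so the
// compiler can treat the whole value as one 128-bit vector.
struct SimdVec4 {
    __m128 v;
};

inline AlignedVec4 add(const AlignedVec4 &a, const AlignedVec4 &b)
{
    AlignedVec4 r;
    // Aligned load/store is valid here because of alignas(16).
    _mm_store_ps(r.v, _mm_add_ps(_mm_load_ps(a.v), _mm_load_ps(b.v)));
    return r;
}

inline SimdVec4 add(SimdVec4 a, SimdVec4 b)
{
    // No explicit loads or stores: the operands already are vectors.
    return SimdVec4{ _mm_add_ps(a.v, b.v) };
}
#else
// Pure C++ fallback for targets without SSE2.
inline AlignedVec4 add(const AlignedVec4 &a, const AlignedVec4 &b)
{
    AlignedVec4 r;
    for (int i = 0; i < 4; ++i)
        r.v[i] = a.v[i] + b.v[i];
    return r;
}
#endif
```

Both options have the same size and alignment; the difference is only in how the compiler is allowed to pass and keep the value around between inline operations.
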
> 
>> I personally don’t think including qsimd.h (and implicitly immintrin.h) from
>> our public headers would be a problem, but I’d be happy to hear arguments
>> for/against it.
> I don't think it is a problem either. I just don't want to be the one 
> documenting it ;)

I’d probably mark it as \internal ;-)
> 
>> As a side note: SSE 4.1 offers some nice additional instructions that would
>> simplify some of the operations. Should we keep the minimum requirement for
>> SSE at version 2, or can we raise it to 4.1?
> 
> That would be great. Especially for QtCore. Though we could start by just 
> making the default SSE4.1 enabled but still offer users (linux distros 
> really), the option to force it down to only SSE2. 

We should at least default to SSE2 as the minimum requirement. We’ll need some pure C/C++ code as a fallback anyway, so if someone really needs to run it on a CPU from 1999, they can compile things with SSE turned off.
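
As an illustration of how such tiers could coexist, here is a minimal sketch of a four-element dot product with an SSE4.1 path, an SSE2 path and a pure C++ fallback, selected at compile time. `dot4` is a hypothetical helper, not an existing Qt function, and the sketch assumes the GCC/Clang feature macros `__SSE4_1__` and `__SSE2__`; other compilers may need different checks and would otherwise take the scalar path.

```cpp
#include <cassert>

#if defined(__SSE4_1__)
#  include <smmintrin.h>   // SSE4.1 intrinsics (pulls in the SSE2 ones too)
#elif defined(__SSE2__)
#  include <emmintrin.h>   // SSE2 intrinsics
#endif

// Hypothetical helper: dot product of two 4-float arrays.
// No particular alignment is required (unaligned loads are used).
inline float dot4(const float *p, const float *q)
{
#if defined(__SSE4_1__)
    // SSE4.1: a single instruction. Mask 0xF1 multiplies all four lanes
    // and writes the sum into lane 0.
    return _mm_cvtss_f32(_mm_dp_ps(_mm_loadu_ps(p), _mm_loadu_ps(q), 0xF1));
#elif defined(__SSE2__)
    // SSE2: multiply, then reduce horizontally with shuffles.
    __m128 m  = _mm_mul_ps(_mm_loadu_ps(p), _mm_loadu_ps(q)); // m0 m1 m2 m3
    __m128 sh = _mm_shuffle_ps(m, m, _MM_SHUFFLE(2, 3, 0, 1)); // m1 m0 m3 m2
    __m128 s  = _mm_add_ps(m, sh);   // lanes 0,1: m0+m1   lanes 2,3: m2+m3
    sh = _mm_movehl_ps(sh, s);       // lane 0 now holds m2+m3
    s  = _mm_add_ss(s, sh);          // lane 0: m0+m1+m2+m3
    return _mm_cvtss_f32(s);
#else
    // Pure C++ fallback for builds with SSE turned off.
    return p[0] * q[0] + p[1] * q[1] + p[2] * q[2] + p[3] * q[3];
#endif
}
```

The SSE4.1 branch shows the kind of simplification mentioned earlier in the thread; the other two branches are what a lower baseline would still have to carry.
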
> 
> You could do the same with NEON, but I think we already use that 
> unconditionally if detected at configure time.

We do use NEON conditionally in some cases (e.g. qimagescale.cpp), but for most embedded devices it simply makes the most sense to make the choice at Qt configuration time.
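
A minimal sketch of what a build-time choice can look like in code, as opposed to a runtime check: the NEON path is compiled in only when the compiler targets NEON, otherwise a scalar loop is used. `add4` and `SKETCH_USE_NEON` are hypothetical names for illustration, and the sketch relies on the ACLE macro `__ARM_NEON` (or the older `__ARM_NEON__`) rather than on anything Qt's configure system defines.

```cpp
#include <cassert>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#  include <arm_neon.h>
#  define SKETCH_USE_NEON 1
#endif

// Hypothetical helper: dst[i] = a[i] + b[i] for four floats.
inline void add4(float *dst, const float *a, const float *b)
{
#ifdef SKETCH_USE_NEON
    // Decided at build time: no runtime CPU feature check is involved.
    vst1q_f32(dst, vaddq_f32(vld1q_f32(a), vld1q_f32(b)));
#else
    // Pure C++ fallback for targets built without NEON.
    for (int i = 0; i < 4; ++i)
        dst[i] = a[i] + b[i];
#endif
}
```
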

Cheers,
Lars


