[Development] Using SSE/NEON in Qt 6
Allan Sandfeld Jensen
kde at carewolf.com
Thu Feb 6 19:29:01 CET 2020
On Thursday, 6 February 2020 12:45:51 CET Lars Knoll wrote:
> One problem is, that we can only get full benefit out of those if we can
> offer them inline. That would basically imply making our qsimd_p.h header
> public and including that one from qvectornd.h and qmatrixnxn.h (so that we
> can implement the operations using the SSE/NEON intrinsics). If we do that,
> we could e.g. implement QVector4D holding a __m128 value (and the neon
> equivalent on ARM).
One option is also to declare QVector4D as 16-byte aligned. Then SSE code can
still read from and write to it quickly, even if it isn't declared as holding
a __m128 value. (Aligned loads aren't much faster than unaligned loads on
modern architectures, but an aligned value can also be used directly as a
memory operand of other instructions, saving many load instructions.)
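To make the alignment idea concrete, here is a minimal sketch. Vec4 and add() are hypothetical stand-ins, not the actual QVector4D layout or API; the only assumption is a class of four plain floats marked alignas(16). With SSE2 available, the aligned load/store intrinsics can then be used on it directly; otherwise the code falls back to scalar arithmetic.

```cpp
#include <cassert>
#include <cstdint>

#if defined(__SSE2__) || defined(_M_X64)
#  include <emmintrin.h>
#  define VEC4_USE_SSE2 1
#endif

// Hypothetical stand-in for QVector4D: four plain floats, but the type as a
// whole is declared 16-byte aligned. No __m128 member is exposed.
struct alignas(16) Vec4 {
    float x, y, z, w;
};

static_assert(sizeof(Vec4) == 16, "Vec4 should be exactly four packed floats");
static_assert(alignof(Vec4) == 16, "Vec4 should be 16-byte aligned");

// Component-wise addition. On SSE2 builds this uses aligned loads and an
// aligned store, which are only valid because of alignas(16) above.
inline Vec4 add(const Vec4 &a, const Vec4 &b)
{
    assert(reinterpret_cast<std::uintptr_t>(&a) % 16 == 0);
    assert(reinterpret_cast<std::uintptr_t>(&b) % 16 == 0);

    Vec4 r;
#ifdef VEC4_USE_SSE2
    const __m128 va = _mm_load_ps(&a.x);   // aligned load of x, y, z, w
    const __m128 vb = _mm_load_ps(&b.x);
    _mm_store_ps(&r.x, _mm_add_ps(va, vb)); // aligned store
#else
    // Scalar fallback for targets without SSE2.
    r.x = a.x + b.x;
    r.y = a.y + b.y;
    r.z = a.z + b.z;
    r.w = a.w + b.w;
#endif
    return r;
}
```

Because the loads are known to be aligned, a compiler targeting legacy (non-VEX) SSE encodings is free to fold them into the addps instruction as a memory operand instead of emitting separate loads.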
> I personally don’t think including qsimd.h (and implicitly immintrin.h) from
> our public headers would be a problem, but I’d be happy to hear arguments
> for/against it.
I don't think it is a problem either. I just don't want to be the one
documenting it ;)
> As a side note: SSE 4.1 offers some nice additional instructions that would
> simplify some of the operations. Should we keep the minimum requirement for
> SSE at version 2, or can we raise it to 4.1?
That would be great, especially for QtCore. Though we could start by just
making SSE4.1 the default while still offering users (Linux distros, really)
the option to force it down to SSE2 only.
You could do the same with NEON, but I think we already use that
unconditionally if detected at configure time.
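As a rough illustration of what "SSE4.1 by default, SSE2 if forced down" could look like in code, here is a sketch of a four-float dot product selected at compile time. dot4() is a hypothetical helper, not an existing Qt function, and the selection relies on the __SSE4_1__, __SSE2__ and __ARM_NEON macros that GCC and Clang predefine according to the target flags (MSVC does not define __SSE4_1__, so it would take the SSE2 path here).

```cpp
#include <cassert>

#if defined(__SSE4_1__)
#  include <smmintrin.h>
#elif defined(__SSE2__) || defined(_M_X64)
#  include <emmintrin.h>
#elif defined(__ARM_NEON)
#  include <arm_neon.h>
#endif

// Hypothetical helper: dot product of two arrays of four floats.
// The pointers do not need to be aligned; unaligned loads are used.
inline float dot4(const float *a, const float *b)
{
    assert(a != nullptr && b != nullptr);
#if defined(__SSE4_1__)
    // SSE4.1: one instruction. Mask 0xF1 multiplies all four lanes and
    // writes the sum to lane 0.
    const __m128 d = _mm_dp_ps(_mm_loadu_ps(a), _mm_loadu_ps(b), 0xF1);
    return _mm_cvtss_f32(d);
#elif defined(__SSE2__) || defined(_M_X64)
    // SSE2 only: multiply, then do the horizontal sum with shuffles.
    const __m128 m = _mm_mul_ps(_mm_loadu_ps(a), _mm_loadu_ps(b));
    __m128 shuf = _mm_shuffle_ps(m, m, _MM_SHUFFLE(2, 3, 0, 1)); // (m1, m0, m3, m2)
    __m128 sums = _mm_add_ps(m, shuf);   // lane 0: m0+m1, lane 2: m2+m3
    shuf = _mm_movehl_ps(shuf, sums);    // lane 0 now holds m2+m3
    sums = _mm_add_ss(sums, shuf);       // lane 0: m0+m1+m2+m3
    return _mm_cvtss_f32(sums);
#elif defined(__ARM_NEON)
    // NEON: multiply, add high and low halves, then a pairwise add.
    const float32x4_t m = vmulq_f32(vld1q_f32(a), vld1q_f32(b));
    float32x2_t s = vadd_f32(vget_low_f32(m), vget_high_f32(m));
    s = vpadd_f32(s, s);
    return vget_lane_f32(s, 0);
#else
    // Plain scalar fallback.
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3];
#endif
}
```

Building the same source with -msse4.1 or with only -msse2 picks the first or second branch, which is roughly the switch a distro would flip. Note that the branches may sum the products in a different order, so results can differ in the last bit for inputs that are not exactly representable.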
Regards
'Allan