[Interest] SIMD accelerated wrappers for relevant Qt container classes (QVector, ...) or QByteArray?

Thu May 18 13:51:41 CEST 2017

Thiago Macieira wrote:

> Then you should consider QVector3D and QVector4D. And Qt3D is adding SIMD-
> optimised versions of those two, along with that of QMatrix4x4. For now, they
> are private API, but if there's interest, we could consider making them
> public.

I've been dusting off my MacSTL fork, especially after the suggestion that
auto-vectorisation would have made the kind of acceleration it (MacSTL)
provides largely redundant.
Indeed, the benchmarking examples no longer show a gain across the board
(disregarding the trigonometry test, which approximates sin() and cos()). But
even leaving aside whether X86 ever showed the same across-the-board gain as
PPC with Altivec did, a number of operations still show significant speed
gains (and "elementary" ones at that).
The gain is particularly interesting with clang 4.0 with LTO (though some 
results seem too good to be true).
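
To make "elementary" concrete, here is a minimal sketch of the sort of
element-wise loop an auto-vectoriser is generally expected to handle without
hand-written SIMD. The name scaled_add is hypothetical; it is not taken from
MacSTL, Qt or the benchmark logs, and whether a given compiler actually
vectorises it depends on version and flags.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of MacSTL or Qt): y[i] = a * x[i] + y[i].
// A plain counted loop over contiguous floats is a typical candidate for
// auto-vectorisation at -O3 -march=native.
void scaled_add(float a, const std::vector<float>& x, std::vector<float>& y)
{
    // Only touch the range both vectors share.
    const std::size_t n = x.size() < y.size() ? x.size() : y.size();
    const float* xs = x.data();
    float* ys = y.data();
    for (std::size_t i = 0; i < n; ++i) {
        ys[i] = a * xs[i] + ys[i];
    }
}
```

With clang, building with -Rpass=loop-vectorize should report whether the loop
was vectorised, which is one way to check before reaching for an explicit SIMD
wrapper.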

clang-4.0 -O3 -march=native on a 2.7 GHz 2011 i7 CPU, OS X 10.9.5:
https://github.com/RJVB/MacSTL/blob/master/bm-mstl2-clang40-i7-2.log
ditto, + LTO:
https://github.com/RJVB/MacSTL/blob/master/bm-mstl2-clang40lto-i7-2.log

> Also, why not Eigen?

Or Vc. Mostly because of the need to change APIs. That is not to say there are
no places where such a change would be more than justified, of course.

R.