[Development] What's Q_PRIMITIVE_TYPE for?

Lars Knoll lars.knoll at qt.io
Thu Nov 12 15:13:50 CET 2020


On 12 Nov 2020, at 13:13, Giuseppe D'Angelo via Development <development at qt-project.org> wrote:

On 12/11/20 10:36, Lars Knoll wrote:
On 12 Nov 2020, at 03:10, Thiago Macieira <thiago.macieira at intel.com> wrote:

On Wednesday, 11 November 2020 10:14:26 PST Giuseppe D'Angelo via Development
wrote:
Hi,

On 11/11/2020 18:14, Thiago Macieira wrote:
So my recommendation is:
 1) deprecate Q_PRIMITIVE_TYPE and rename to Q_TRIVIAL_TYPE
 2) *not* use memset-to-zero construction anywhere

#2 implies changing QPodArrayOps, which does use memset, to use a loop
calling the default constructor. Two of the four compilers do optimise
that into a call to memset: https://gcc.godbolt.org/z/Ks3M5h. And
there's nothing the ICC team likes to work on more than losing on a
benchmark.
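
A minimal sketch of the loop-based alternative described above, assuming a hypothetical value class `Point` and a hypothetical helper `valueConstructN` (neither is Qt code, and this is not how QPodArrayOps is written). It only illustrates that the elements end up zeroed because the constructor ran, not because of an explicit memset; whether the compiler folds the loop into a memset call is up to the optimiser.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <new>

// Hypothetical value class: trivially copyable, but not trivial, because the
// default member initialisers make its default constructor non-trivial.
struct Point {
    int x = 0;
    int y = 0;
};

// Hypothetical helper, not Qt's QPodArrayOps: value-construct n objects in raw
// storage with a plain loop and placement new. Returns the first element.
template <typename T>
T *valueConstructN(void *storage, std::size_t n)
{
    T *first = static_cast<T *>(storage);
    for (std::size_t i = 0; i < n; ++i)
        ::new (static_cast<void *>(first + i)) T();
    return first;
}

// Fills a buffer with garbage, constructs Points over it, and checks that
// every member was reset by the constructor rather than by an explicit memset.
bool constructedPointsAreZero()
{
    constexpr std::size_t n = 8;
    alignas(Point) unsigned char storage[n * sizeof(Point)];
    std::memset(storage, 0xAB, sizeof storage);
    Point *p = valueConstructN<Point>(storage, n);
    for (std::size_t i = 0; i < n; ++i)
        if (p[i].x != 0 || p[i].y != 0)
            return false;
    return true;
}
```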

The problem is that ~100% of our value classes are not trivial, because
we always initialize our data members. So, we need type traits anyhow to
distinguish between primitive/relocatable/complex; and I am against
calling it "Q_TRIVIAL_TYPE" because this property has now nothing to do
with pure triviality.

Understood, but then what's the harm of using Q_RELOCATABLE_TYPE for them?
Asked differently: if those classes initialise the members to a non-zero
value, why is memset with zero acceptable as a construction?

I was actually wondering about one idea I had for primitive types and QList. Since we do not need to call constructors or destructors there, I could significantly reduce the template bloat for those types by moving all implementations into non-inline methods that take the size of the type as one additional argument.

That idea does conflict to some extent with the idea of getting rid of memset().


Apart from this:

what would the out-of-line call do? Be exactly one line that calls memset()? Just leave it as-is then...

I’m not talking about one memset, but rather the implementation of e.g. insert(int pos, T *data, qsizetype n) and friends.
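
To make the idea concrete, here is a rough sketch of what such a size-parameterised, out-of-line insert could look like. The names `rawInsert` and `insertPrimitive` are hypothetical and the signature is an assumption for illustration, not the QList API; the sketch also assumes the caller has already reserved enough capacity and that the source range does not overlap the destination.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical out-of-line worker: one compiled copy serves every primitive T,
// because it only needs the element size, never the element type.
// Preconditions (assumed): the buffer has room for count + n elements and
// src does not overlap the buffer.
void rawInsert(void *data, std::size_t count, std::size_t pos,
               const void *src, std::size_t n, std::size_t elemSize)
{
    unsigned char *gap = static_cast<unsigned char *>(data) + pos * elemSize;
    // Shift the tail up to open a gap of n elements, then fill the gap.
    std::memmove(gap + n * elemSize, gap, (count - pos) * elemSize);
    std::memcpy(gap, src, n * elemSize);
}

// Thin typed wrapper: the only code instantiated per element type.
template <typename T>
void insertPrimitive(T *data, std::size_t count, std::size_t pos,
                     const T *src, std::size_t n)
{
    static_assert(std::is_trivially_copyable<T>::value,
                  "byte-wise insertion is only valid for trivially copyable types");
    rawInsert(data, count, pos, src, n, sizeof(T));
}
```

The trade-off is that the worker can only use memmove/memcpy, which is why this approach sits uneasily with dropping byte-wise operations for primitive types.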


*Some* trivial types can be initialized via memset(0), but not all of
them, so the set of primitive types (according to our current
definition) and the set of trivial types are intersecting (*).

I propose we initialise none with memset. Don't try.

In theory we could just rely on the optimizer to turn
std::uninitialized_value_construct_n into a memset(0). (If you have an
out of line constructor that does 0-bit initialization, and the compiler
doesn't see it and do the transformation, you don't have my sympathies.)

This would, in principle, allow for unifying handling of primitive and
relocatable types:

* Construction: use uninitialized_value_construct
  * Primitive: the compiler figures out it's a memset()
  * Relocatable: call the default constructor (and possibly the
compiler figures out it's a memset())

* Copy: just use std::uninitialized_copy
  * Primitive: the compiler figures out it's a memcpy()
  * Relocatable: call the copy constructor (possibly the compiler
figures out it's a memcpy())

* Move: just use std::uninitialized_move
  * Primitive: the compiler figures out it's a memcpy()
  * Relocatable: call the move constructor (possibly the compiler
figures out it's a memcpy())

* Destruction: just use std::destroy
  * Primitive: compiler does nothing
  * Relocatable: call the destructors (possibly do nothing if trivial)
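
The unified scheme in the list above can be sketched with the C++17 uninitialized-memory algorithms. The function `roundTrip` is a hypothetical test harness, not Qt code, and the move step uses `std::uninitialized_move_n` as an assumption about what was intended. The same template body is used for a primitive type (`int`) and a non-trivial one (`std::string`); any lowering to memset/memcpy is left to the compiler.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <string>

// Hypothetical harness: construct, copy, move and destroy n elements in raw
// storage using only the standard algorithms, with no trait-based dispatch.
template <typename T>
bool roundTrip(std::size_t n)
{
    T *a = static_cast<T *>(::operator new(n * sizeof(T)));
    T *b = static_cast<T *>(::operator new(n * sizeof(T)));
    T *c = static_cast<T *>(::operator new(n * sizeof(T)));

    std::uninitialized_value_construct_n(a, n); // construction
    std::uninitialized_copy_n(a, n, b);         // copy
    std::uninitialized_move_n(b, n, c);         // move

    bool ok = true;
    for (std::size_t i = 0; i < n; ++i)
        ok = ok && (c[i] == T());

    // Destruction: moved-from elements in b are still live objects.
    std::destroy_n(a, n);
    std::destroy_n(b, n);
    std::destroy_n(c, n);
    ::operator delete(a);
    ::operator delete(b);
    ::operator delete(c);
    return ok;
}
```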

Agreed, except for the part of using the Standard Library functions. Just loop
around the block of memory and call the proper constructors using placement
new or the destructor.

I tend to agree, esp. as some of those methods contain rollback code for exceptions that we know can never happen for primitive types.

I don't agree: if the point of the whole exercise is trusting the compiler to turn loops filling bytes into calls to memset, then surely we can also trust the compiler to ditch exception-handling code if it figures out no exceptions can possibly escape (it's dead code).


But as far as I can tell, compilers do not do these transformations as
aggressively as we'd like. So we still have a distinct advantage at
using the trait, at least for the Qt 6 lifetime. Take for instance

QStringView:
https://gcc.godbolt.org/z/6Taoo4

GCC, ICC, MSVC don't optimize anything. Clang chokes on the
(pointer,int) scenario, but only if the initialization goes through a
constructor. Don't ask me why.

(pointer,int) isn't applicable for us any more because we don't have that
case. The reason that Clang chokes is because of the 4-byte tail-padding in
that structure. When you define a non-default constructor, the compiler
decided to expand to the exact code you wrote, instead of taking the liberty
of writing to those padding bits (which *you* can't do, but the compiler can).


There is absolutely zero difference as far as the language is concerned between the two (pointer,int) variants. It's a plain missed optimization, probably because Clang hardcoded some conditions regarding when to apply the transformation to memset, and also one that will bite us (because we do use constructors and have pointer-int pairs, hi QModelIndex). Client code not yet ported to qsizetype will also have them.

The fact that no other compiler also applies these transformations is a bit worrying regarding the "don't worry, compilers will figure this out".


I think having a trait that effectively requires "this type can be constructed
by a memset of 0" is risky, even if we've been doing it all along. First,
because it depends on the representation of pointers (that NULL is zero bits)
and it's not impossible for some pointer to member object (PMO) to exist in the class, unnoticed.
It requires the developer to know the byte representation of their class.
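
A small probe can illustrate the concern. The helper `zeroBytesEqualValueInit` and the struct `Widget` are hypothetical names for this sketch. For `int` the answer is true; for a pointer to data member such as `int Widget::*` the answer depends on the ABI (under the Itanium C++ ABI used by GCC and Clang, the null value is represented as -1, so all-zero bytes are not null).

```cpp
#include <cassert>
#include <cstring>

struct Widget { int a; int b; }; // hypothetical class used for the member pointer

// Hypothetical probe: does an object whose bytes are all zero compare equal
// to a value-initialised object of the same type? Intended for trivially
// copyable types only.
template <typename T>
bool zeroBytesEqualValueInit()
{
    T fromZeroBytes;
    std::memset(&fromZeroBytes, 0, sizeof(T));
    return fromZeroBytes == T();
}

// Example instantiations (results not asserted here for the member pointer,
// because they are ABI-dependent):
//   zeroBytesEqualValueInit<int>()
//   zeroBytesEqualValueInit<int Widget::*>()
```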

I agree that this is tricky. We’re still using some other tricks for primitive types, like doing memcpy instead of copy constructors. Those make quite a difference in performance and I don’t think we should ditch this.

One of the two: either one trusts the compiler to use memcpy here too, or if one doesn't trust the compiler here, then one shouldn't trust it to turn default constructors into memset. Which one is it?


That all having been said: we're at RC time and I don't think this is 6.0 material any more. Q_PRIMITIVE_TYPE should not be automatic for any user-defined type anyhow (after my patch goes in), if you opt-in, you better make sure that 0-init makes sense for you...

The choice here is whether we use a loop to default initialise the element instead of memset (which is actually a rather rare use case in QList) versus pessimising things for a much larger range of objects.

That’s why I think we should be using the loop in this case and hope for the best, so that we can keep the other optimisations we have for a much larger range of classes.

is_trivial() explicitly states that you can memcpy/memmove objects and I really want to keep this. With your change, those will by default fall back to the complex case, slowing down operations on that list.
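
As an illustration of the optimisation being defended here, a trait-based dispatch might look like the following sketch. It uses `std::is_trivially_copyable` as a stand-in for the trait (an assumption; Qt's own classification goes through its type-info machinery and Q_DECLARE_TYPEINFO, which is not shown), and `copyConstructRange` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <new>
#include <string>
#include <type_traits>

// Hypothetical helper: copy-construct n elements from src into raw storage dst.
// Types that permit byte-wise copying take the memcpy path; everything else
// falls back to the "complex" per-element path.
template <typename T>
void copyConstructRange(const T *src, std::size_t n, T *dst)
{
    if constexpr (std::is_trivially_copyable_v<T>) {
        if (n != 0)
            std::memcpy(static_cast<void *>(dst), src, n * sizeof(T));
    } else {
        for (std::size_t i = 0; i < n; ++i)
            ::new (static_cast<void *>(dst + i)) T(src[i]);
    }
}
```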

Cheers,
Lars

