[Interest] Heavily Commented Example: Simple Single Frontend with Two Backends

Wed Oct 24 15:17:34 CEST 2012

2012/10/24 Daniel Bowen <qtmailinglist1 at bowensite.com>:
>> ...
> I'm also quite interested in this topic. There are a handful of places where
> I've used a similar pattern
> ...
> bool A::stopAndWait(unsigned long timeoutMs)
> {
>         m_stop = true;
>         m_waitCondition.wakeAll();
>         return this->wait(timeoutMs);
> }
>
> So, in that case, I've typically declared
> volatile bool m_stop;

That's /exactly/ how I used this pattern so far.

> With the intention of using volatile for "don't optimize the read away".
>
> After reading some various links like the Herb Sutter article referenced
> earlier, a co-worker believes that volatile in this case is unnecessary.
> i.e., just use
> bool m_stop;

After what I've read and learnt so far, I think for a "simple" type
like bool (or int, or float for that matter - and please note my
carefully chosen word "think" ;)) the volatile is still necessary!
Simply to tell the *compiler* "don't optimise the read away!"

We have seen so far - at least I *think* we did - that the CPU might
decide to read it from some Core/CPU specific cache instead of from
main RAM (or the "common" Level 3 cache), but at least on x86
architecture those caches are kept in sync, or the ABI will make sure
that they are, and on ARM they *eventually* will be updated - at some
magic point in time (and on other architectures such as SPARC or
Alpha, who knows...).

So if you have

void MyWorkerThread::doFoo()
{
  m_stop = false;
  while (!m_stop) { /* ... do some work ... */ }
}

and m_stop was /not/ volatile, the *compiler* might decide to optimise
away the read from 'm_stop', as it would figure out that m_stop was
never modified inside the loop, resulting in code equivalent to:

void MyWorkerThread::doFoo()
{
  while (!false) {... endless loop ...}
}

Now again I think that if m_stop was of type QAtomicInt - a class, or
a "non-simple" data type - and you had

  m_stopAtomicInt = 0;
  while (m_stopAtomicInt == 0) {...}

then the compiler would not optimise that read away, because in fact a
method call of the overloaded operator == was involved, so in such a
case volatile would not be necessary - in my opinion.

(Disclaimer: I haven't studied the QAtomicInt API in detail, so I am
not sure if the above would actually make sense - but my point was
rather whether "volatile" was needed in the declaration of
m_stopAtomicInt or not).
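
To make the QAtomicInt variant above a bit more concrete, here is a
minimal sketch. It assumes C++11 and uses std::atomic<int> as a
stand-in for QAtomicInt so that it compiles without Qt; the names
Worker, requestStop, iterations and runOnce are made up for
illustration. No volatile appears in the declaration: the loop
condition performs an atomic load on every pass, and the C++ standard
asks implementations to make atomic stores visible to other threads
within a reasonable amount of time.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical worker; std::atomic<int> stands in for QAtomicInt.
class Worker {
public:
    void start() {
        m_stop = 0;
        m_thread = std::thread([this] { doFoo(); });
    }

    // Called from the controlling thread.
    void requestStop() { m_stop = 1; }

    void join() { if (m_thread.joinable()) m_thread.join(); }

    long iterations() const { return m_iterations; }

private:
    void doFoo() {
        // The comparison converts the atomic to int, i.e. an atomic load.
        while (m_stop == 0) {
            ++m_iterations;   // placeholder for the real work
        }
    }

    std::atomic<int>  m_stop{0};
    std::atomic<long> m_iterations{0};
    std::thread       m_thread;
};

// Starts a worker, waits until it has looped at least once, stops it.
long runOnce() {
    Worker w;
    w.start();
    while (w.iterations() == 0)
        std::this_thread::yield();
    w.requestStop();
    w.join();
    long n = w.iterations();
    assert(n > 0);            // the worker ran before it saw the flag
    return n;
}
```

Calling runOnce() returns once the worker thread has observed the flag
and left its loop; how many passes it made before that is not
predictable.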

In the following, please prefix each of my statements with "In my
opinion", until someone confirms them ;)

> So, I'd like to understand the possibilities for the best cross-platform
> code.  Going "one time too many" is OK.
> - Is reading and writing to a bool m_stop (no volatile) without a mutex
> locked OK? Or could the read of the member variable m_stop realistically be
> optimized away?

The compiler could optimise it away (see above), so a volatile is necessary.

> - Is reading and writing to a volatile bool m_stop without a mutex locked
> OK?

Yes, because at least on x86 ("sync'ed caches") and on ARM ("caches
*will* be synced eventually") with the current use case it is okay
when we *eventually* see that the flag has been set. No worries if we
do "a few loops too many".

Also out-of-order CPU instruction execution is no problem, because
again, eventually the flag will be set and the worker thread will see
it.
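
The "eventually visible is good enough" requirement described above
can also be written down explicitly. The following is only a sketch,
assuming a C++11 compiler with std::atomic is available; g_stop,
requestStop, shouldStop and workUntilStopped are illustrative names.
Relaxed ordering asks for an atomic read and write of the flag but no
ordering of the surrounding memory accesses, which matches a use case
that tolerates a few loops too many.

```cpp
#include <atomic>
#include <cassert>

// Stop flag shared between the controlling thread and the worker.
std::atomic<bool> g_stop{false};

// Controlling thread: set the flag, no ordering guarantees requested.
void requestStop() { g_stop.store(true, std::memory_order_relaxed); }

// Worker thread: poll the flag, no ordering guarantees requested.
bool shouldStop() { return g_stop.load(std::memory_order_relaxed); }

// Worker loop: runs until the flag is observed, returns the pass count.
long workUntilStopped() {
    long passes = 0;
    while (!shouldStop())
        ++passes;             // placeholder for the real work
    return passes;
}
```

If the worker also has to see data written before the flag was set,
relaxed ordering is not enough and the default (sequentially
consistent) store and load would be the simpler choice.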

> - If no locking is OK, is volatile better for m_stop, or does it not matter
> (and just causes a little bit slower execution for the read/write of
> m_stop)?

Similar question as #1: it is not only better, but required.

> - Is not using volatile on bool m_stop OK, but both reading and writing to
> m_stop should be done with a mutex locked?

According to some previous post the mutex acquisition implicitly does
a "memory/cache syncing". But the compiler could still decide to
"optimise the bool away" - be it protected by a mutex or not.

> - If m_stop is only read or written with a mutex locked, could the value
> ever be stale that is read (causing "one time too many")?

See previous question: in my understanding of what has been stated
previously in this "thread" the implementation of QMutex will make
sure that memory is sync'ed before you enter the Critical Section.
Given the apparent complexity of that matter the question is of
course: "on which platforms (which don't do that themselves already
anyway) is that guaranteed?"
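
For the mutex-protected variant, a minimal sketch could look like the
following. It assumes std::mutex as a stand-in for QMutex (with
std::lock_guard playing the role of QMutexLocker); the class name
LockedStopFlag is invented for this example. In standard C++,
unlocking a mutex synchronizes with the next lock of the same mutex,
so a read made under the lock sees the latest write made under the
lock.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stop flag that is only touched while m_mutex is held.
class LockedStopFlag {
public:
    // Controlling thread: set the flag under the lock.
    void request() {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_stop = true;
    }

    // Worker thread: read the flag under the same lock.
    bool isSet() const {
        std::lock_guard<std::mutex> lock(m_mutex);
        return m_stop;
    }

private:
    mutable std::mutex m_mutex;  // mutable so isSet() can stay const
    bool m_stop = false;         // plain bool, no volatile
};
```

A worker loop would then read "while (!flag.isSet()) { ... }", paying
for one lock and unlock per pass.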

> - What if m_stop was QAtomicInt instead of bool?

My first understanding was that QAtomicInt would also make sure
that memory gets synced (also on those platforms where the
architecture does not do it automagically), but Thiago issued some
comments later which made me unsure again: something like "it works
(or does not work at all) differently on Qt 4 than on Qt 5".

Anyway, the conclusion for me so far about this *very interesting topic* is that

- it is okay to use a simple boolean ("for that stop use-case it works
on x86 and most likely even on ARM")
- but declare it volatile ("such that the compiler does not optimise it away")

Which is in contrast again to what I said previously - but using
QAtomicInt might be a safer bet still, so using that instead of a
"volatile bool" might be even better, especially in the future (Qt 5)
and on other platforms as well (so as to really avoid stale caches).

Cheers, Oliver