[Interest] Heavily Commented Example: Simple Single Frontend with Two Backends
serge.borovkov at gmail.com
Tue Oct 23 21:58:41 CEST 2012
I did some googling and found this.
So you are saying that it's OK to use volatile on x86_64 here since there is
no actual synchronization here? Or am I missing something, and the cases
are different? Sorry for nitpicking, I am just trying to fully understand
what's happening in this case.
On Tue, Oct 23, 2012 at 10:24 PM, Thiago Macieira <thiago.macieira at intel.com> wrote:
> On Tuesday, 23 October 2012 12:14:37, Till Oliver Knoll wrote:
> > - Use "QAtomicInt" instead of a "volatile bool" (for the simple "Stop
> > thread" use case) or
> The problems with volatile bool:
> 1) volatile wasn't designed for threading.
> It was designed for memory-mapped I/O. Its purpose is to make sure that there
> are no more and no fewer reads from the variable and writes to it than what
> the code does. If I write:
> a = 1;
> a = 1;
> I want the compiler to store 1 twice. If this is MMIO, then I might need the
> value of 0x01 sent twice over my I/O device.
> For threading, however, that's irrelevant. Storing the same value twice,
> especially sequentially like that, makes no sense. I won't bother explaining
> why because you can see it with little thought.
> What's more, CPU architectures don't work like that either. Writes are buffered
> and then sent to the main RAM and other CPUs later, in bursts. Writing twice
> to memory, especially sequentially, will almost certainly result in RAM being
> written to only once. And besides, there's no way to detect that a location in
> memory has been overwritten with the same value.
> For those reasons, the semantics of volatile don't match the needs of threading.
> 2) volatile isn't atomic.
> a) for all types
> All CPU architectures I know of have at least one size that they can read and
> write in a single operation. It's the machine word, which usually corresponds
> to the register size.
> Complex and modern CPUs are often able to read and write data types of
> different sizes in atomic operations, but there are many examples of CPUs that
> can't do it. The only way to store an 8-bit value is to load the entire word
> where that 8-bit value is located, merge it in and then store the full word. A
> read-modify-write sequence is definitely not an atomic store.
> The C++ bool type is 1 byte in size, so it suffers from this problem. So
> we have a conclusion: you'd never use volatile bool, you'd use volatile
> sig_atomic_t (a type that is required by POSIX to have atomic loads and stores).
> b) for all operations
> Even if you follow the POSIX recommendations and use a sig_atomic_t for your
> variable, most other operations aren't atomic. On most architectures,
> incrementing and decrementing isn't atomic. And if you're trying to do
> synchronisation, you often need higher operations like fetch-and-add,
> compare-and-swap or simple swap.
> 3) volatile does not (usually) generate memory barriers
> There are two types of memory barriers: compiler and processor ones. Take the
> following code:
> value = 123456;
> spinlock = 0;
> Where spinlock is a volatile int. Two levels of things might go wrong here:
> first, since there's no compiler barrier, the compiler might generate code that
> stores the 0 to the spinlock (unlocking it) before it generates the code that
> saves the more complex value to the other variable.
> I'm not even talking hypotheticals or obscure architectures. This is what an
> ARMv7 compiler generated for me:
> movw r1, #57920
> mov r0, #0
> movt r1, 1
> str r0, [r2, #0]
> str r1, [r3, #0]
> This example was intentional because I knew that ARM can't load a large value
> into a register in a single instruction. Loading 123456 requires two
> instructions (move and move top). So I expected the compiler to schedule the
> saving of 0 before the saving of the more complex value, and it did.
> And even when it does schedule things in the correct order, the memory barrier
> might be missing. Taking again the example of ARMv7, saving a zero to the value
> and unlocking the mutex:
> mov r1, #0
> str r1, [r2, #0]
> str r1, [r3, #0]
> The ARMv7 architecture, unlike x86, *does* allow the processor to write to
> main RAM in any order. That means another core could see the spinlock
> being unlocked *before* the new value is stored, even if the compiler
> generated the proper instructions. It's missing the memory barrier instruction.
> The Qt 4 QAtomicInt API does not offer a load-acquire or a store-release
> operation. All reads and writes are non-atomic and may be problematic -- you
> can work around that by using a fetch-and-add of zero for load or a
> fetch-and-store for store.
> The Qt 5 API does offer the right functions and even requires you to think
> about it.
> The reason I said "usually" is because there is one architecture whose ABI
> requires acquire semantics for volatile loads and release semantics for
> volatile stores. That's IA-64, an architecture that was introduced after
> multithreading became mainstream and has a specific "load acquire" instruction
> anyway. The IA-64 manual explaining the memory ordering and barriers is one of
> the references I use to study the subject.
> 4) compilers have bugs
> In this case, there's little we can do but work around them. This problem was
> found by the kernel developers in GCC. They had a structure like:
> int field1;
> volatile int field2;
> On a 64-bit architecture, to modify "field1", the compiler generated a full
> read-modify-write of the full 64-bit word, including the overwriting of the
> volatile field. In other words, the compiler was clearly violating the
> specs, since it generated a write to a volatile that didn't exist in the
> source code.
> In this particular case, QAtomicInt wouldn't protect you.
> Thiago Macieira - thiago.macieira (AT) intel.com
> Software Architect - Intel Open Source Technology Center
> Interest mailing list
> Interest at qt-project.org