[Qt-interest] Error in QWaitCondition example or am I missing something?

Thiago Macieira thiago.macieira at trolltech.com
Fri Aug 21 11:44:55 CEST 2009


Em Sexta-feira 21 Agosto 2009, às 10:01:48, Christopher Dyken escreveu:
> The example is as follows:

Hi Christopher

This is a very simple example. It's artificially dumb, trying to show only 
QWaitCondition. The code is not meant to be taken literally, nor is it a good 
model on how to do producer-consumer.

In a proper producer-consumer situation, you'd use more than just 
QWaitCondition. At the very least here, I'd use a QSemaphore to indicate that 
all threads have finished working.

Moreover, I would never wait for all threads to be idle. A typical producer-
consumer requires one thread to work, not all three. But that's still a case 
for using QSemaphore.

> // consumer thread
>  forever {
>      mutex.lock();
>      keyPressed.wait(&mutex);
>      ++count;
>      mutex.unlock();
> 
>      do_something();
> 
>      mutex.lock();
>      --count;
>      mutex.unlock();
>  }
> 
> // producer thread
>  forever {
>      getchar();
> 
>      mutex.lock();
>      // Sleep until there are no busy worker threads
>      while (count > 0) {
>          mutex.unlock();
>          sleep(1);
>          mutex.lock();
>      }
>      keyPressed.wakeAll();
>      mutex.unlock();
>  }
> 
> The "count" variable makes sure that all consumers are finished
> processing before the next character is processed.
> 
> Consider the following scheduling scenario:
> 
> 1 producer thread - fetch a char, lock mutex, count ==0, wake all
> waiting threads (which is zero), unlock mutex, scheduled out at end of
> iteration
> 2 consumer thread 0 - lock mutex, unlock and wait
> 3 consumer thread 1 - lock mutex, unlock and wait
> 4 consumer thread 2 - lock mutex, unlock and wait

2, 3 and 4 don't happen at this stage. I think you have the order inverted.

It should be (using your numbers):
	2
	3
	4
	1

Remember that the threads are started before the producer thread enters the 
"forever" loop. So they should all be sleeping on "wait" before getchar 
happens.

The example doesn't show it, but there has to be a pre-loop synchronisation 
step. You must ensure that all threads are ready to be used before the 
producer starts producing.

That can be accomplished by initialising count to 3 and rewriting the consumer 
thread loop to this:

// consumer thread
 forever {
     mutex.lock();
     --count;
     keyPressed.wait(&mutex);
     ++count;
     mutex.unlock();

     do_something();
 }

So, assuming that the synchronisation was done, and the three consumer threads 
are waiting before your step 1, your step 5 needs to be rewritten and is out 
of order:

> 5 producer thread - fetch the second char, lock mutex, count == 0,
> wake all waiting threads (which is three), unlock mutex
> 6 consumer thread 0 - woken, process the second char of input,
> scheduled out at end of iteration
> 7 consumer thread 1 - woken, process the second char of input,
> scheduled out at end of iteration
> 8 consumer thread 1 - woken, process the second char of input,
> scheduled out at end of iteration

Steps 6, 7 and 8 happen before 5. The three consumer threads are woken up and 
are waiting on the mutex lock.

As soon as the producer thread unlocks it at the end of step 1, those three 
threads capture it and start working. There's a race condition here, but I'll 
get back to it later.

So step 5 is:
5 producer thread - fetch the second char, lock mutex, count > 0, enter loop

Steps 6, 7 and 8 are the three consumer threads waking up and processing the 
first character. In the mean time, the producer thread has read the second 
character and is in the while (count > 0) loop.

When all three consumer threads finish do_something and decrease count (i.e., 
steps 6, 7 and 8 are finished), the producer thread exits the while loop and 
calls wakeAll again.

At that point, the consumer threads start working on the *second* character, 
while the producer thread moves on to reading the third. So step 9 (which is 
wrong too) becomes;

> 9 producer thread - fetch the third char, count == 0, wake all waiting
> threads (which is zero)

9 producer thread - fetch the third char, lock mutex, count > 0, enter loop

Which is identical to step 5. We've entered the continuous regime here.

> and so on. As I see it, only every other char is processed in this
> case. The problem is that if a wake-call is issued between the
> consumers unlock the mutex after --count; and before the lock on mutex
> before keyPressed.wait, the consumer will miss out on that iteration.
> 
> At least this is how I see it. Maybe am I missing something here?

If I rewrite the order of things, maybe this makes more sense now:

0  consumer threads started, they all reach wait()
1  producer thread - fetch the first char, lock mutex, count == 0, wakes all 
threads
2  consumer thread 1 - woken up, blocks on mutex
3  consumer thread 2 - woken up, blocks on mutex
4  consumer thread 3 - woken up, blocks on mutex
5  producer thread - unlocks mutex, fetches second char
6  consumer thread 1 - ++count, do_something
7  consumer thread 2 - ++count, do_something
8  consumer thread 3 - ++count, do_something
9  producer thread - locks mutex, count > 0, enters loop
10 consumer thread 1 - work finishes, lock mutex, --count
11 consumer thread 2 - work finishes, lock mutex, --count
12 consumer thread 3 - work finishes, lock mutex, --count
13 producer thread - count == 0, wakeAll
14 go to step 2

The problem that I can see in the example is the race condition I mentioned 
before.

When the producer calls wakeAll, all consumer threads are scheduled for 
execution. They however block on the mutex, waiting for the producer to let 
them go.

When the producer calls unlock(), the consumer threads are allowed to continue 
executing. The problem is that the producer thread can too. So it's entirely 
possible that the producer thread will do that (since it's actually executing 
already). It will read the second character, lock the mutex and will see count 
== 0. So it will call wakeAll() again, but this time there are no threads 
waiting to be woken up.

When it unlocks() again, the same happens.

So it's perfectly possible that the producer will continue running until it 
runs out of its CPU timeslice, without letting any of the consumer threads run 
and process. So it's possible, with this example, to have data being lost.

To prevent that data loss, we need another synchronisation somewhere. It's not 
shown in the example: it has to be in getchar() and do_something(). For 
example, getchar() could be writing to a queue, which means there is no data 
loss.

But the point is that the QWaitCondition documentation is not complete: it's 
artificially simple, in order to illustrate *only* QWaitCondition. Please check 
the actual examples supplied with Qt to see a more complete producer-consumer 
system.

-- 
Thiago Macieira - thiago.macieira (AT) nokia.com
  Senior Product Manager - Nokia, Qt Development Frameworks
     Sandakerveien 116, NO-0402 Oslo, Norway

Qt Developer Days 2009 | Registration Now Open!
Munich, Germany: Oct 12 - 14     San Francisco, California: Nov 2 - 4
      http://qt.nokia.com/qtdevdays2009
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: This is a digitally signed message part.
Url : http://lists.qt-project.org/pipermail/qt-interest-old/attachments/20090821/a990cfa7/attachment.bin 


More information about the Qt-interest-old mailing list