[Qt-interest] Error in QWaitCondition example or am I missing something?
Thiago Macieira
thiago.macieira at trolltech.com
Fri Aug 21 11:44:55 CEST 2009
Em Sexta-feira 21 Agosto 2009, às 10:01:48, Christopher Dyken escreveu:
> The example is as follows:
Hi Christopher
This is a very simple example. It's artificially dumb, trying to show only
QWaitCondition. The code is not meant to be taken literally, nor is it a good
model on how to do producer-consumer.
In a proper producer-consumer situation, you'd use more than just
QWaitCondition. At the very least here, I'd use a QSemaphore to indicate that
all threads have finished working.
Moreover, I would never wait for all threads to be idle. A typical producer-
consumer requires one thread to work, not all three. But that's still a case
for using QSemaphore.
> // consumer thread
> forever {
> mutex.lock();
> keyPressed.wait(&mutex);
> ++count;
> mutex.unlock();
>
> do_something();
>
> mutex.lock();
> --count;
> mutex.unlock();
> }
>
> // producer thread
> forever {
> getchar();
>
> mutex.lock();
> // Sleep until there are no busy worker threads
> while (count > 0) {
> mutex.unlock();
> sleep(1);
> mutex.lock();
> }
> keyPressed.wakeAll();
> mutex.unlock();
> }
>
> The "count" variable makes sure that all consumers are finished
> processing before the next character is processed.
>
> Consider the following scheduling scenario:
>
> 1 producer thread - fetch a char, lock mutex, count ==0, wake all
> waiting threads (which is zero), unlock mutex, scheduled out at end of
> iteration
> 2 consumer thread 0 - lock mutex, unlock and wait
> 3 consumer thread 1 - lock mutex, unlock and wait
> 4 consumer thread 2 - lock mutex, unlock and wait
2, 3 and 4 don't happen at this stage. I think you have the order inverted.
It should be (using your numbers):
2
3
4
1
Remember that the threads are started before the producer thread enters the
"forever" loop. So they should all be sleeping on "wait" before getchar
happens.
The example doesn't show it, but there has to be a pre-loop synchronisation
step. You must ensure that all threads are ready to be used before the
producer starts producing.
That can be accomplished by initialising count to 3 and rewriting the consumer
thread loop to this:
// consumer thread
forever {
mutex.lock();
--count;
keyPressed.wait(&mutex);
++count;
mutex.unlock();
do_something();
}
So, assuming that the synchronisation was done, and the three consumer threads
are waiting before your step 1, your step 5 needs to be rewritten and is out
of order:
> 5 producer thread - fetch the second char, lock mutex, count == 0,
> wake all waiting threads (which is three), unlock mutex
> 6 consumer thread 0 - woken, process the second char of input,
> scheduled out at end of iteration
> 7 consumer thread 1 - woken, process the second char of input,
> scheduled out at end of iteration
> 8 consumer thread 1 - woken, process the second char of input,
> scheduled out at end of iteration
Steps 6, 7 and 8 happen before 5. The three consumer threads are woken up and
are waiting on the mutex lock.
As soon as the producer thread unlocks it at the end of step 1, those three
threads capture it and start working. There's a race condition here, but I'll
get back to it later.
So step 5 is:
5 producer thread - fetch the second char, lock mutex, count > 0, enter loop
Steps 6, 7 and 8 are the three consumer threads waking up and processing the
first character. In the mean time, the producer thread has read the second
character and is in the while (count > 0) loop.
When all three consumer threads finish do_something and decrease count (i.e.,
steps 6, 7 and 8 are finished), the producer thread exits the while loop and
calls wakeAll again.
At that point, the consumer threads start working on the *second* character,
while the producer thread moves on to reading the third. So step 9 (which is
wrong too) becomes;
> 9 producer thread - fetch the third char, count == 0, wake all waiting
> threads (which is zero)
9 producer thread - fetch the third char, lock mutex, count > 0, enter loop
Which is identical to step 5. We've entered the continuous regime here.
> and so on. As I see it, only every other char is processed in this
> case. The problem is that if a wake-call is issued between the
> consumers unlock the mutex after --count; and before the lock on mutex
> before keyPressed.wait, the consumer will miss out on that iteration.
>
> At least this is how I see it. Maybe am I missing something here?
If I rewrite the order of things, maybe this makes more sense now:
0 consumer threads started, they all reach wait()
1 producer thread - fetch the first char, lock mutex, count == 0, wakes all
threads
2 consumer thread 1 - woken up, blocks on mutex
3 consumer thread 2 - woken up, blocks on mutex
4 consumer thread 3 - woken up, blocks on mutex
5 producer thread - unlocks mutex, fetches second char
6 consumer thread 1 - ++count, do_something
7 consumer thread 2 - ++count, do_something
8 consumer thread 3 - ++count, do_something
9 producer thread - locks mutex, count > 0, enters loop
10 consumer thread 1 - work finishes, lock mutex, --count
11 consumer thread 2 - work finishes, lock mutex, --count
12 consumer thread 3 - work finishes, lock mutex, --count
13 producer thread - count == 0, wakeAll
14 go to step 2
The problem that I can see in the example is the race condition I mentioned
before.
When the producer calls wakeAll, all consumer threads are scheduled for
execution. They however block on the mutex, waiting for the producer to let
them go.
When the producer calls unlock(), the consumer threads are allowed to continue
executing. The problem is that the producer thread can too. So it's entirely
possible that the producer thread will do that (since it's actually executing
already). It will read the second character, lock the mutex and will see count
== 0. So it will call wakeAll() again, but this time there are no threads
waiting to be woken up.
When it unlocks() again, the same happens.
So it's perfectly possible that the producer will continue running until it
runs out of its CPU timeslice, without letting any of the consumer threads run
and process. So it's possible, with this example, to have data being lost.
To prevent that data loss, we need another synchronisation somewhere. It's not
shown in the example: it has to be in getchar() and do_something(). For
example, getchar() could be writing to a queue, which means there is no data
loss.
But the point is that the QWaitCondition documentation is not complete: it's
artificially simple, in order to illustrate *only* QWaitCondition. Please check
the actual examples supplied with Qt to see a more complete producer-consumer
system.
--
Thiago Macieira - thiago.macieira (AT) nokia.com
Senior Product Manager - Nokia, Qt Development Frameworks
Sandakerveien 116, NO-0402 Oslo, Norway
Qt Developer Days 2009 | Registration Now Open!
Munich, Germany: Oct 12 - 14 San Francisco, California: Nov 2 - 4
http://qt.nokia.com/qtdevdays2009
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: This is a digitally signed message part.
Url : http://lists.qt-project.org/pipermail/qt-interest-old/attachments/20090821/a990cfa7/attachment.bin
More information about the Qt-interest-old
mailing list