[Interest] [Qt-interest] Regarding dependent orthogonal/parallel states in Qt statemachine

K. Frank kfrank29.c at gmail.com
Tue Jan 3 22:32:44 CET 2012


Hello Srikanth!

Happy New Year to You, as well, and to all Qt Aficionados!

(I am cross-posting to "interest at qt-project.org" because I think that
there is some plan to deprecate "qt-interest at qt.nokia.com".)

On Tue, Jan 3, 2012 at 12:28 PM,  <srikanth.1.kyatham at nokia.com> wrote:
> Hello Frank,
>
>  Happy new year. Thanks for your answers, I couldn't reply soon as I was away for few days.
>
> Again
> "For complicated problems -- those real problems where
> state machines can actually be helpful -- putting all of your state into
> the state machine's official state variable -- can lead to overly complicated
> state machines that suffer from an explosion of states.  In such cases
> putting some of your "state" data into "extended state" variables can
> help you manage this problem.  How to divide up your "state" information
> between the the state machine's official state variable and its extended
> state data is one of the key design issues you face when designing a
> state machine.  And like all such design trade-offs, there is no a priori
> right answer -- how to balance the trade-off will depend on the details
> of the specific problem you are trying to solve."
>
> You have mentioned the extended state variables, could you tell me more about this.
> I did not understand the applicability, how to use, where it could be used.
> Do you mean by the data variables of the state. What are those. I did not find any example
> using it. Could you point me to one. Perhaps I can go through and improve the logic.

Let me try to explain this with a couple of examples.  The examples
are simplified and a little bit contrived, but I think you can imagine
real-world cases that these examples are similar to.

Let me apologize in advance for the lengthy post:  I think it will
be  helpful to make the examples concrete and detailed.

Also, I will illustrate things with a very elementary pseudo-code
implementation of a state machine.  I won't bother with the Qt-style,
object-oriented state-machine approach where the specific state is
specified by a polymorphic pointer of type pointer-to-base-state that
points to a concrete instance of a class derived from base-state.

Let's say you have a situation where two tasks, A and B, have to be
completed before you can move on to something else, but it doesn't
matter in what order they are completed.  (For example, maybe two
asynchronous database queries have to complete.)  When the task are
done, you can collect a prize.

Your states might be:

enum states {
  NeitherDone,
  ADone,
  BDone,
  AAndBDone,
  PrizeDelivered
};

Let's say your events are:

enum events {
  ACompleted,
  BCompleted,
  PrizeRequested,
  Reset
};

When an event comes in you have a classic "double-dispatch"
problem -- that is, what you do depends upon two things (the
"double" in "double dispatch"): the state and the event.

To be very simple-minded about this, let's have our state machine
be (an instance of) a class, and use nested switch statements for
the double-dispatch event processing:

class state_machine {
  state_machine() : current_state (NeitherDone) {}
  states current_state;
  process_event (events e) {
    switch (current_state) {
      case NeitherDone:
        switch (e) {
          case ACompleted:  return ADone;
          case BCompleted:  return BDone;
          case Reset:  return NeitherDone;
          default:  throw ("illegal event");
        }
      case ADone:
        switch (e) {
          case BCompleted:  return AAndBDone;
          case Reset:  return NeitherDone;
          default:  throw ("illegal event");
        }
      case BDone:
        switch (e) {
          case ACompleted:  return AAndBDone;
          case Reset:  return NeitherDone;
          default:  throw ("illegal event");
        }
      case AAndBDone:
        switch (e) {
          case PrizeRequested:  return PrizeDelivered;
          case Reset:  return NeitherDone;
          default:  throw ("illegal event");
        }
      case PrizeDelivered:
        switch (e) {
          default:  throw ("illegal event");
        }
      default:  throw ("oops -- bad state");
    }
  }
  void process_events() {
    while (true) {
      event e = getEvent();
      process_event (e);
    }
  }
};

This simple example illustrates some of the classic benefits of
state machines.  It makes very clear when and when not certain
events are allowed -- for example, you can only have one ACompleted
event (unless there is an intervening Reset event).  And, while
the future evolution of the state machine depends on the past
history of the events is has received, the state machine "remembers"
only that aspect of the past history that actually matters and
"forgets" any irrelevant detail.

Specifically, in order to be able to able to collect your prize
(PrizeRequested event), both ACompleted and BCompleted have to
have occurred, but it doesn't matter in which order they occurred.
The state AAndBDone "remembers" that both ACompleted and BCompleted
have occurred, but explicitly "forgets" in which order they occurred.

*Everything* you need to know about how the state machine will
respond to future events is contained in the "state" of the
state machine, and this state is encoded entirely in the single
state variable, current_state.

(Of course, a problem this simple could easily be solved with
old-fashioned if-then programming, and for a problem this simple
I probably wouldn't bother with a formal state machine.)

Now let's add some more tasks to the problem.  The *logic* of
problem will stay the same and remain equally simple, but the
detailed structure of the state machine will become more complex.

Let's say to collect the prize you also have to complete tasks
C and D.

The events don't get much worse:

enum events {
  ACompleted,
  BCompleted,
  CCompleted,
  DCompleted,
  PrizeRequested,
  Reset
};

That's legitimate.  The size of the events enumeration grows
like the number of distinct events.  That's perfectly reasonable,
and how could you do better?

But the states get significantly worse:

enum states {
  NoneDone,
  ADone,
  BDone,
  CDone,
  DDone,
  AAndBDone,
  AAndCDone,
  AAndDDone,
  BAndCDone,
  BAndDDone,
  CAndDDone,
  AAndBAndCDone,
  AAndBAndDDone,
  AAndCAndDDone,
  BAndCAndDDone,
  AllDone,
  PrizeDelivered
};

The number of states grows exponentially in the number of tasks.
Here we have 16 = 2^4 states (plus the additional PrizeDelivered
state).  This is the "explosion of states" that often afflicts
state machines.  There are some purists who say that this is how
you should design a state machine: that there should be a single,
well-defined state variable (current_state, in this example) and
that the entire state should be encoded in it.  They would also
say that the disadvantages of the "explosion of states" are outweighed
by keeping the design of the state machine "pure."

Needless to say, I disagree that one should take pure design to
this extreme.

This is where "extended state data" comes in.

We still want to package everything we need to know about how
the state machine will respond to future events in something
we will denote the "state" of the state machine.  The state
will now consist of the state variable (current_state) and all
additional "extended state data."

Our new state values are:

enum states {
  NotAllDone,
  AllDone,
  PrizeDelivered
};

Our extended state data consists of the following boolean variables:

bool ADone;
bool BDone;
bool CDone;
bool DDone;

Our state machine becomes:

class state_machine {
  state_machine() :
    current_state (NotAllDone),
    ADone (false),
    BDone (false),
    CDone (false),
    DDone (false)
  {}

  // total state  =  state variable  +  extended state data
  states current_state;  // state variable
  bool ADone;  // extended state data
  bool BDone;  // extended state data
  bool CDone;  // extended state data
  bool DDone;  // extended state data

  process_event (events e) {
    switch (current_state) {
      case NotAllDone:
        switch (e) {
          case ACompleted:
            if (ADone)  throw ("illegal event");
            ADone = true;
            if (ADone && BDone && CDone && DDone)  return AllDone;
            return NotAllDone;
          case BCompleted:
            if (BDone)  throw ("illegal event");
            BDone = true;
            if (ADone && BDone && CDone && DDone)  return AllDone;
            return NotAllDone;
          case CCompleted:
            if (CDone)  throw ("illegal event");
            CDone = true;
            if (ADone && BDone && CDone && DDone)  return AllDone;
            return NotAllDone;
          case DCompleted:
            if (DDone)  throw ("illegal event");
            DDone = true;
            if (ADone && BDone && CDone && DDone)  return AllDone;
            return NotAllDone;
          case Reset:
            ADone = BDone = CDone = DDone = false;
            return NotAllDone;
          default:  throw ("illegal event");
        }
      case AllDone:
        switch (e) {
          case PrizeRequested:  return PrizeDelivered;
          case Reset:
            ADone = BDone = CDone = DDone = false;
            return NotAllDone;
          default:  throw ("illegal event");
        }
      case PrizeDelivered:
        switch (e) {
          default:  throw ("illegal event");
        }
      default:  throw ("oops -- bad state");
    }
  }
  void process_events() {
    while (true) {
      event e = getEvent();
      process_event (e);
    }
  }
};

That's the basic idea of extended state data.

What benefits of a "pure" state machine do we keep?  The transitions
we make in response to events (the "topology" of the state machine)
still depend solely on the specific event and the "total state" of
the machine.

What do we lose?  The state of the machine (the "total state") is no
longer encoded in a single "state variable," but also includes the
"extended state data."

We still perform an explicit double dispatch on the event and the
state variable (again implemented with nested switch statements),
but the full total-state-dependent dispatch now requires (in this
design) if-then code that inspects the extended state variables.

What do we gain?  The number of formal states (values of the
state variable) is now constant, and no longer grows with the
number of tasks to complete.  Of course, we now need an extended
state variable for each task.  However, the number of extended
state variables, *and* the total complexity of the state machine,
e.g., the length of the dispatch code, only grows linearly in
the number of tasks.

(As implemented, the length of the dispatch code actually grows
quadratically in the number of tasks:  There is a case label for
each task-completed event, and in each case block we inspect all
of the task-done extended state variables.  But with a little
rearranging and collapsing most of the case blocks together, we
could have had the dispatch code grow only linearly with the number
of tasks.)

As I mentioned in my earlier post, how best to divide your "total state"
between the "state variable" and your "extended state data" is a question
of making various design trade-offs for which there is no single "right"
answer.

You might play with this example a little more by asking how you would
modify it if the rule were that you must complete each of tasks A, B, C,
and D, say, five times before you can collect your prize.  How many
distinct states would you need if you didn't use extended state data?

> Regards,
> Srikanth


Good luck, and Happy New Year Hacking!


K. Frank



More information about the Interest mailing list