[Interest] TCP ACK with QDataStream -- OR: disconnect race condition detection
Till Oliver Knoll
till.oliver.knoll at gmail.com
Mon Sep 10 10:08:22 CEST 2012
Am 09.09.2012 um 20:01 schrieb d3fault <d3faultdotxbe at gmail.com>:
> A random guess: does bytesWritten() get triggered by a TCP ACK sent by
> the _receiver's_ internal buffer (regardless of whether my code has
> read it in or not yet (if I just ignore the readyRead()))?
Yes! Our professor once told us that "you can show off when you mention the ISO OSI model". Unfortunately I have forgotten most of it by now ;)
Anyway, think of OSI as a layered cake: once one level (the "physical level", ..., the "transport level", ..., up to the "application layer") has done its duty it passes the result, including error/success codes, up to the next layer (receiver side) - or downwards (sender side).
That said, what you're observing is that on the "Transport" level (the TCP part is roughly located there) the data is received - both messages! - and the total amount of bytes is ACK'ed. So as far as your sender - on the TCP/Transport layer - is concerned everything is okay, and that's what you get (correctly) reported on the Application layer on your sender.
What the receiver of the two messages does on its Application layer is entirely up to it! It might as well completely ignore those messages.
Likewise, when on the sender's side you pass the data down from the Application layer via QDataStream (based upon a socket), the result you get is essentially: "Yes, the Transport (TCP) layer has accepted the bytes for sending (SUCCESS) - they WILL be sent sometime LATER (or not - check the result yourself later)".
That's due to the asynchronous nature of the Qt Application layer, as you've already noticed: at the point of "passing down the bytes" via QDataStream there is no way of telling - neither for your application, nor for Qt - whether those bytes will eventually be delivered successfully or not ("we'll see later about that, when we actually start sending and expecting ACKs from the receiver").
However I agree that QDataStream should fail (bytes written: -1) if there is no underlying connection at all. Not sure anymore what your observed result was in this case...
You mentioned in your previous post something along the lines of "At Least Once", "At Most Once" and "Exactly Once" semantics (the number of times a "message" should arrive).
I'm no network expert and, as I said, it's been a while since my network lectures ;), but from what I remember it is *not* enough to rely on a reliable connection to implement all of the above semantics: you need some sort of "Application Protocol" on top of the entire network cake.
For instance "At Least Once" is easy: you just keep sending until you receive an ACK from your counterparty's *application* layer. Note that it is *not* enough to rely on the Transport layer: TCP might have acknowledged the message, but the application might have been busy doing something else, did not expect that message right now, was temporarily out of resources to handle it, etc. So you really need an "ACK" from the counterparty's Application level - at least one!
Also note that the Transport layer (TCP) doesn't know anything about the content of the messages being delivered: they're just a bunch of bytes (+checksums, metadata etc.). So the Application layer also needs some sort of "message IDs" to tell the messages apart.
"At Most Once" is the easiest: you send it once (and don't care anymore), and the receiver gets it - or it doesn't. TCP - or even UDP - alone is absolutely sufficient, no need for an additional Application protocol.
"Exactly Once" is the hardest: the receiver sends an "ACK" (again: on the Application layer) once it has processed the message. If the sender doesn't get one, it simply sends the message again after some timeout (possibly hours, in case the network is temporarily down - but the message MUST get through eventually!).
But what if just the ACK got lost? The receiver gets a duplicate and must be able to identify it as such and discard it (but still send an ACK).
There are probably dozens of other scenarios which could fail ("message ID overflow"?) and practical implementations might rely on "statistical faith".
But the bottom line of all this is: if you really want to make your networking reliable, then you need to add some sort of protocol in your Application layer anyway.
I dare say that for most desktop applications simply relying on TCP is good enough. Even in mobile applications, where reliability of the network is a much bigger issue, the user will probably abort the operation before any of your timers time out ;)
(Or in other words: the asynchronous nature of Qt and its abstraction of the underlying network are a perfect match.)
For all other cases I recommend a good networking book, and you'll probably want to get your hands dirty by working directly with sockets, too.