[Interest] TCP ACK with QDataStream -- OR: disconnect race condition detection
thiago.macieira at intel.com
Mon Sep 10 15:27:17 CEST 2012
On segunda-feira, 10 de setembro de 2012 14.53.31, Till Oliver Knoll wrote:
> > If TCP is reliable by doing ACKs, why the hell would you need to track
> > ACKs too?
> Because, since that information is around anyway, why not use it (apart from
> that it seems not accessible in real world implementations)?
Aside from statistics and curiosity, I don't see the point in accessing the ACK
information.
> As in "I want to sent a message of 1000 bytes. Unless the counterparty's
> Transport layer doesn't acknowledge the receipt of 1000 bytes, I assume
> something went wrong (and take action by re-sending the message after some
> while in an "at least once" scenario)".
TCP will already resend it for you. So you're not resending it in the same
connection, you're talking about disconnecting and reconnecting, and hoping
that it will work better.
You could use TCP keepalives for this.
> That is, the application *wants* to know the number of bytes received by the
> counterparty, but doesn't want to apply an Application level protocol.
But it doesn't know how many bytes the peer has received in the Application
layer. It can only know the number of bytes received in the Transport layer.
That's the "statistics and curiosity" part I mentioned.
> For instance, the scenario could be that the "network" is the *only* concern
> to the application, or in other words: once the counterparty's Transport
> layer has ACK'ed the packets, you know (or rather: expect) that "all will
> be fine" (and if not, the counterparty just blew up and the connection will
> be closed and you'll know the next time you try to send a message).
That's a huge assumption. There are a number of reasons why a packet's
contents may never be delivered, including the system rebooting between the
ACK and the application reading the data. You're not accounting for that or
many other reasons.
Therefore, it's a bad assumption to make. Don't design applications like that.
> So I might turn around the question: "Why would you NOT want to track (or
> rather: get informed about) the ACK which is done by the underlying layer?"
Because it's none of your business. Doing that means your application depends
on the inner workings of the underlying layer and will not be portable to
other transport mechanisms (SCTP, Unix sockets, etc.). It might also imply bad
assumptions because TCP packets can be received out of order, with the peer
including the information of which further packets it has already received.
And because there's no easy way of getting the information anyway.
> And coming back to the OP's actual question: currently you can't figure out
> whether the last n bytes did get through or not. Either you check the
> connection *before* you start sending (in which case the connection might
> just break the moment you start sending), or you check *after* (some while)
> whether the connection is "still good" (in which case you still don't know
> for sure whether all, some, or none at all of your bytes have been sent and
> ACK'ed by the counterparty).
A simple "happens-before" relationship: if the peer does something in response
to a later message, then we know it has received the earlier message.
Note that "has received" does not imply "has acted". Some protocols are
stream-oriented and still do permit out-of-order actions.
> > It's redundant: if you get the ACK information, that's because TCP got it
> > too;
> ... and your application knows that at least the counterparty's Transport
> layer has received the data. If that's all you care about then you have
> gained "information".
My point is that no application in the real world will be satisfied with that
information alone.
> > if you don't get it, neither did TCP.
> But you can react upon it (after some timeout), whereas if you don't get
> this information at all you're left in the dark whether to re-send the last
> n bytes.
> That's why one might want to track those ACKs.
That's what TCP keepalives are for. Besides, you probably want to act on
entire messages, not N bytes.
If the application crashes, you will eventually get an RST. Knowing how many
bytes it received is pointless: since it crashed or was killed, its state was
inconsistent. You cannot depend on the fact that the N bytes it received were
acted on. And that's even if you could determine that it did, after all, read
the bytes that the Transport received.
If the application exits cleanly, you'll get a FIN. But if it exited cleanly
but unexpectedly, it's a violation of the Application protocol and clearly the
state was inconsistent. We go back to the case above.
If there's a network breakdown along the way, no packets will be received
either way. This is what a timeout is for, and TCP already has one. It
might be way too long for modern-day applications, but it exists.
If the peer reboots, then you'll either get an RST or the TCP keepalive will
detect the dead connection.
In any of those scenarios, knowing how many bytes the remote's TCP stack
received is pointless.
> Assume you're now on some device with a dead slow network connection and you
> want to upload some data. Additionally you want to show some progress bar
> how many bytes have already been received (hint! hint!) by the
> counterparty. Let's further assume the data fits fully into the send buffer
> (which is on its turn much bigger than the packet size).
> Let's have a look at a first naive implementation: you would use
> QIODevice::bytesWritten to update the progress bar and -bang!- you're from
> 0 to 100% in no time! Because we just filled the send buffer, but maybe
> did not even send a single byte onto the wire just yet!
> Now if we *had* the information about the *received bytes* (note again: here
> it is totally irrelevant what the receiver does with those bytes!) we could
> of course update the progress bar in a much more useful manner.
Right. But you could also update it based on the number of TCP packets sent
but still not acknowledged.
And in your case, given modern TCP, you might *still* jump from 0 to 100%.
Suppose that the first PSH packet is lost, but all others are received. The ACK
count is still at 0, even though the peer has received about 95% of the data.
Upon retransmission of that single packet, the remote ACKs the full message.
Then your progress bar will jump from 0 to 100%.
And *even* if this convoluted scenario doesn't happen, it might *still* jump
due to delayed ACKs. The remote does not need to ACK every single TCP packet.
As long as it ACKs byte N, we know it received 0..N.
> Now you could still argue that this should be solved by an application level
> protocol, e.g. the receiver should send back the number of bytes it has
> processed or whatever. But a) the receiver's code might not be ours (we
> cannot modify it) and b) I might again turn around Thiago's question: "Why
> should the application duplicate what the underlying level is doing?"
If the bandwidth is THAT small (think of dial-up modems under 14.4 kbps), the
overhead of notifying that the packets are received would probably be
significant.
If all you want to know is exactly what TCP already knows, then double
bookkeeping would be unfortunate. My point is not that. My point is that
relying on the TCP ACKs can only be done with assumptions that the application
should not be making in the first place.
> And all of a sudden it *seems* that the information about the received bytes
> is not *that* useless, after all, is it?
> (By the way, that progress bar with a slow connection was exactly the
> initial motivation from the previous Stack Overflow question I referred to
> in my previous post).
Every buffer introduces a delay. See "buffer bloat".
Thiago Macieira - thiago.macieira (AT) intel.com
Software Architect - Intel Open Source Technology Center
Intel Sweden AB - Registration Number: 556189-6027
Knarrarnäsgatan 15, 164 40 Kista, Stockholm, Sweden