[Interest] Operator QMap<uint, uint>[] is casting to int?

Roland Hughes roland at logikalsolutions.com
Sat May 11 18:02:34 CEST 2019

On 5/11/2019 5:00 AM, Thiago Macieira wrote:
> On Friday, 10 May 2019 04:17:03 PDT Roland Hughes wrote:
>>> Data is different. std::byte is unsigned and "unsigned char" is the actual
>>> definition of byte. QByteArray should actually get an API to treat its
>>> contents as bytes, not just chars.
>>> But we weren't talking about data, we were talking about metadata: sizes,
>>> indices, offsets.
>> I was talking about the devices needing the sizes sent in unsigned
>> octets. Sorry if there was any confusion. The static casts most often
>> occur here.
> Only if we're talking about sizes between 0 to 255. That's what a single
> unsigned octet can represent. Doesn't strike me as very useful.
> If you meant transmitting sizes as unsigned values over the network or medium,
> that's fine, since negative counts are physically meaningless. The bit pattern
> of any size in signed and unsigned is the same, so the choice between them is
> irrelevant. You may as well treat the protocol field as signed.

No, we are talking about numerous different size fields which are 
multiples of unsigned octets in size. Much of this hardware is 
non-Intel and some isn't even married to 8-bit bytes. To make the topic 
accessible to those who only know Intel-based hardware we use the term 
octet. The size field is some multiple of an octet.

We aren't talking about transmitting via network or some other medium. 
We are talking about 3rd to 5th party hardware being supplied which is 
also being developed for other projects. We aren't allowed to have 
direct contact with the firms (in some cases we do not even know who 
they are) but the hardware will exist on board. Some of it will have its 
own processor. Some devices communicate via shared on-board memory, 
others via writing to a port.

In all cases the vendors developing said hardware provide C header files 
providing the message/data content which must be used because the device 
coupled with this exact code is what went through regulatory approval. 
Changing even one data type in one structure from unsigned to signed 
means YOU have to go through full regulatory approval and/or testing. 
Some of that testing involves a clinical trial which can run years.

All C/C++ code compiled under a flavor of Linux is required to at a 
minimum use

"-Wall -Wextra -Werror"

This leaves developers in said environment the following options:

1) Don't buy/use Qt.

2) Create a central library function called from _everywhere_, hoping 
nobody notices.

3) Don't buy/use Qt.

4) Create a wrapper class for each Qt container just to get a uSize().

5) Don't buy/use Qt.

6) Scatter static_cast<>() all over the place and generally fail review.

7) Brute force via memmove() instead of using an assignment operation 
and _hope_ formal regulatory review doesn't call you on it.

I put the "Don't buy Qt" in there multiple times not to be a jerk but 
because it is an option I'm seeing companies increasingly take, 
including companies which already have products using Qt in the field. 
Jumping through the hoops simply isn't worth it anymore.

> I understand your argument that the other side may have specified that the
> field is unsigned. That's fine. But my argument is that all values between 0
> and 9223372036854775807 have the exact same bit pattern in both signed and
> unsigned 64-bit.

Yes they do, until someone "gets cute." Actually it takes at least 2-3 
levels of "cuteness."

To get around being failed for too many static_cast<>() calls, usually 
after the first approval failure, they create a union datatype.

union {
    int32_t size;
    uint32_t usize;
} bad_idea;

They'll even typedef it to hide the thing further. Everywhere they 
receive an int from a size/length/index/whatever they stuff it into the 
size field, and when they have to communicate with the onboard device 
they use usize. Automated testing, which isn't testing at all, doesn't 
execute one or more paths where something was missing and size got a -1. 
Say some config information didn't load, so the drug concentration 
string was missing or incomplete and indexOf() returned -1 instead of 0 
and a bool indicating failure. Maybe they actually checked for the 
negative one at the time, but they didn't clean up the mess. When it 
came time to tell the infusion pump (or whatever) how much to pump, they 
just took the usize value and thumped it into the control packet.

That's a biiiig number. Way more than the 60 mL the patient was supposed 
to get. Maybe you are lucky and the cassette only had 60 mL in it and 
when the pump ran dry it shut the device down. Maybe that's a Chemo pump 
where it was supposed to intermittently spurt a dose into the patient 
over the next 1-2 weeks when they were sent home with the fanny pack and 
the pump just gave them the whole cassette in one shot. They died. All 
forms of Chemo are some form of poison. In small doses it mostly kills 
what it is supposed to, in big doses it kills the patient.
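
A minimal sketch of that failure mode, assuming 32-bit fields and 
two's-complement hardware. The function name reinterpretAsUnsigned() is 
hypothetical; it uses memcpy() rather than the union itself because 
reading the inactive member of a union is undefined behaviour in C++, 
but the bits that land in the control packet are the same either way:

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the union trick: the bits of a signed
// 32-bit "size" are reused, unchanged, as an unsigned 32-bit value.
std::uint32_t reinterpretAsUnsigned(std::int32_t size)
{
    std::uint32_t usize = 0;
    // Copy the raw bytes; no range check happens anywhere on this path.
    std::memcpy(&usize, &size, sizeof usize);
    return usize;
}
```

A legitimate value such as 60 comes out as 60, which is why the problem 
hides so well. A leftover -1 comes out as 4294967295.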

> However, I can now think of one particular case where unsigned math may be
> useful: dealing with untrusted data (and network definitely is untrusted).
> Until you confirm that all values are in the expected range, you should keep
> it unsigned.
Trusted data is the most deadly.
> No, we do not "really really" need that.
Yeah, we "really really" do need it.
> Just because one field can't be negative does not mean that others can't. Take
> the example of a string / vector / bytearray size. As I said above, sizes
> cannot meaningfully be negative. But functions like indexOf() take an offset
> position that can be negative and the function can return a negative value
> indicating that the search failed. Using unsigned for size() is much more
> likely to cause you to need casts, for no appreciable gain since you cannot
> have more items in that vector or string than the maximum signed value.
And now we get to the root of the resistance. Long long ago in a 
programming galaxy far far away someone took a shortcut: rather than 
returning a "safe" value and a boolean, they returned a negative value 
to signal failure. We've been paying the price for that shortcut ever 
since.
>> Then we only have to jump through hoops for things like DICOM.
>> https://www.dicomstandard.org/faq/
>> ====
>> Following the (group number, data element number) pair is a length field
>> that is a 32 bit unsigned even integer that describes the number of
>> bytes from the end of the length field to the beginning of the next data
>> element.
>> ====
> This is fine, but since you can't operate on this much data on a 32-bit system
> without implementing windowed access through the data in your own code, I
> don't see it as an impediment. You will code your application to deal with a
> possible 4 GB distance by seeking in the file -- and remember off_t is signed!
> If you only allow your application on 64-bit systems, then qsizetype can
> represent all 32-bit unsigned values without loss of information.

It's not a matter of "operate on"; it is a matter of construction and 
deconstruction. You either have to not use Qt anywhere in your project 
or you have to static_cast<>() every size call for every QByteArray that 
contains each field/piece of the data. Every one of those has to be 
documented and defended. During formal regulatory review the magic 
failure threshold seems to depend almost as much on whether the head 
reviewer just got kicked out of the house by the spousal unit, or on the 
lunch they had, as on the actual number of casts.

Said failure happens after the device has been through full QA, 
sometimes after the first field trial, because you have to prove the 
device works before the review. Changes require doing (and paying for) 
all of that again.

This is why a big chunk of the medical device world is starting to turn 
its back on Qt.

unsigned usize( bool * ok = nullptr);

What you call ugly and a hack other environments call software engineering.

I agree. The indexOf() hack has been in Qt too long. The other hacks 
opting to return an unsafe -1 instead of a "safe" 0 and a bool have been 
there too long. You can't do it right without gutting the library and 
breaking lots of stuff in the field.

You can add methods which allow for good software engineering. For 
every one of the existing hacks you can add a "u" version which returns 
an unsigned value and, on failure, a 0 while setting the failure bool.
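
As a rough sketch of that pattern outside of Qt: the free function 
toUnsigned32() below is hypothetical and is not an existing Qt API. It 
assumes the device field is 32 bits wide and takes the signed value as a 
64-bit integer so that out-of-range sizes can be rejected as well as 
negative ones:

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper, not part of Qt: converts a signed size or index
// into an unsigned 32-bit value. On failure it returns a "safe" 0 and,
// if the caller supplied one, sets *ok to false.
std::uint32_t toUnsigned32(std::int64_t value, bool *ok = nullptr)
{
    const std::int64_t maxField =
        static_cast<std::int64_t>(std::numeric_limits<std::uint32_t>::max());
    const bool valid = (value >= 0) && (value <= maxField);
    if (ok)
        *ok = valid;
    // The single cast lives here, behind the range check.
    return valid ? static_cast<std::uint32_t>(value) : 0u;
}
```

A caller would pass the result of a size() or indexOf() call plus the 
address of a bool, and refuse to build the control packet when the bool 
comes back false. This is essentially option 2 from the list above; a 
"u" method on the container itself would move the same check into the 
library.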

Roland Hughes, President
Logikal Solutions
(630)-205-1593  (cell)
