[Development] Proposing QUIP-23: Qt-Security header in source code files

Thu Jul 11 16:39:01 CEST 2024

> On 11 Jul 2024, at 15:51, Giuseppe D'Angelo <giuseppe.dangelo at kdab.com> wrote:
> 
> On 11/07/2024 15:21, Volker Hilsheimer wrote:
>> For many APIs, application code provides the data (perhaps indirectly),
>> e.g. to QDateTime::fromString. In that case we can assume that the
>> application had at least some chance to scrub the input, or at the very
>> least control where that string comes from (perhaps a file on disk). For
>>  other APIs, Qt processes the data without the application seeing it
>> (eg. network protocol, loading an image etc from file).
> 
> I'm not too sure I appreciate the difference here. Either the input is trusted (= the onus of validating it, if any, is on the application / system side), or it is not (= Qt can't assume anything about it and must validate it).

To the builder of the application the difference is there:

“you can throw anything at QImage::loadFromData and Qt will not misbehave” (and if it does, then it’s our problem and has to be treated as a security issue)

vs

“the input to QString::asprintf has to be valid or the behavior is undefined” (and if things crash, you have to fix it, and it’s not a security issue in Qt).

In your words: can Qt trust the input that it gets from the application? I think the answer is “sometimes”: it won’t trust the input to loadFromData, because Qt uses that function internally; but it will trust that the input to asprintf is valid, and won’t perform any validation.

Or more prominent examples, because those regularly cause discussion: Qt trusts all data a QDataStream operates on, or that the QML engine receives, so the application developer is responsible for making sure that the input comes from a trusted source. But Qt doesn’t trust SVG input, so application developers can assume that any SVG that breaks Qt SVG is our problem to fix.
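
To make the two contracts concrete, here is a minimal sketch in plain C++ (no Qt). The toy record format and both function names are hypothetical and exist only for illustration; this is not how any Qt API is implemented. The first function accepts arbitrary bytes and rejects malformed data, the second relies on the caller to honor a precondition.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical toy format: one length byte followed by that many payload bytes.

// "Untrusted input" contract: any byte sequence may be passed in. Malformed
// data is rejected with std::nullopt instead of being read out of bounds.
std::optional<std::string> readRecordUntrusted(std::string_view data)
{
    if (data.empty())
        return std::nullopt;
    const auto length = static_cast<unsigned char>(data[0]);
    if (data.size() - 1 < length)
        return std::nullopt;
    return std::string(data.substr(1, length));
}

// "Trusted input" contract: the caller guarantees a well-formed record.
// The precondition is only asserted in debug builds; violating it in a
// release build is undefined behavior and a bug in the calling code.
std::string readRecordTrusted(std::string_view data)
{
    assert(!data.empty());
    const auto length = static_cast<unsigned char>(data[0]);
    assert(data.size() - 1 >= length);
    return std::string(data.data() + 1, length);
}
```

A crash caused by feeding garbage to the first function would be a defect in the library; the same crash in the second would be a defect in the application.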

>> To document the respective expectations and responsibilities on a higher
>> level, we need to start with understanding and documenting what the code
>> does. The header helps us with that, and at the same time enables some
>> degree of automation.
> 
> Fair enough, but then I'd kindly ask to reframe this discussion with this in mind; that is, this isn't about "security" in general, it's about untrusted inputs. I'm not sure what buzzword to use here, though.

I’m not sure if finding another word, with or without buzz, is useful. If there are other attack vectors than “throw maliciously formatted data at Qt” to think about in this context, and if those also manifest in the code in some form, then we’d like to use the same tool.

So over time, won’t this be “security in general”? That it’s about cybersecurity aspects that manifest in the code of Qt (rather than social engineering attacks to get approval rights) seems somewhat obvious from the nature of the solution, doesn’t it?

> So what is the plan of action?
> 
> * Define what "external inputs" are?
> * Identifying code in Qt that processes such external inputs?

We have to some degree already started with that when establishing clearer rules for our third party dependencies:

https://wiki.qt.io/Third_Party_Code_in_Qt

> * Figure out whether such code deals with trusted or untrusted inputs, and add relevant notes in the documentation (where?)?

The plan is to start with tagging the code, as proposed in the QUIP. From there, we can build towards user-facing class and API documentation (the file with the relevant code might be an implementation detail that impacts several public APIs).

> * If it's untrusted, figure out whether Qt is directly responsible for parsing the input, or if Qt is just offloading it to a 3rd party (e.g. image formats), or possibly both?

Does that matter? 3rd party code that we pass data through to is just an implementation detail.

> * Tag all the files that contain such code according to some schema?
> * (Possibly, refactor the code in separate .cpp files to isolate it, so that the tagging can be "accurate"?)

Indeed, that could be one useful application of the tagging.

> * Check that we have fuzzing, ubsan, etc. enabled on Qt code that parses untrusted inputs?

That’s the mid-term goal, yes.

Volker