[Development] std::format support for Qt string types

Ivan Solovev ivan.solovev at qt.io
Fri Jun 7 10:53:55 CEST 2024


Hi

> I think we should conceptually separate formatting from printing on a
> terminal. std::format isn't _just_ for printing on terminals

I agree. But the question of which encoding to use is still valid here.

> What do you mean by "readable" here?

I was mostly thinking about "readable in the terminal", because that's how
I did most of my tests. And from that point of view "readable" is very
close to what QString::toLocal8Bit() tries to do. So that's what I called
option 2 in my initial email.

> I'm not following this. If I do
>
>  std::format("{} {}", utf8string, latin1string)
>
> what am I supposed to get out? A string which is a mix of two different
> encodings? I don't think that's ever possibly wanted.

Yes, that's exactly what I mean. And, by the way, that's exactly how
std::format works today.
If you write something like this:

 std::string utf8{"\xC3\x84\xC3\x96\xC3\x9C"}; // ÄÖÜ in UTF-8
 std::string latin1{"\xC4\xD6\xDC"}; // ÄÖÜ in Latin1
 std::string buffer;
 std::format_to(std::back_inserter(buffer), "{} {}", utf8, latin1);

Then the resulting buffer will simply contain
"\xC3\x84\xC3\x96\xC3\x9C \xC4\xD6\xDC".

So, std::format does not care about the encodings.

> The concern I was quoting before is this: suppose that tomorrow we have
> a formatter for `const char16_t *` into char. This formatter does some
> kind of transcoding. Then QString(View) ought to do precisely the same!
> If we take a different decision now, we risk having compatibility
> problems down the line.

The standard in its current state does not care about transcoding (see the
utf8 + latin1 example above). So, if we do not go for option 3 at least
for QLatin1StringView and QUtf8StringView, we would already have
compatibility problems.

Basically, if we want to strictly follow the standard, we need to do
the following:

4. Provide only the formatters that are supported by the standard

Note that the standard currently does not allow mixing char and wide-char
strings, so something like this will not work:

 std::wstring wstr = ~~~;
 std::format("{}", wstr); // ERROR!

Option 4 would mean the following:
* std::formatter<QString(View), wchar_t>, which will simply call
 QString::toStdWString() and format the resulting std::wstring.
* std::formatter<QUtf8StringView, char>, which will simply format
 QUtf8StringView::data().
* std::formatter<QLatin1StringView, char>, which will simply format
 QLatin1StringView::data().
* All other combinations would not be implemented.

However, in my opinion such an approach does not add anything useful to Qt.

>> Now, I don't really know if formatting char16_t is anywhere on SG16's
>> radar in the short term, but that sounds definitely something to
>> investigate and report about, in order to make a more informed decision.
>
> Yup, I think we need to engage SG16 before continuing with this. We need
> char16_t to be enabled on the generic formatters, for one thing.

I must admit that I have no idea how this process works. How do we reach out
to them and ask whether they have any plans for char16_t support?
Even asking whether they have any plans for wchar_t -> char formatters
would be helpful.
Is anyone familiar with the process?

Best regards,
Ivan


------------------------------

Ivan Solovev
Senior Software Engineer

The Qt Company GmbH
Erich-Thilo-Str. 10
12489 Berlin, Germany
ivan.solovev at qt.io
www.qt.io

Geschäftsführer: Mika Pälsi,
Juha Varelius, Jouni Lintunen
Sitz der Gesellschaft: Berlin,
Registergericht: Amtsgericht
Charlottenburg, HRB 144331 B
________________________________
From: Development <development-bounces at qt-project.org> on behalf of Thiago Macieira <thiago.macieira at intel.com>
Sent: Thursday, June 6, 2024 7:08 PM
To: development at qt-project.org <development at qt-project.org>
Subject: Re: [Development] std::format support for Qt string types

On Thursday 6 June 2024 08:07:31 GMT-7 Giuseppe D'Angelo via Development
wrote:
> I'm not following this. If I do
>
>   std::format("{} {}", utf8string, latin1string)
>
> what am I supposed to get out? A string which is a mix of two different
> encodings? I don't think that's ever possibly wanted.

Agreed and I noted this in the initial implementation by Ivan, which
eventually led us here.

My opinion is that we should treat formatter-to-char as UTF-8 and therefore
copy the UTF-8 string literally and transcode the Latin1 string (preferably,
on the fly without temporary memory allocation, like we do for QStringBuilder).

> > Question here is how to deal with QString(View)?
> >
> >   3a. Convert it to UTF-8, because that's the pre-existing behavior which
> >       should be known for the users.
> >
> >   3b. Do not implement std::formatter<QString(View), char> at all and let
> >       the users explicitly convert QString to something else first.
> >
> > Option 3b is inconvenient and defeats the purpose of std::format support
> > for Qt types, so I'd personally prefer 3a here.
>
> The concern I was quoting before is this: suppose that tomorrow we have
> a formatter for `const char16_t *` into char. This formatter does some
> kind of transcoding. Then QString(View) ought to do precisely the same!
> If we take a different decision now, we risk having compatibility
> problems down the line.
>
> Now, I don't really know if formatting char16_t is anywhere on SG16's
> radar in the short term, but that sounds definitely something to
> investigate and report about, in order to make a more informed decision.
>
> (Not to mention formatting _into_ char16_t, which would unlock something
> like QString::format to *create* a QString!)

Yup, I think we need to engage SG16 before continuing with this. We need
char16_t to be enabled on the generic formatters, for one thing.

It might be that we're giving feedback 5 years too late, though.

--
Thiago Macieira - thiago.macieira (AT) intel.com
  Principal Engineer - Intel DCAI Fleet Engineering and Quality