[Development] [Qt5-feedback] A micro API review: for V3(md5) and V5(sha1) in QUuid
joao.abecasis at nokia.com
Fri Dec 23 11:47:40 CET 2011
Denis Dzyubenko wrote:
> 2011/12/9 João Abecasis <joao.abecasis at nokia.com>:
> >> inline QUuid QUuid::createFromName(const QUuid &ns, const
> >> QString &name)
> >> {
> >> return createFromName(ns, name.toUtf8());
> >> }
> >>
> > would only be updated to call the right implementations, as
> > appropriate.
>
> I like the current status of the patch very much.
>
> However I have one question - where utf8 comes from? Shouldn't it be
> defined by rfc, and if not imo we shouldn't arbitrary choose
> encodings, and maybe leave the default one in - which is utf-16 for
> QString
This is my reasoning:
1) As you mention, the RFC doesn't specify encodings. In fact, it says the owner of a namespace is free to decide how it should be used. For this reason it's important that we support QByteArray as the canonical form and let users make conscious decisions.
2) In Qt, strings of text are represented as QString, so it would be nice to support QString-based names. This is the reason for adding those overloads as convenience API, but it doesn't tell us how QString-based names should be translated to "a canonical sequence of octets" (quoting the standard).
3) The point of name-based UUIDs is that you can regenerate the UUIDs knowing only the namespace UUID and a particular name. If you use the QByteArray version, it's up to you to ensure this. When using the QString version, Qt needs to ensure it for you.
This excludes locale- and system-dependent conversions, like toLocal8Bit(). It also excludes straightforward utf16(), as it is dependent on endianness, and thus on the platform.
4) UTF-8 is a good candidate because it is one possible "canonical sequence of octets". But it's mostly that, a good candidate.
So, there isn't a reason why it *has* to be utf-8, but I haven't seen better alternatives. Other alternatives are toAscii or toLatin1, but they're lossy encodings. Network-byte order UTF-16?...
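To make points 3) and 4) concrete, here is a minimal Python sketch of how the choice of octets changes the resulting version 3 UUID. The helper name_uuid_v3 is hypothetical (it is not part of Qt or of Python's uuid module); it just hashes the namespace bytes followed by the name bytes with MD5, the way a name-based UUID is built. The same text encoded as UTF-16 in the two byte orders gives two different UUIDs, while UTF-8 gives one answer on any platform.

```python
import hashlib
import uuid


def name_uuid_v3(namespace, octets):
    # Hypothetical helper: MD5 over namespace bytes + name bytes,
    # then let uuid.UUID set the version (3) and variant bits.
    digest = hashlib.md5(namespace.bytes + octets).digest()
    return uuid.UUID(bytes=digest[:16], version=3)


name = "www.widgets.com"

# UTF-8 has no byte-order variants, so every platform hashes the same octets.
from_utf8 = name_uuid_v3(uuid.NAMESPACE_DNS, name.encode("utf-8"))

# Raw UTF-16 depends on byte order; the two orders hash to different UUIDs.
from_utf16_le = name_uuid_v3(uuid.NAMESPACE_DNS, name.encode("utf-16-le"))
from_utf16_be = name_uuid_v3(uuid.NAMESPACE_DNS, name.encode("utf-16-be"))

print(from_utf8)
print(from_utf16_le)
print(from_utf16_be)
```

A fixed byte order for UTF-16 would also be reproducible; the sketch only shows that "whatever the platform uses" is not.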
Anyway, one use case mentioned in the standard makes this convenience approach very nice:
QUrl url;
// ...
// NameSpace_DNS from RFC4122
// {6ba7b810-9dad-11d1-80b4-00c04fd430c8}
QUuid nsDns(0x6ba7b810, 0x9dad, 0x11d1, 0x80, 0xb4, 0x00, 0xc0, 0x4f, 0xd4, 0x30, 0xc8);
QUuid uuidForUrl = QUuid::createFromName(nsDns, url.toString());
With the added benefit that in that use case it interoperates with Python.
("And what does python do?", you ask. Well, it avoids the decision altogether and bails out on unicode strings. It only accepts byte strings:
$ python
Python 2.6.1 (r261:67515, Jun 24 2010, 21:47:49)
[GCC 4.2.1 (Apple Inc. build 5646)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import uuid
>>> uuid.NAMESPACE_DNS
UUID('6ba7b810-9dad-11d1-80b4-00c04fd430c8')
>>> uuid.uuid3(uuid.NAMESPACE_DNS, "www.widgets.com")
UUID('3d813cbb-47fb-32ba-91df-831e1593ac29')
>>> uuid.uuid3(uuid.NAMESPACE_DNS, u"www.widgets.com")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/uuid.py", line 512, in uuid3
    hash = md5(namespace.bytes + name).digest()
UnicodeDecodeError: 'ascii' codec can't decode byte 0xa7 in position 1: ordinal not in range(128)
)
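For comparison, the convenience overload discussed in this thread can be mirrored in Python as a rough sketch. The function create_from_name below is hypothetical and only borrows the name of the proposed Qt API: byte strings are hashed as given, and text is first encoded as UTF-8, which is the assumption under discussion rather than anything the RFC mandates.

```python
import hashlib
import uuid


def create_from_name(namespace, name):
    # Hypothetical analogue of the proposed QString overload:
    # text is converted to UTF-8 octets, bytes are used unchanged.
    if isinstance(name, str):
        name = name.encode("utf-8")
    digest = hashlib.md5(namespace.bytes + name).digest()
    return uuid.UUID(bytes=digest[:16], version=3)


from_bytes = create_from_name(uuid.NAMESPACE_DNS, b"www.widgets.com")
from_text = create_from_name(uuid.NAMESPACE_DNS, "www.widgets.com")

print(from_bytes)
print(from_text)
```

For this ASCII-only name, both calls should give the same UUID as the byte-string call in the session above, since UTF-8 and ASCII produce identical octets here.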
What do others think?
Cheers,
João