[Development] Two-digit dates: what century should we use ?

André Somers andre at familiesomers.nl
Wed Nov 6 17:20:39 CET 2019


Hi,

On 05-11-19 14:44, Edward Welbourne wrote:
> Hi all,
>
> Prompted by [0], I'm looking at what century to use for years, when the
> text being read is expected to be in a "short format" that only includes
> two digits.
> * [0] https://bugreports.qt.io/browse/QTBUG-74323
>
> tl;dr - how do folk feel about (in Qt 6) a century-wide window, ending a
> decade or three ahead of QDate::currentDate(), and placing any two-digit
> year in that range ?
>
> Before anyone says "Don't Do That" (or "why would anyone use two-digit
> years after the mess of y2k ?"), bear in mind that CLDR (the Unicode
> consortium's common locale data repository, on which QLocale's data is
> based) provides short date formats, many of which use two-digit years.
>
> We currently fail to round-trip dates via such formats because 1900 is
> used as default year when no year is specified and (thus) 19 is used as
> default century number when only the later digits are (understood to be)
> specified.  As we get further into the twenty-hundreds (as it were), this
> shall grow to be an increasingly jarring flaw in date format handling.
>
> I'm considering changing that: since it's a material behaviour change,
> it clearly needs to happen as part of Qt 6, which at least gives me a
> few months to discuss it and see what folk think is a better plan than
> what we have.
>
> It's notable that ECMAScript's Date constructor adds 1900 to any year
> number from 0 through 99 (even if supplied as one of a sequence of
> integer arguments, not a string), causing problems for the
> representation of dates from 1 BCE through 99 CE.  (I must remember to
> tease my friend on the ECMA 262 committee about that - his excuse will
> be that it was copied from an early version of Java, I suspect - and see
> if he can coax them into changing it.)  Likewise, C's struct tm (used by
> mktime and friends) has a 1900 offset on its year number: that's
> probably never going to change, perverse as it is and shall increasingly
> be.
>
> Folk still talk about "The fifties" and mean the 1950s; probably
> likewise the forties, thirties and even twenties.  That last, at least,
> shall soon be something of a problem.  Folk can see more of the past
> than of the future, so perhaps it's not much of a surprise that common
> nomenclature reserves short phrases for the past at the expense of the
> future: "The sixties" shall be in the past for a few decades yet, I
> think.  So rather than having a default century, and maybe changing it
> abruptly to 20 at some point in the next fifty years, I think it would
> be better to have two-digit years coerced into a century-wide window
> about the (forever moving) present.
>
> Perhaps we should make that a narrower window and treat roughly a decade
> near the wrap-around as error - e.g. using 1945--2035 as our year range,
> with two-digit years 36 through 44 treated as undecodable.
>
> The question then arises: what year-range should we use ?
>
> Two things I'm fairly sure should be true are:
> * the current year (i.e. QDate::currentDate().year(), naturally) should
>    be included in the range;
> * the range should be contiguous.
>
> So the interesting questions are:
> * how far into the past and future should the range reach ?
> * how wide a buffer (if any) should we leave ?
>
> If we don't have a buffer, my inclination is to put the transition date
> at a decade boundary, e.g. 49 -> 2049 but 50 -> 1950, as this shall feel
> less perverse to most folk than having a mid-decade transition such as
> 44 -> 2044 but 45 -> 1945.  However, with a buffer, this problem goes
> away, as there aren't adjacent two-digit numbers that map to wildly
> different years; instead, the intervening numbers that aren't handled
> make the discontinuity seem more sensible.  In principle a one year
> buffer would suffice, but I'm inclined to make the gap a decade long, or
> more, if we have one.
>
> If QDate::currentDate().year() is C and (C / 10) * 10 is D, either of
> these ranges strikes me as better than the 1900--1999 that we're
> currently using:
> * D -70 <= year < D+30 (all two-digit values handled)
> * C -65 <= year <= C +25 (other two-digit values rejected)
>
> So, to my questions:
> * Does anyone want to make the case for keeping 1900--1999 as range ?
> * Has anyone a better suggestion for how to choose a rolling range ?
> * Should we have a buffer ?  If so, how wide ?
> * How far into the past and future should the range reach ?
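
For concreteness, the two candidate ranges quoted above could be sketched
roughly as follows. This is only an illustration, not existing Qt code:
the function names are made up, the current year is passed in as a plain
int (in real code it would come from QDate::currentDate().year()), and
the `bool *ok` out-parameter is just one possible way to report a
rejected value.

```cpp
#include <cassert>

// Hypothetical helper: map twoDigit onto the unique year in
// [first, first + 99] whose last two digits match. Assumes first > 0.
static int yearInCenturyFrom(int first, int twoDigit)
{
    assert(twoDigit >= 0 && twoDigit <= 99);
    // Distance from first's last two digits, wrapped into 0..99.
    const int offset = ((twoDigit - first % 100) % 100 + 100) % 100;
    return first + offset;
}

// Unbuffered variant: D - 70 <= year < D + 30, where D = (C / 10) * 10.
// Every two-digit value is accepted.
int resolveDecadeWindow(int twoDigit, int currentYear)
{
    const int decade = (currentYear / 10) * 10; // D
    return yearInCenturyFrom(decade - 70, twoDigit);
}

// Buffered variant: C - 65 <= year <= C + 25. The nine two-digit values
// that would land beyond C + 25 are rejected; *ok reports which case applies.
int resolveBufferedWindow(int twoDigit, int currentYear, bool *ok)
{
    const int year = yearInCenturyFrom(currentYear - 65, twoDigit);
    *ok = (year <= currentYear + 25);
    return *ok ? year : 0; // 0 is only a placeholder when *ok is false
}
```

With the current year taken as 2020, the unbuffered variant maps 49 to
2049 and 50 to 1950, which is the decade-boundary transition described
in the quoted text.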


I came to the conclusion that the sane behavior for interpreting dates 
depends on the semantics of what the date means. For instance, a birth 
date will always be a date in the past, while a date for an appointment 
would normally be a date in the future. That alters the interpretation 
of the date. May I suggest adding an enum argument to any function that
converts a string to a date, so that the caller can indicate the kind
of date that is expected?

I think it would come in four flavors: 1) a default `aroundCurrentDate`
value (or something like that), which would place the year in a window
running from the current year - 50 to the current year + 50; 2) a
`pastDate` value, which would place it somewhere in the past 100 years;
3) a `futureDate` value, which would place it somewhere in the next 100
years; and 4) a `legacy1900` value, which would revert to the current
behavior and just add 1900.
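
A minimal sketch of what such an enum argument might look like, to make
the four flavors concrete. Everything here is hypothetical: the type and
function names are invented, the current year is passed in as an int
rather than read from QDate, and the exact window edges are my
assumptions (a window from -50 to +50 spans 101 years, so one end has to
give; I picked C-50 through C+49, with the current year counted in both
the past and the future windows).

```cpp
#include <cassert>

// Hypothetical enum mirroring the four flavors described above.
enum class TwoDigitYearHint {
    AroundCurrentDate, // assumed window: C - 50 .. C + 49
    PastDate,          // assumed window: C - 99 .. C
    FutureDate,        // assumed window: C .. C + 99
    Legacy1900         // current behavior: always 1900 + value
};

// Hypothetical resolver; currentYear stands in for
// QDate::currentDate().year() and is assumed to be well above 100.
int resolveTwoDigitYear(int twoDigit, int currentYear,
                        TwoDigitYearHint hint = TwoDigitYearHint::AroundCurrentDate)
{
    assert(twoDigit >= 0 && twoDigit <= 99);
    int first = 1900; // earliest year of the 100-year window
    switch (hint) {
    case TwoDigitYearHint::AroundCurrentDate:
        first = currentYear - 50;
        break;
    case TwoDigitYearHint::PastDate:
        first = currentYear - 99;
        break;
    case TwoDigitYearHint::FutureDate:
        first = currentYear;
        break;
    case TwoDigitYearHint::Legacy1900:
        return 1900 + twoDigit;
    }
    // Pick the year in [first, first + 99] ending in twoDigit.
    const int offset = ((twoDigit - first % 100) % 100 + 100) % 100;
    return first + offset;
}
```

Taking 2019 as the current year, the value 85 would come out as 1985
for a birth date (`PastDate`) and as 2085 for an appointment
(`FutureDate`).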

Cheers,

André

> 	Eddy.
> _______________________________________________
> Development mailing list
> Development at qt-project.org
> https://lists.qt-project.org/listinfo/development

