[Development] Decrease amount of delivered src packages

Mathias Hasselmann mathias at taschenorakel.de
Thu Feb 16 00:11:49 CET 2017



On 15.02.2017 at 13:05, Dmitry Shachnev wrote:
> On Wed, Feb 15, 2017 at 12:28:11PM +0100, Kevin Kofler wrote:
>> I would kindly request you to at least use tar.xz (rather than tar.gz) for
>> the tarballs. (What you use as the Windows format is something you need to
>> sort out with the Windows people.) The fact that tar.gz is still the most
>> downloaded is probably mostly out of habit, or maybe your download site is
>> directing to them by default (which ought to be fixed anyway, even if you
>> were to keep both). tar.gz has no advantage over tar.xz, it is just a lot
>> larger. Switching to the tar.gz tarballs (from the tar.xz tarballs that are
>> currently used) would increase the size of distributions' source packages
>> (source RPM etc.) significantly.

If distros care about size, they can recompress.
And in my experience, they do.

>> It is sad that the legacy gzip compression is living a renaissance due to
>> automatic tarball exports from GitHub and the like producing only that
>> format. It should finally be retired now that there are algorithms that are
>> just as open and that compress significantly better. At least for projects
>> like Qt that produce their own tarballs and are already able to produce xz-
>> compressed ones, I see no reason whatsoever to switch back to the obsolete
>> gzip.
>
> +1, please leave tar.xz instead of tar.gz.
>
> Users of all modern UNIX-like systems are able to decompress tar.xz, so .gz
> has really no advantage over .xz.

That's a somewhat limited point of view. Yes, xz archives are smaller,
but to be honest: in the days of 4K video streaming, saving 100 MiB of
download size doesn't seem as important as it once was.

The actual value of gzip, and the reason for its return, seems to be its
significantly lower CPU usage. That reduces server load on heavily
utilized services like GitHub, and it shortens development round-trip
cycles.

Just to illustrate this, I've collected some numbers on my ThinkPad:

   Command | Average  | CPU time  | Archive | Download time
           | CPU time | saved per | size    | saved at
           | (mm:ss)  | build     |         | 100 Mbit/s
  ==========================================================
   gzip    | 00:44.19 | 09:54.58  | 469 MiB | 00:00.00
   gzip -9 | 01:55.58 | 08:43.18  | 465 MiB | 00:00.32
   xz      | 08:46.16 | 01:52.60  | 354 MiB | 00:09.20
   xz -9   | 10:38.77 | 00:00.00  | 333 MiB | 00:10.88

  (CPU time saved is relative to xz -9; download time saved is
   relative to plain gzip.)
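
For reference, the two savings columns can be recomputed from the CPU
time and archive size columns. The following is only a sketch: the
function names are made up for illustration, and it assumes the download
figures treat 1 MiB as 8 Mbit, which is what the numbers above appear to
do.

```shell
#!/bin/sh
# Sketch: recompute the "saved" columns of the table.
# Function names are hypothetical, chosen for illustration only.

# Download seconds saved versus a baseline archive.
# $1 = baseline size (MiB), $2 = archive size (MiB), $3 = link (Mbit/s)
# Assumption: 1 MiB is counted as 8 Mbit.
download_savings() {
    LC_ALL=C awk -v base="$1" -v size="$2" -v mbit="$3" \
        'BEGIN { printf "%.2f\n", (base - size) * 8 / mbit }'
}

# CPU seconds saved versus the slowest compressor.
# $1 = slowest CPU time (s), $2 = this command's CPU time (s)
cpu_savings() {
    LC_ALL=C awk -v slow="$1" -v t="$2" \
        'BEGIN { printf "%.2f\n", slow - t }'
}

download_savings 469 354 100   # xz vs. gzip: prints 9.20
download_savings 469 333 100   # xz -9 vs. gzip: prints 10.88
cpu_savings 638.77 44.19       # gzip vs. xz -9 (10:38.77 = 638.77 s)
```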

Is it worth spending 10 additional minutes per CI cycle just to save our
users a few seconds of download time?

Given the problems behind providing an official 5.8.1, and given my
personal experience with slow CI systems, I clearly vote for speeding
up the QA process and sticking with .zip and .tar.gz.

Besides: does it really make sense to fully test the expensive .xz and
.7z builds? In my opinion it would be sufficient to give full test
coverage only to the inexpensive .zip and .gz archives. At least the .xz
could be generated on the fly after uploading to the web server, simply
by recompressing the .gz:

   gunzip < qt-sources.tar.gz | xz -9 > qt-sources.tar.xz
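
If the .xz is derived this way, a cheap check that it carries the same
tar stream as the tested .gz might look like the following. This is a
sketch only: the function name recompress_and_verify and its optional
compression-level argument are made up for illustration, and it assumes
gzip, xz and cksum are installed.

```shell
#!/bin/sh
# Sketch: recompress a .gz archive as .xz, then confirm that both
# archives decompress to the same byte stream.
# $1 = path to the .gz archive, $2 = xz level (optional, default 9)
recompress_and_verify() {
    gz="$1"
    level="${2:-9}"
    xz_out="${gz%.gz}.xz"

    # Note: plain sh reports only xz's exit status for this pipeline.
    gunzip < "$gz" | xz "-$level" > "$xz_out" || return 1

    # Compare checksums of the two decompressed payloads.
    sum_gz=$(gunzip < "$gz" | cksum)
    sum_xz=$(xz -dc < "$xz_out" | cksum)
    [ "$sum_gz" = "$sum_xz" ]
}
```

It could be invoked as "recompress_and_verify qt-sources.tar.gz" and
returns non-zero if the two payloads differ.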

Ciao,
Mathias


