[Development] Decrease amount of delivered src packages
Matthew Woehlke
mwoehlke.floss at gmail.com
Fri Feb 17 17:20:20 CET 2017
On 2017-02-15 18:11, Mathias Hasselmann wrote:
> The actual value of gzip and the reason for its return seems to be its
> significantly lower CPU usage. This reduces server load on heavily
> utilized services like GitHub, and it shortens development round-trip
> cycles.
>
> Just to illustrate this I've collected some numbers on my Thinkpad:
>
> Command | Average  | Savings   | Archive | Savings
>         | CPU time | per build | size    | @ 100 Mbit/s
> =========================================================
> gzip    | 00:44.19 | 09:54.58  | 469 MiB | 00:00.00
> gzip -9 | 01:55.58 | 08:43.18  | 465 MiB | 00:00.32
> xz      | 08:46.16 | 01:52.60  | 354 MiB | 00:09.20
> xz -9   | 10:38.77 | 00:00.00  | 333 MiB | 00:10.88
>
> Is it worth spending 10 additional minutes per CI cycle just to save our
> users a few seconds of download time?
Um... yes? 10 min * once = 10 s * 60 users. Do you imply that fewer than
60 users use the .xz packages? Remember, that's not just *user* benefit,
that is also 10 s *per download* less load on the servers. Compression
happens once; downloads happen many times.
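For anyone who wants the break-even point with the table's exact figures rather than round numbers, here is a minimal sketch. It assumes the "xz -9" and "gzip" rows from the quoted table and the 100 Mbit/s download saving, and it ignores decompression cost. The function name is made up for illustration.

```python
def break_even_downloads(extra_compress_s: float, saved_per_download_s: float) -> float:
    """Number of downloads after which the one-time compression cost is repaid."""
    return extra_compress_s / saved_per_download_s


# Figures taken from the quoted table (xz -9 versus plain gzip).
xz9_cpu_s = 10 * 60 + 38.77   # 10:38.77 average CPU time for xz -9
gzip_cpu_s = 44.19            # 00:44.19 average CPU time for gzip
saved_per_download_s = 10.88  # transfer time saved per download at 100 Mbit/s

extra_compress_s = xz9_cpu_s - gzip_cpu_s
n = break_even_downloads(extra_compress_s, saved_per_download_s)
print(f"extra compression time: {extra_compress_s:.2f} s")
print(f"break-even after roughly {n:.1f} downloads")
```

On a slower link the per-download saving grows, so the break-even count drops accordingly.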
OTOH...
A better metric is decompression, which also happens many times. On one
of my machines, it takes¹ about 8.5 s to uncompress the .gz (to
/dev/null) and about 19 s to uncompress the .xz (also to /dev/null).
So... it does cost about 10 s more CPU time to uncompress the .xz. That
being the case, I'll grant that *if* you're on a sufficiently high-speed
network, maybe it doesn't make sense to download the .xz.
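As a rough sketch of where "sufficiently high-speed" begins: the .xz only wins if the transfer time it saves exceeds the extra decompression time. The calculation below assumes the default-level archive sizes from the quoted table (469 MiB for gzip, 354 MiB for xz), which may not match the packages actually shipped, and it combines them with the decompression times measured above.

```python
MIB = 1024 * 1024  # bytes per MiB

# Assumed archive sizes (default compression levels from the quoted table).
gz_size_mib = 469
xz_size_mib = 354

# Decompression times measured above (to /dev/null).
gz_decompress_s = 8.5
xz_decompress_s = 19.0

saved_bytes = (gz_size_mib - xz_size_mib) * MIB
extra_cpu_s = xz_decompress_s - gz_decompress_s

# Bandwidth at which the saved transfer time equals the extra decompression time.
break_even_bytes_per_s = saved_bytes / extra_cpu_s
break_even_mbit_per_s = break_even_bytes_per_s * 8 / 1e6

print(f"break-even bandwidth: about {break_even_mbit_per_s:.0f} Mbit/s")
```

Below that bandwidth the smaller .xz finishes sooner overall; above it, the .gz does. This treats download and decompression as strictly sequential, which is a simplification.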
(¹ I ran 20 trials each to reduce artifacts from caching, etc. I got
very consistent times, so I have high confidence that these numbers are
reasonable.)
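A repeated-trial measurement of this kind can be sketched as follows. This is not the procedure actually used for the numbers above (those came from the real archives); it is an illustration using Python's standard gzip and lzma modules on synthetic in-memory data, so the absolute times will differ from the command-line tools. The helper name and the payload are assumptions.

```python
import gzip
import lzma
import os
import statistics
import time


def time_decompress(decompress, blob, trials=20):
    """Run decompress(blob) `trials` times; return (mean, stdev) in seconds."""
    samples = []
    for _ in range(trials):
        start = time.perf_counter()
        decompress(blob)  # result is discarded, like writing to /dev/null
        samples.append(time.perf_counter() - start)
    return statistics.mean(samples), statistics.stdev(samples)


# Synthetic payload: repetitive text plus some incompressible bytes.
payload = b"int main(void) { return 0; }\n" * 20000 + os.urandom(100_000)
gz_blob = gzip.compress(payload)
xz_blob = lzma.compress(payload)

gz_mean, gz_sd = time_decompress(gzip.decompress, gz_blob)
xz_mean, xz_sd = time_decompress(lzma.decompress, xz_blob)

print(f"gzip: {gz_mean * 1000:.2f} ms (stdev {gz_sd * 1000:.2f} ms)")
print(f"xz:   {xz_mean * 1000:.2f} ms (stdev {xz_sd * 1000:.2f} ms)")
```

A small standard deviation relative to the mean is what "very consistent times" would look like here.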
--
Matthew