[Development] Decrease amount of delivered src packages

Matthew Woehlke mwoehlke.floss at gmail.com
Fri Feb 17 17:20:20 CET 2017


On 2017-02-15 18:11, Mathias Hasselmann wrote:
> The actual value of gzip, and the reason for its return, seems to be its
> significantly lower CPU usage. This helps reduce server load on heavily
> utilized services like GitHub, and it shortens development round-trip
> cycles.
> 
> Just to illustrate this I've collected some numbers on my Thinkpad:
> 
>   Command | Average  | Savings  | Archive | Savings
>           | CPU time | p. build | size    | @100MBit
>  ====================================================
>   gzip    | 00:44.19 | 09:54.58 | 469 MiB | 00:00.00
>   gzip -9 | 01:55.58 | 08:43.18 | 465 MiB | 00:00.32
>   xz      | 08:46.16 | 01:52.60 | 354 MiB | 00:09.20
>   xz -9   | 10:38.77 | 00:00.00 | 333 MiB | 00:10.88
> 
> Is it worth spending 10 additional minutes per CI cycle just to save our
> users a few seconds of download time?

Um... yes? 10 min spent once = 10 s saved * 60 downloads. Are you implying
that fewer than 60 users download the .xz packages? Remember, that's not
just a *user* benefit; it is also 10 s less load on the servers *per
download*. Compression happens once; downloads happen many times.
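
To make the break-even arithmetic explicit, here is a minimal Python sketch. It only restates the figures quoted above (the rough "10 min vs. 10 s" estimate and the `gzip` / `xz -9` rows of the table); the helper names `to_seconds` and `break_even_downloads` are made up for illustration. It also mixes CPU time with wall-clock download time, exactly as the rough estimate does.

```python
import math


def to_seconds(stamp: str) -> float:
    """Convert an 'MM:SS.ss' value from the quoted table to seconds."""
    minutes, seconds = stamp.split(":")
    return int(minutes) * 60 + float(seconds)


def break_even_downloads(extra_compress_s: float, saved_per_download_s: float) -> int:
    """Number of downloads after which the one-time compression cost is repaid."""
    if saved_per_download_s <= 0:
        raise ValueError("no per-download saving, so the cost is never repaid")
    return math.ceil(extra_compress_s / saved_per_download_s)


# Rough estimate from the reply: 10 minutes once vs. 10 seconds per download.
rough = break_even_downloads(10 * 60, 10)

# Same calculation with the table's numbers: CPU time of `xz -9` minus CPU
# time of `gzip`, against the download time saved at 100 MBit.
table = break_even_downloads(
    to_seconds("10:38.77") - to_seconds("00:44.19"),
    to_seconds("00:10.88"),
)

print(rough, table)
```

With the table's own figures the break-even point comes out slightly below the rough estimate of 60 downloads.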

OTOH...

A better metric is decompression, which also happens many times. On one
of my machines, it takes¹ about 8.5 s to uncompress the .gz (to
/dev/null) and about 19 s to uncompress the .xz (also to /dev/null).
So... it does cost about 10 s more CPU time to uncompress the .xz. That
being the case, I'll grant that *if* you're on a sufficiently high-speed
network, maybe it doesn't make sense to download the .xz.

(¹ I ran 20 trials each to reduce artifacts from caching, etc. I got
very consistent times, so I have high confidence that these numbers are
reasonable.)
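
For anyone who wants to repeat this kind of comparison, here is a rough Python sketch of a repeated-trial decompression timing. It is not the procedure used for the numbers above: it decompresses a small synthetic payload in memory with the standard `gzip` and `lzma` modules and reports mean wall-clock time, whereas the figures above come from uncompressing the real archives to /dev/null. The payload and the helper name `time_decompress` are assumptions for illustration.

```python
import gzip
import lzma
import os
import time


def time_decompress(decompress, blob: bytes, trials: int = 20) -> float:
    """Return mean wall-clock seconds per decompression over `trials` runs."""
    start = time.perf_counter()
    for _ in range(trials):
        decompress(blob)
    return (time.perf_counter() - start) / trials


# Stand-in payload (assumption): repetitive text plus some incompressible
# bytes. Replace it with the contents of a real source tarball for
# meaningful numbers.
payload = (b"int main(void) { return 0; }\n" * 4096) + os.urandom(1 << 16)

gz_blob = gzip.compress(payload)
xz_blob = lzma.compress(payload)  # default container format is .xz

gz_s = time_decompress(gzip.decompress, gz_blob)
xz_s = time_decompress(lzma.decompress, xz_blob)

print(f"gzip: {gz_s:.6f} s/run   xz: {xz_s:.6f} s/run")
```

Averaging over several trials, as in the footnote, smooths out caching effects; absolute times depend heavily on the machine and the data.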

-- 
Matthew


