[Interest] good-compromise compatibility setting for -march=??? option (x86)?

Thiago Macieira thiago.macieira at intel.com
Mon Aug 10 00:47:46 CEST 2020


On Sunday, 9 August 2020 15:28:26 PDT René J. V. Bertin wrote:
> > BTW, you may want to also add the -mtune option to either of the two
> > processors. Leaving it at the default (matching -march) may not produce
> > the best code for either.
> 
> Oh? I've always understood that -march implies -mtune (but not vice-versa)?!

-march=X implies -mtune=X, but you can add -mtune=Y to get the instruction 
models for Y. For obvious reasons, Y should be a newer processor than X 
(Y > X), since it makes no sense to optimise for a processor generation that 
couldn't possibly run the code generated.
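
As a rough sketch of how the two options combine (the helper name gcc_flags, 
the file foo.c and the choice of skylake as the tuning target are only for 
illustration, they are not from this thread):

```python
def gcc_flags(march, mtune=None):
    """Build the target flags: -march sets the minimum CPU, -mtune the CPU to optimise for."""
    flags = [f"-march={march}"]
    # -march=X already implies -mtune=X, so only add -mtune when it differs.
    if mtune is not None and mtune != march:
        flags.append(f"-mtune={mtune}")
    return flags

# Hypothetical command line: runs on Westmere and newer, tuned for Skylake.
print(" ".join(["gcc", "-O2", *gcc_flags("westmere", "skylake"), "-c", "foo.c"]))
```

The binary still only uses instructions available on the -march processor; 
-mtune changes how the compiler orders and selects among them.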

> Is Clear Linux really noticeably faster and because of its build options? I
> can't really tell from the Phoronix comparison between it and Ubuntu 20.04
> how conclusive the difference is in every day life (if that doesn't imply
> one of the benchmarks where the difference is going to be noticeable).

That's "YMMV" for you. Clear Linux wins in enough benchmarks to prove that 
there is something behind it all. But whether that affects your daily life, 
that's another story. Clear Linux isn't optimised for desktop experience, 
aside from a lightning quick boot time. The moment you open Chrome or Firefox 
(both of which you have to download outside the distro, unfortunately), 
you're in a completely different world.

But if your daily life is running servers and containers, even a 1% 
improvement means money saved. That's also why a lightning quick boot is 
important: if your workload takes 5 min to run and your distro took 30s to 
boot instead of 2.5s, that's 10% overhead.
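
The arithmetic behind that, as a small sketch (the function name is made up 
here, and overhead is taken as boot time relative to the workload's run time, 
which is how I counted it above):

```python
def boot_overhead(boot_s, workload_s):
    """Boot time as a fraction of the workload's run time."""
    return boot_s / workload_s

workload = 5 * 60  # the 5 min workload, in seconds
for boot in (30.0, 2.5):
    print(f"boot {boot:>4}s: {boot_overhead(boot, workload):.1%} of workload time")
```

With a 30s boot that comes to 10%; with a 2.5s boot it is under 1%.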
 
> Meanwhile I had settled on an annoyingly long option string that has
> -march=core2 and sets every SSE and MMX version support (except SSE4a).
> R.

You should use -march=westmere instead of core2.

-march=core2 implies:
<https://code.woboq.org/gcc/gcc/config/i386/i386.c.html#_M/PTA_CORE2>
-msse -msse2 -msse3 -mssse3 -mcx16

Whereas -march=westmere implies:
-msse -msse2 -msse3 -mssse3 -mcx16 -msse4.1 -msse4.2 -mpopcnt -mpclmul

(note: the woboq indexed source is old; it still shows -maes for Westmere, but 
that got changed some time ago with a patch of mine; the definitions are also 
in i386.h now and aren't macros anymore)
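
To make the difference concrete, here's a quick sketch that just does set 
arithmetic over the two flag lists quoted above (the variable names are mine, 
and nothing here queries GCC itself):

```python
# Flag lists as quoted in this mail, not read from the compiler.
CORE2 = {"-msse", "-msse2", "-msse3", "-mssse3", "-mcx16"}
WESTMERE = CORE2 | {"-msse4.1", "-msse4.2", "-mpopcnt", "-mpclmul"}

# What -march=westmere gives you on top of -march=core2.
extra = sorted(WESTMERE - CORE2)
print(" ".join(extra))
```

So a single -march=westmere replaces -march=core2 plus the SSE4.1, SSE4.2, 
POPCNT and PCLMUL options you would otherwise have to spell out by hand.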



-- 
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel DPG Cloud Engineering




