Bogdan Vatra wrote:
> gcc -O2 and -O3 gives the worst/random/surprising results. On the other
> hand, clang (to my HUGE surprise) gives the most consistent results.

That doesn't surprise me all that much. I've seen REALLY strange
benchmarking results from GCC-generated code. In particular, I had a program
where I was trying to measure the gain from parallelizing the code, and I
found that merely adding -pthreads to the compiler flags, without
actually doing ANY parallelization, made the code FASTER. (If
anything, I would have expected it to run slower, because of the
additional locking.)
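
For anyone who wants to check this kind of effect on their own code, here is a
rough sketch of how the comparison could be timed. It is not the harness I
used. It assumes you have built the same source twice, once with and once
without the threading flag. The function names (time_command, summarize) are
made up for illustration. Repeating the run and looking at the spread, not
just one number, helps separate a real difference from noise.

```python
import statistics
import subprocess
import time


def time_command(cmd, runs=5):
    """Run `cmd` (a list of argv strings) `runs` times.

    Returns the wall-clock duration of each run, in seconds.
    """
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        # Discard the program's output so terminal I/O does not skew timings.
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        samples.append(time.perf_counter() - start)
    return samples


def summarize(samples):
    """Reduce a list of timings to its min, median and max."""
    return {
        "min": min(samples),
        "median": statistics.median(samples),
        "max": max(samples),
    }
```

With two hypothetical binaries, say ./bench_plain and ./bench_pthread, you
would call summarize(time_command(["./bench_plain"])) and the same for the
other one, then compare the medians and the min/max spread.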

