[Interest] Slow graphics effect (due to _dl_resolve_runtime_avx)

Thiago Macieira thiago.macieira at intel.com
Sat Nov 26 19:59:13 CET 2016


On domingo, 27 de novembro de 2016 00:08:53 PST Ch'Gans wrote:
> > That explains why _dl_runtime_resolve_avx was used, as opposed to the
> > non-AVX version of it.
> > 
> > Note that I can't find that function in glibc's source code. If it really
> > is a problem, then your distribution caused it by applying a patch and
> > you should complain to them. But I don't think that this the problem.
> > Stop looking at _dl_runtime_resolve_avx, that's a red herring. Your
> > problem is elsewhere.
> It's in linux-ld.so. Anyway I gave up about the AVX2, i think you are
> right, it was just a red herring.

BTW, I've found it in it glibc source code meanwhile. I hate their 
organisation, it was not where I expected (though in hindsight I should have 
expected it). It's in sysdeps/, not elf/, and the assembly code is in the .h, 
not in the .S file...

> But I gave another go at running valgrind, this time i started with
> event logging off, then started the test app, start zooming until the
> app slows down, then switched event logging on, zommed in a few times,
> switched logging off and closed the test app.
> 
> I think i have now better data to analise.
> 
> If i sort the data by "Incl. Cost: Ir", then the top one is
> QEventLoop::exec, called 1.2E9 times, then lot of event processing

"Ir" is not call count, it's "instruction read", or more simply the number of 
instructions executed. That includes "nop" instructions, so it differs from 
the processor's own accounting of retired instructions. And it's not the count 
of cycles, since callgrind does not model how many cycles an instruction 
takes.

> related functions, then the first event handler:
> QGraphicsView::paintEvent, then scene drawItem, quickly follow by the
> drop shadow effect draw(), the drop shadow filter draw(),
> qt_blurImage() followed by expblur<>(), after that the cost drop
> 
> drastically, so to resume roughly:
> | Cost              , function
> 
>   1,212,124,165, QEventLoop::exec()

This line means "the problem happens while your application is executing its 
event loop" or the same as I had said before: "happens inside main".

>   1,169,460,682, QGraphicsView::paintEvent()
> ...
>   1,123,576,248, QPixmapDropShadowFilter::draw()
>   0,907,246,806, qt_blurImage()
>   0,861,710,958, expblur<>()
>   0,181,757,861, fillRect_normalized()
> ...

These are a lot more interesting. As expected, it happens inside the paint 
event, whose time is 96% taken by the drop shadow filter. Note that the 
relative time is not per se a problem: *something* has to take time. But since 
we know it's slow, that is a good target for the culprit.

> What i'm trying to say here, is that almost all the events seem to be
> targeted at repainting the drop shadow, and I can tell you that the
> test app ran for only a couple of seconds during which i was
> wheel-zooming, and the GraphicsView had visually changed only 4 or 5
> times.
>
> So something tells me that, for some reasons, the drop shadow is
> repainted way too many times, yet the app on the screen is updated
> only once every second or so.

Your interpretation is incorrect because it's based on the wrong assumption 
that the number is call count. It's not.

The number is more closely related to time spent, so the correct 
interpretation is that most of the application's time is dedicated to 
repainting the drop shadow.

We don't know why that is. The instruction read count is the product of the 
cost of each instruction every time it is called by the number of times it is 
called. So it's still possible we have a call count problem, but your evidence 
doesn't show that or disprove it. It's possible that on your system, the drop 
shadow needs to be painted way more often than on another system.

-- 
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel Open Source Technology Center




More information about the Interest mailing list