[Interest] Qt5 performance on imx6 with full hd

Gunnar Sletta gunnar.sletta at jolla.com
Tue May 27 10:36:52 CEST 2014


On 27 May 2014, at 10:28, Jacob Kroon <jacob.kroon at gmail.com> wrote:

> Hi Gunnar,
> 
> On Fri, May 23, 2014 at 11:39 AM, Gunnar Sletta <gunnar.sletta at jolla.com> wrote:
> 
> On 23 May 2014, at 10:32, Jacob Kroon <jacob.kroon at gmail.com> wrote:
> 
> > Hi,
> >
> > I'm experimenting with a Qt application running on the Wandboards,
> > at full HD resolution, 1920x1080x32. I have a static background image, and some small animated QML elements on the screen. I'm not entirely satisfied with the resulting performance, and I think it is because the background image is fully redrawn in every frame.
> 
> If it cannot render a single image at that resolution, then it is not suitable hardware for that resolution :)
> 
> If this is the case, your options are to reduce the bit depth, reduce the resolution or reduce the framerate (such as going for 30 FPS instead of 60 FPS).
> 
> Is the image alpha blended? If it is a JPEG it should be opaque, but if it is a PNG it will most likely have an alpha channel (even though it really doesn't. I know, stupid, but that's how it is). For some GPUs blending can add a bit of overhead, and doing it fullscreen can be what tips the balance.
> 
> 
> For the background image, I used a tiled noise image and did a radial blend on it. The resulting image was cached, though. But yes, the resulting image showed up as "alpha" in the renderer debug output. I've since switched to a static, prepared PNG image, and now the background image shows up as "opaque".
> 
> > I've experimented with
> >
> > * setClearBeforeRendering(false), since the background will be drawn anyway, there is no point in clearing before rendering. This seemed to have little impact on performance though.
> 
> The effect of turning off clearing is highly hardware dependent. Some drivers/GPUs will benefit from not clearing as the clear is just yet another pass over all pixels. Others will use the clear as an indication of "new frame" and will have to do all manner of nasty stuff, like storing the depth buffer into system memory because you didn't clear it before the frame began.
> 
> See the performance guidelines of your GPU for the actual recommendation for your chip.
> 
>  
> It seems that setting "QSG_ANTIALIASING_METHOD=msaa" gives a huge improvement on the cpu usage.

That is a bit interesting... What are you drawing on top of this image? msaa based antialiasing generally adds a bit to the cost on the GPU, but it makes the work a lot simpler for the renderer when you have Rectangle and Image elements with antialiasing: true in QML. 

Is there a lot of overdrawing?
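
To put a rough number on overdraw, you can sum the pixel area of everything that gets drawn and divide by the screen area. Below is a minimal sketch in plain C++. ItemRect and overdrawFactor are names made up for this illustration, not Qt API, and it assumes every item is fully on screen and fully drawn, so treat the result as an upper bound rather than what the GPU actually does.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper type: on-screen size of one drawn item, in pixels.
struct ItemRect {
    std::uint32_t width;
    std::uint32_t height;
};

// Total drawn pixel area divided by screen area.
// 1.0 means every screen pixel is touched once on average.
// Assumption: items are fully visible and nothing is culled or rejected
// by a depth test, so this is an upper-bound estimate.
double overdrawFactor(const std::vector<ItemRect>& items,
                      std::uint32_t screenWidth,
                      std::uint32_t screenHeight)
{
    assert(screenWidth > 0 && screenHeight > 0);
    std::uint64_t drawnPixels = 0;
    for (const ItemRect& item : items)
        drawnPixels += static_cast<std::uint64_t>(item.width) * item.height;
    const double screenPixels =
        static_cast<double>(screenWidth) * static_cast<double>(screenHeight);
    return static_cast<double>(drawnPixels) / screenPixels;
}
```

With a full-screen 1920x1080 background plus one 960x540 item on top, this gives a factor of 1.25.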

> Also, playing with QSG_RENDERER_BATCH_VERTEX_THRESHOLD and QSG_RENDERER_BATCH_NODE_THRESHOLD,
> allows me to lower the cpu usage even further. Still, the background image is taking a considerable time it seems. With bg image I get ~27 fps,
> without it I get ~85 fps (with vsync turned off)

How is the performance if you use an image which is a lot smaller? Since the big image causes such a big impact, maybe texture loading is a bottleneck. If this is the problem, then using a ShaderEffect to create the tiled-noise background might be an option as that would then not suffer from massive texture fetching.
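
To get a feel for why a full-screen texture can hurt, here is a back-of-the-envelope estimate in plain C++. It is a simplified model I am assuming for illustration: one texture read and one framebuffer write per pixel, plus one framebuffer read if the image is blended. It ignores caches, tiling, compression and display scanout, and bytesPerFrame and bytesPerSecond are made-up names, not part of Qt.

```cpp
#include <cassert>
#include <cstdint>

// Bytes moved per frame when one full-screen image is drawn once.
// Assumed model: 1 texture read + 1 framebuffer write per pixel,
// plus 1 framebuffer read when alpha blending is enabled.
std::uint64_t bytesPerFrame(std::uint64_t width, std::uint64_t height,
                            std::uint64_t bytesPerPixel, bool blended)
{
    assert(width > 0 && height > 0 && bytesPerPixel > 0);
    const std::uint64_t surfaceBytes = width * height * bytesPerPixel;
    const std::uint64_t passes = blended ? 3 : 2;
    return surfaceBytes * passes;
}

// Sustained memory traffic at a given frame rate.
std::uint64_t bytesPerSecond(std::uint64_t perFrame, std::uint64_t fps)
{
    return perFrame * fps;
}
```

For 1920x1080 at 4 bytes per pixel this comes to about 16.6 MB per frame opaque and 24.9 MB blended, so roughly 1.0 GB/s and 1.5 GB/s at 60 FPS under this model. A small tiled texture only changes the read part, which is where the ShaderEffect idea would help.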

> 
> >
> > * Letting Qt5 render into /dev/fb1 overlay on the imx6, with no background image, but instead write the background image manually into /dev/fb0. In this way, the IPU will blend the result onto the display. This seemed to be even worse than letting the GPU render the background.
> >
> > Can the scenegraph be smart enough that it will only "clear" dirty rectangles with a user-supplied background image? Or are there any other tricks I am not aware of?
> 
> The default scene graph renderer renders the full screen. Doing partial updates requires a lot of support from the underlying drivers and hardware. If you really want to, you can give it a shot of course. You can copy the existing renderer, adapt it to expose dirty areas and make use of partial swap buffers extension if this is available (and the underlying display stack actually does propagate the sub regions all the way to the display).
> 
> 
> Writing a custom renderer would have to be a last resort.
> 
> Thanks,
> Jacob



