[Interest] Mouse/Pen input latency in Qt Quick and render loop

Daljit Singh daljit97 at live.com
Fri Aug 21 13:48:50 CEST 2020


There is another possibility. Keep vsync but add a frame delay. That is,
if you know your application can finish each frame in less than, say,
5ms, then you delay the start of each frame loop by 1000/Hz - 5
milliseconds. For 60Hz that means sleeping for 11.6ms after rendering
the last frame. The render loop then reads input and renders the next
frame in the remaining 5ms, giving you a latency of 5ms instead of 16.7.

Thanks Nikos for your reply. I am quite new to the world of computer graphics, so your thoughts were really helpful. Following your ideas, I was able to noticeably improve the latency in my application! The average render time for each frame (when not scrolling) is about 1-2 ms (I did a rough measurement with QElapsedTimer in debug mode), since I cache what has previously been drawn in an FBO.

What I did then is start a timer with an interval of 15ms (my screen refresh rate is 60Hz) after calling QQuickItem::update(); while the timer is running, new input events from the pen are put into a queue, so the next update() is effectively issued at t > 15ms. This little hack helped quite a bit, and it's very convenient since I can turn it off whenever I need to, so when latency is not critical (e.g. when scrolling my canvas) I switch it off.
I made some rudimentary test videos with my phone (60 fps, 0.5x speed) which show the improvement: Normal latency<https://imgur.com/a/LZJC1xt> vs Improved latency<https://imgur.com/c4YcdpP>. Of course, disabling vsync still gives better results.
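In case it helps anyone, here is a rough sketch of the hack (all class and member names are made up for illustration, a plain QQuickPaintedItem stands in for my real FBO-cached canvas item, and tablet events would be handled the same way as the mouse events shown here):

    #include <QMouseEvent>
    #include <QPainter>
    #include <QPointF>
    #include <QQuickPaintedItem>
    #include <QTimer>
    #include <QVector>

    class PenCanvas : public QQuickPaintedItem
    {
        Q_OBJECT
    public:
        explicit PenCanvas(QQuickItem *parent = nullptr)
            : QQuickPaintedItem(parent)
        {
            setAcceptedMouseButtons(Qt::AllButtons);
            m_holdTimer.setSingleShot(true);
            m_holdTimer.setInterval(15);   // just under one 60 Hz refresh period
            connect(&m_holdTimer, &QTimer::timeout, this, &PenCanvas::flushQueue);
        }

        void setLowLatency(bool on) { m_lowLatency = on; }   // e.g. off while scrolling

    protected:
        void mousePressEvent(QMouseEvent *event) override
        {
            event->accept();                  // grab the mouse so move events arrive
            addSample(event->localPos());
        }

        void mouseMoveEvent(QMouseEvent *event) override
        {
            if (m_lowLatency && m_holdTimer.isActive()) {
                // A frame is already scheduled: queue the sample so it is
                // consumed just before the next update() instead of forcing
                // an extra, vsync-throttled one.
                m_queue.append(event->localPos());
                return;
            }
            addSample(event->localPos());
        }

        void paint(QPainter *painter) override
        {
            // The real item draws into a cached FBO; dots are enough for the sketch.
            for (const QPointF &p : m_points)
                painter->drawEllipse(p, 1.5, 1.5);
        }

    private:
        void addSample(const QPointF &p)
        {
            m_points.append(p);
            update();                          // schedule the next frame right away
            if (m_lowLatency)
                m_holdTimer.start();           // hold further samples for ~15 ms
        }

        void flushQueue()
        {
            if (m_queue.isEmpty())
                return;
            m_points += m_queue;
            m_queue.clear();
            update();                          // t > 15 ms: render everything queued so far
            m_holdTimer.start();
        }

        QTimer m_holdTimer;
        QVector<QPointF> m_queue;
        QVector<QPointF> m_points;
        bool m_lowLatency = true;
    };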

Just as an extra data point: have you tried driving the Qt Quick render
loop manually, using QQuickRenderControl?
Yes Giuseppe, you're right, I forgot to mention that option. I am not too familiar with this class, so I would need to study it before attempting this. However, I would like to avoid it, since it essentially means writing your own threaded renderer, and it would take significant time to build and maintain a solution with feature parity with the default one provided by Qt (although, judging from the example in the documentation, it doesn't seem too bad).


________________________________
From: Interest <interest-bounces at qt-project.org> on behalf of Nikos Chantziaras <realnc at gmail.com>
Sent: 21 August 2020 03:10
To: interest at qt-project.org <interest at qt-project.org>
Subject: Re: [Interest] Mouse/Pen input latency in Qt Quick and render loop

On 19/08/2020 18:57, Daljit Singh wrote:
> - the vsync issue is commonly known in competitive gaming where input
> lag is quite important; one way to solve this would be to limit the
> framerate of the application to be slightly lower than the display
> refresh rate (this
> https://medium.com/@petrakeas/vsync-with-low-input-lag-50ms-lower-2437118bfa5
> provides more details). However, Qt provides no way to limit the
> framerate of an app.

A frame rate cap will result in a cyclic latency graph. On *average* the
latency will be reduced, but individual frames will oscillate between 0
and 16.7ms of latency over time.

So this actually works for bringing down average latency when playing
games, but the variable latency effect might not be wanted in a
professional application.
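To make that concrete (a made-up numeric example, not a measurement): with a 59 FPS cap on a 60 Hz display, each frame finishes about 0.28ms later within the refresh cycle than the previous one, so the wait for the next vblank sweeps through the whole 0 to 16.7ms range roughly once per second:

    #include <cmath>
    #include <cstdio>

    int main()
    {
        const double refreshMs = 1000.0 / 60.0;  // 16.67ms between vblanks
        const double capMs     = 1000.0 / 59.0;  // 16.95ms between rendered frames

        for (int frame = 0; frame < 120; ++frame) {
            const double finishTime = frame * capMs;                    // when the frame is ready
            const double phase      = std::fmod(finishTime, refreshMs); // position in the refresh cycle
            const double latency    = refreshMs - phase;                // wait until the next vblank
            std::printf("frame %3d waits %5.2f ms for its vblank\n", frame, latency);
        }
        return 0;
    }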


> - the possibility of dynamically turning off V-Sync at runtime would
> allow an app to operate in "low-latency" mode when required. Currently
> this is not possible in Qt, as the QSurfaceFormat::setSwapInterval needs
> to be called before the QML scene is created.
>
> - make the threaded render loop not rely on vsync throttling, so that low
> latency and multithreaded rendering are possible; this was what I wanted
> to know in my previous reply. There seem to be changes to the render
> loop in Qt 6, but I am not sure if this has been implemented.

There is another possibility. Keep vsync but add a frame delay. That is,
if you know your application can finish each frame in less than, say,
5ms, then you delay the start of each frame loop by 1000/Hz - 5
milliseconds. For 60Hz that means sleeping for 11.6ms after rendering
the last frame. The render loop then reads input and renders the next
frame in the remaining 5ms, giving you a latency of 5ms instead of 16.7.
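A rough sketch of what such a loop could look like (the three callbacks are placeholders for whatever your application actually does; msleep only takes whole milliseconds, so the delay gets truncated to 11ms here):

    #include <QThread>
    #include <QtGlobal>
    #include <functional>

    void delayedFrameLoop(const std::function<void()> &pollInput,
                          const std::function<void()> &renderFrame,
                          const std::function<void()> &presentFrame, // blocks on vsync
                          double refreshHz = 60.0, double budgetMs = 5.0)
    {
        const double periodMs = 1000.0 / refreshHz;                      // 16.7ms at 60Hz
        const unsigned long delayMs =
            static_cast<unsigned long>(qMax(0.0, periodMs - budgetMs));  // ~11ms at 60Hz with a 5ms budget

        for (;;) {
            // Sleep through most of the refresh interval so input is sampled
            // as late as possible before the frame is actually rendered.
            QThread::msleep(delayMs);

            pollInput();     // freshest input state, ~budgetMs before scan-out
            renderFrame();   // must finish within budgetMs, or we miss the vblank
            presentFrame();  // swap with vsync on; blocks until the vblank
        }
    }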

If you miss the target time though, then that frame will need to wait
for the next vblank, resulting in a 33.3ms frame time. Or you disable
vsync for just that frame, which will give you a tear.

However, there are other factors at play here as well. The GPU driver
usually does not send each frame out to the screen immediately when it's
presented. Furthermore, since frame preparation (CPU) and frame rendering
(GPU) are asynchronous in today's graphics APIs, you get several prepared
frames queued up for rendering. This is the biggest contributor to latency
today, not the 16.7ms frame latency of 60Hz double-buffered vsync. That's
why disabling vsync in games, for example, doesn't give you a 16.7ms
reduction; it gives you more like 50ms.

So to get as low a latency as possible while avoiding the variable
latency issue of frame rate limiting, you would need to also reduce
frame queue sizes. In Vulkan you would use a swap chain of only one
image. OpenGL does not have this (AFAIK), so you'd need to do a hard GPU
sync on each frame. Both of these reduce the rendering parallelism of
modern GPUs though, so this kind of buffering reduction also costs
performance, raising the minimum GPU speed requirements.
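
For the OpenGL case, a hard GPU sync per frame would look roughly like this (just a sketch; renderScene is a placeholder, and the context is assumed to be current on the given surface):

    #include <QOpenGLContext>
    #include <QOpenGLFunctions>
    #include <QSurface>
    #include <functional>

    void presentWithHardSync(QOpenGLContext *ctx, QSurface *surface,
                             const std::function<void()> &renderScene)
    {
        renderScene();              // issue this frame's GL commands
        ctx->swapBuffers(surface);  // present; blocks on vsync with swap interval 1

        // glFinish() returns only once the GPU has executed everything submitted
        // so far, so at most one frame is ever in flight after this call.
        ctx->functions()->glFinish();
    }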


