[Interest] Mouse/Pen input latency in Qt Quick and render loop

Nikos Chantziaras realnc at gmail.com
Fri Aug 21 03:10:53 CEST 2020


On 19/08/2020 18:57, Daljit Singh wrote:
> - the vsync issue is commonly known in competitive gaming where input 
> lag is quite important, one way to solve this would be to limit the 
> framerate of the application to be slightly lower than the display 
> refresh rate (this 
> https://medium.com/@petrakeas/vsync-with-low-input-lag-50ms-lower-2437118bfa5 
> provides more details). However, Qt provides no way to limit the 
> framerate of an app.

A frame rate cap will result in a cyclic latency graph. On *average* the 
latency will be reduced, but individual frames will oscillate between 0 
and 16.7ms of latency over time, because the capped frame period is 
slightly longer than the refresh period and their relative phase keeps 
drifting and wrapping around.

So this actually works for bringing down average latency when playing 
games, but the variable latency effect might not be wanted in a 
professional application.
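
As a rough illustration of that cycle (hypothetical numbers: a 60Hz 
display with a 59fps cap), here is a tiny standalone simulation of how 
the wait for the next vblank sweeps through a full refresh interval 
about once per second:

    // Simulates the cyclic latency a frame rate cap produces.
    // Numbers are assumptions: 60Hz display, cap at 59fps.
    #include <cmath>
    #include <cstdio>

    int main()
    {
        const double refreshMs = 1000.0 / 60.0;  // ~16.67ms between vblanks
        const double capMs     = 1000.0 / 59.0;  // ~16.95ms between rendered frames

        double frameTime = 0.0;
        for (int i = 0; i < 120; ++i) {
            // Latency here = how long the finished frame waits for the
            // next vblank before it can be shown.
            const double wait = refreshMs - std::fmod(frameTime, refreshMs);
            std::printf("frame %3d waits %5.2f ms for vblank\n", i, wait);
            frameTime += capMs;
        }
    }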


> - the possibility of dynamically turning off V-Sync at runtime would 
> allow an app to operate in "low-latency" mode when required. Currently 
> this is not possible in Qt, as the QSurfaceFormat::setSwapInterval needs 
> to be called before the QML scene is created.
> 
> - make the threaded renderloop not rely on Vsync throttling so low 
> latency and multithreaded rendering is possible, this was what I wanted 
> to know in my previous reply. There seem to be changes to the render 
> loop in Qt 6, but I am not sure if this has been implemented.
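
For reference, the setSwapInterval constraint mentioned in the first 
point looks roughly like this in practice; a minimal sketch (the QML 
source path is just a placeholder), and indeed there is no supported 
way to change it again on a live scene:

    #include <QGuiApplication>
    #include <QQuickView>
    #include <QSurfaceFormat>
    #include <QUrl>

    int main(int argc, char *argv[])
    {
        QGuiApplication app(argc, argv);

        // Swap interval 0 = no vsync throttling. This has to become the
        // default format *before* the first QML window is created.
        QSurfaceFormat format = QSurfaceFormat::defaultFormat();
        format.setSwapInterval(0);
        QSurfaceFormat::setDefaultFormat(format);

        QQuickView view;
        view.setSource(QUrl(QStringLiteral("qrc:/main.qml")));  // placeholder
        view.show();

        return app.exec();
    }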

There is another possibility. Keep vsync but add a frame delay. That is, 
if you know your application can finish each frame in less than, say, 
5ms, then you delay the start of each frame loop by 1000/Hz - 5 
milliseconds. For 60Hz that means sleeping for roughly 11.7ms after 
rendering the last frame. The render loop then reads input and renders 
the next frame in the remaining 5ms, giving you a latency of 5ms instead 
of 16.7ms.
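
A bare-bones sketch of that loop (not Qt's render loop; the two frame 
functions are hypothetical stubs standing in for input sampling and the 
vsync'd render/swap, and the 5ms budget is just an assumption):

    #include <chrono>
    #include <cstdio>
    #include <thread>

    using ms = std::chrono::duration<double, std::milli>;

    static void pollInput() {}              // stub: sample input devices
    static void renderAndPresentFrame()     // stub: pretend render+swap takes ~4ms
    { std::this_thread::sleep_for(ms(4)); }

    int main()
    {
        const double refreshHz      = 60.0;
        const double renderBudgetMs = 5.0;                       // what we trust the app to finish in
        const ms framePeriod(1000.0 / refreshHz);                 // ~16.7ms at 60Hz
        const ms frameDelay = framePeriod - ms(renderBudgetMs);   // ~11.7ms

        for (int frame = 0; frame < 10; ++frame) {
            const auto start = std::chrono::steady_clock::now();

            // Sleep through most of the refresh interval so input is read
            // as late as possible before the next vblank.
            std::this_thread::sleep_for(frameDelay);

            pollInput();
            renderAndPresentFrame();   // must finish within renderBudgetMs

            const ms total = std::chrono::steady_clock::now() - start;
            std::printf("frame %d: %.2f ms\n", frame, total.count());
        }
    }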

If you miss the target time, though, that frame has to wait for the next 
vblank, resulting in a 33.3ms frame time. Alternatively, you can disable 
vsync for just that frame, which will give you a tear.

However, there are other factors at play here as well. The GPU driver 
usually does not send each frame out to the screen immediately when it's 
presented. Furthermore, since frame preparation (CPU) and frame rendering 
(GPU) are asynchronous in today's graphics APIs, you get several prepared 
frames queued up for rendering. This queue is the biggest contributor to 
latency today, not the 16.7ms frame latency of 60Hz double-buffered 
vsync. That's why disabling vsync in games, for example, doesn't give you 
a 16.7ms reduction; it gives you more like a 50ms one.

So to get as low a latency as possible while avoiding the variable 
latency issue of frame rate limiting, you would also need to reduce the 
frame queue size. In Vulkan you would use a swap chain with only one 
image. OpenGL does not expose this (AFAIK), so you'd need to do a hard 
GPU sync on each frame. Both of these reduce the rendering parallelism of 
modern GPUs, though, so cutting down on this kind of buffering also costs 
performance and raises the minimum GPU speed requirements.
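
A sketch of what that hard GPU sync could look like with Qt Quick's 
OpenGL backend (assuming the threaded render loop, where frameSwapped() 
delivered with Qt::DirectConnection runs on the render thread with the 
GL context still current; a plain glFinish() after the swap would be the 
blunter alternative):

    #include <QOpenGLContext>
    #include <QOpenGLExtraFunctions>
    #include <QQuickWindow>

    void installHardGpuSync(QQuickWindow *window)
    {
        QObject::connect(window, &QQuickWindow::frameSwapped, window, [] {
            QOpenGLContext *ctx = QOpenGLContext::currentContext();
            if (!ctx)
                return;  // not rendering through OpenGL, nothing to do

            QOpenGLExtraFunctions *gl = ctx->extraFunctions();

            // Insert a fence right after the swap and block the render
            // thread until the GPU has actually finished this frame, so
            // the driver can never queue up several frames' worth of work.
            GLsync fence = gl->glFenceSync(GL_SYNC_GPU_COMMANDS_COMPLETE, 0);
            gl->glClientWaitSync(fence, GL_SYNC_FLUSH_COMMANDS_BIT,
                                 100 * 1000 * 1000);  // 100ms timeout, in ns
            gl->glDeleteSync(fence);
        }, Qt::DirectConnection);
    }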


