[Development] QtCS: Qt Quick Performance discussion

Robin Burchell robin+qt at viroteck.net
Mon Jun 22 14:47:53 CEST 2015


It seems that nobody took proper notes of the session (oops!) so I'm
sending the notes Gunnar and I wrote up in preparation for the session.
If anyone else has any recollections they'd like to add, please go ahead
and do so :)

=====

# Performance!

* Why it's important
* What needs to be checked?

# Benchmarking

* Need support in qmlbench for measuring memory usage.
    * For each set of creation benchmarks, we should run them a second
    time collecting memory information.
* Once we have that, graphing memory across a single test run makes
memory leaks easy to spot
    * Graphed over time, regressions between commits become easy to spot
* Likewise for the number of items per frame
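To make the "graphed over time" idea concrete, here is a minimal sketch of flagging commits whose peak memory jumped relative to the previous run. Everything in it is an assumption for illustration: `MemorySample`, `findMemoryRegressions` and the tolerance value are hypothetical names, not part of qmlbench.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One benchmark result per commit (hypothetical record layout).
struct MemorySample {
    std::string commit;    // identifier of the commit the run was built from
    std::size_t peakBytes; // peak memory recorded for the creation benchmark
};

// Returns the commits whose peak memory grew by more than `tolerance`
// (a fraction, e.g. 0.05 for 5%) compared with the previous sample.
std::vector<std::string> findMemoryRegressions(const std::vector<MemorySample> &history,
                                               double tolerance)
{
    assert(tolerance >= 0.0);
    std::vector<std::string> regressed;
    for (std::size_t i = 1; i < history.size(); ++i) {
        const double before = static_cast<double>(history[i - 1].peakBytes);
        const double after = static_cast<double>(history[i].peakBytes);
        if (before > 0.0 && (after - before) / before > tolerance)
            regressed.push_back(history[i].commit);
    }
    return regressed;
}
```

The same comparison would work for items per frame with the sign flipped, since there a drop rather than a rise is the regression.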

# Start up performance

We don't have good benchmarks for this, and startup time isn't trivial
to measure. Need a few generated examples, maybe.

# Memory usage

* Our memory usage is pretty bad in a lot of areas.
* Sharing data across processes (fork-booster)
    * Similar but better sharing achievable through qmlcompiler
* Benefits could be achieved by carefully allocating all write-once
expected-to-be-shared data contiguously
* Dropping CPU-side image data. We recently added QSG_TRANSIENT_IMAGES
to Qt Quick.

# Item creation performance

* We have pretty good benchmarks for this :)
* But we (constantly!) keep regressing
    * This week's example: two separate regressions in QQuickImageBase
    (high DPI, and automatic transform)
    * Image items per frame dropped from ~550/frame to 492/frame -- ~10%
    regression
    * Allocations for 5000 images increased by 41 MB
    * This is just one case - it's nobody's "fault", there's just nobody
    taking care of it.
* Ideally need to do some work on automating runs of it (more on this
later)
* What is a good target?
    * 1000 items / frame in qmlbench on a modern desktop / laptop (a
    MacBook Pro is one such)
    * 100 items / frame on mobile and embedded
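The numbers in this week's example can be checked with a little arithmetic. The sketch below uses two hypothetical helpers (`relativeDrop` and `extraBytesPerItem`) and assumes the 41 megabytes are counted in units of 2^20 bytes, which the notes do not specify.

```cpp
#include <cassert>

// Fractional drop in throughput between two runs (0.10 means 10% slower).
double relativeDrop(double before, double after)
{
    assert(before > 0.0);
    return (before - after) / before;
}

// Average extra bytes per item when `extraBytes` more memory is
// allocated across `itemCount` items.
double extraBytesPerItem(double extraBytes, double itemCount)
{
    assert(itemCount > 0.0);
    return extraBytes / itemCount;
}
```

`relativeDrop(550.0, 492.0)` comes out at about 0.105, which matches the ~10% figure above, and `extraBytesPerItem(41.0 * 1024 * 1024, 5000.0)` is a bit over 8 KB of additional allocation per image.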

# Binding performance

I have no idea whether or not we have good benchmarks for this.

* Probably the creation ones cover the simple cases, but the more
advanced ones probably need separate coverage.
* Ideally we also need to monitor the impact of things changing in
bindings (& multiple things changing at once, and so on?)
* Help creating benchmarks welcome!

# Graphics Performance

Are we good here? I think we're close at least.

* Clipping has been brought up as an issue, 'simplerenderer' solves that
* Poor batching gives worse results; 'simplerenderer' solves that too,
but be mindful that simplerenderer also has ~2-3x worse performance
overall.

# Recommendations for working on QtQuick

* Avoid structures like QHash unless you are sure they are needed (they
have a heavy cost, and for <1k items or so, QVector, or std::vector, are
often a better choice)
* Avoid signal connections
    * Virtuals instead might be a good choice
    * If you really have to use them, use qmlobject_connect
    * See prior art:
       * qtdeclarative: 0de680c8e8fab36e386dca35e5008ffaa27e8ef6
       * qtdeclarative: 7568922fa240e6e9440e9c6e93bf8ec00c06ec17
* Memory compactness:
    * Don't introduce padding holes
    * Don't increase the size of frequently allocated things (nodes,
    items) "accidentally" without careful consideration
    * Use a lazily-allocated ExtraData for things that aren't needed
    often
    * Consider your data types carefully - don't use a 64 bit int for an
    "on/off" toggle
* Consider custom data structures & allocation (page allocation of the
shadow nodes was a big win)
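The memory compactness points above can be sketched in plain C++. This is an illustration under stated assumptions, not Qt's implementation: `PaddedFields`, `PackedFields` and `SketchItem` are hypothetical types, and a `std::unique_ptr` stands in for whatever lazy-allocation helper the real code uses.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>

// Members ordered carelessly: the compiler typically inserts padding
// after each bool so that the following member is correctly aligned.
struct PaddedFields {
    bool visible;
    double width;
    bool enabled;
    std::int32_t z;
};

// Same members, largest first: the small ones fill what would
// otherwise be padding.
struct PackedFields {
    double width;
    std::int32_t z;
    bool visible;
    bool enabled;
};

// Hypothetical item that keeps rarely used properties out of line.
class SketchItem {
public:
    double rotation() const { return m_extra ? m_extra->rotation : 0.0; }
    double scale() const { return m_extra ? m_extra->scale : 1.0; }

    void setScale(double s)
    {
        assert(s > 0.0); // illustrative precondition only
        extra().scale = s;
    }

    bool hasExtra() const { return m_extra != nullptr; }

private:
    struct ExtraData {
        double rotation = 0.0;
        double scale = 1.0;
    };

    // Allocates the extra block on first write; reads never allocate.
    ExtraData &extra()
    {
        if (!m_extra)
            m_extra.reset(new ExtraData);
        return *m_extra;
    }

    std::unique_ptr<ExtraData> m_extra;
};
```

On a typical 64-bit ABI the reordered struct is 16 bytes against 24 for the careless one, though exact sizes are platform dependent. An item that never touches `scale` or `rotation` pays only for one pointer.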

# Specific Items?

* Rectangle implementation could be improved quite a bit (Gunnar?)
* Text node improvements (Eskil?)
* Hash of shadow nodes in the batched renderer is a large problem
    * https://codereview.qt-project.org/#/c/97708/
* Delaying compilation (or whatever it is) on inline components until
they are used. Right now, these can have massive impacts on
performance/memory unless moved to external files. e.g. "Component {
Dialog { ... foo ... } }", only used when a button is pressed. It may
never be pressed.
* "Don't use" classes, like SpriteSequence :)
* In general, small items take up huge amounts of memory: Repeater {
model: 500; Rectangle { width: 100; height: 100; radius: 10 } } and you
have 1 MB or something :)
* QObject -> QQmlData / QQuickItem / QQuickItemPrivate etc all adds up
on individual items - but what can we do to fix that?
* The recent introduction of 'padding' in the box model might need a
second look to make sure it isn't increasing item sizes in common cases.
4 extra doubles is quite a large addition to item sizes.
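To put a number on the "4 extra doubles" concern, here is a rough sketch of the per-item cost. `InlinePadding` is a hypothetical stand-in for four padding values stored directly in every item; the real layout in Qt Quick may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

// Hypothetical stand-in for four per-side padding values stored inline.
struct InlinePadding {
    double left;
    double top;
    double right;
    double bottom;
};

// Total extra bytes across `itemCount` items that each embed InlinePadding.
std::size_t paddingOverheadBytes(std::size_t itemCount)
{
    // Guard against overflow in the multiplication below.
    assert(itemCount <= std::numeric_limits<std::size_t>::max() / sizeof(InlinePadding));
    return itemCount * sizeof(InlinePadding);
}
```

With 8-byte doubles that is 32 bytes per item, so the 500-item Repeater example above would carry 16000 extra bytes even if no item ever sets a padding value. Moving such fields into a lazily-allocated ExtraData, as recommended earlier in these notes, would avoid that cost in the common case.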


