[Development] QtCS: Qt Quick Performance discussion
Gunnar Sletta
gunnar at sletta.org
Tue Jun 23 10:03:39 CEST 2015
As promised in the talk, I moved the qmlbench tool to a separate repo and gave it a readme: https://github.com/sletta/qmlbench
Still under my github account, but now it is at least documented so others can take part in interpreting the results.
> On 22 Jun 2015, at 14:47, Robin Burchell <robin+qt at viroteck.net> wrote:
>
> It seems that nobody took proper notes of the session (oops!) so I'm
> sending the notes Gunnar and I wrote up in preparation for the session.
> If anyone else has any recollections they'd like to add, please go ahead
> and do so :)
>
> =====
>
> # Performance!
>
> * Why it's important
> * What needs to be checked?
>
> # Benchmarking
>
> * Need support in qmlbench for measuring memory usage.
> * For each set of creation benchmarks, we should run them a second
> time collecting memory information.
> * Once we have that and graph it across a single test run, memory leaks
> become easy to spot
> * Graphed over time, regressions between commits become easy to spot
> * Likewise for the number of items per frame
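One way to turn "graphed across a single test run" into an automatic check is to fit a straight line through the memory samples and look at the slope. The following is only a sketch under assumptions: `memoryTrendBytesPerSample` is a hypothetical helper, not part of qmlbench, and it assumes the samples were taken at evenly spaced points during the run.

```cpp
#include <cstddef>
#include <vector>

// Least-squares slope of memory samples (in bytes) taken at evenly spaced
// points during one benchmark run. A clearly positive slope suggests that
// memory keeps growing, i.e. a possible leak.
// Hypothetical helper for illustration; not part of qmlbench.
double memoryTrendBytesPerSample(const std::vector<double> &samples)
{
    const std::size_t n = samples.size();
    if (n < 2)
        return 0.0; // not enough data to fit a line

    // The x values are simply the sample indices 0..n-1.
    const double meanX = (n - 1) / 2.0;
    double meanY = 0.0;
    for (double s : samples)
        meanY += s;
    meanY /= n;

    double num = 0.0;
    double den = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        const double dx = i - meanX;
        num += dx * (samples[i] - meanY);
        den += dx * dx;
    }
    return num / den; // den > 0 because n >= 2
}
```

A flat run gives a slope near zero. What counts as "clearly positive" depends on the noise of the measurement and would have to be tuned per benchmark.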
>
> # Start up performance
>
> We don't have good benchmarks for this, and start-up time isn't trivial
> to measure. Need a few generated examples, maybe.
>
> # Memory usage
>
> * Our memory usage is pretty bad in a lot of areas.
> * Sharing data across processes (fork-booster).
> * Similar but better sharing achievable through qmlcompiler
> * Benefits could be achieved by carefully allocating all write-once
> expected-to-be-shared data contiguously
> * Dropping CPU-side image data. We recently added QSG_TRANSIENT_IMAGES
> to Qt Quick.
>
> # Item creation performance
>
> * We have pretty good benchmarks for this :)
> * But we (constantly!) keep regressing
> * This week's example: two separate regressions in QQuickImageBase
> (high DPI, and automatic transform)
> * Image items per frame dropped from ~550/frame to 492/frame -- ~10%
> regression
> * Allocations for 5000 images increased by 41 MB
> * This is just one case - it's nobody's "fault", there's just nobody
> taking care of it.
> * Ideally need to do some work on automating runs of it (more on this
> later)
> * What is a good target?
> * 1000 items / frame in qmlbench on a modern desktop / laptop (a
> MacBook Pro is one such)
> * 100 items / frame on mobile and embedded
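For automated runs, the regression figure above is just a relative drop against a baseline. A minimal sketch follows. The function names and the tolerance parameter are assumptions for illustration; nothing like this is claimed to exist in qmlbench.

```cpp
// Fraction by which `current` is lower than `baseline`
// (0.10 means a 10% drop). Assumes baseline > 0.
double relativeDrop(double baseline, double current)
{
    return (baseline - current) / baseline;
}

// True if the drop exceeds `tolerance`, an assumed threshold such as 0.05
// that would need tuning against the run-to-run noise of the benchmark.
bool isRegression(double baseline, double current, double tolerance)
{
    return relativeDrop(baseline, current) > tolerance;
}
```

With the numbers from this week's example, `relativeDrop(550, 492)` is 58/550, or roughly 0.105, which matches the ~10% quoted above.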
>
> # Binding performance
>
> I have no idea whether or not we have good benchmarks for this.
>
> * Probably the creation ones cover the simple cases, but the more
> advanced ones probably need separate coverage.
> * Ideally we also need to monitor the impact of things changing in
> bindings (& multiple things changing at once, and so on?)
> * Help creating benchmarks welcome!
>
> # Graphics performance
>
> Are we good here? I think we're close at least.
> * Clipping has been brought up as an issue; 'simplerenderer' solves that
> * Poor batching gives worse results; 'simplerenderer' solves that too,
> but be mindful that simplerenderer also has ~2-3x worse performance
> overall.
>
> # Recommendations for working on QtQuick
>
> * Avoid structures like QHash unless you are sure they are needed (they
> have a heavy cost, and for <1k items or so, QVector, or std::vector, are
> often a better choice)
> * Avoid signal connections
> * Virtuals instead might be a good choice
> * If you really have to use them, use qmlobject_connect
> * See prior art:
> * qtdeclarative: 0de680c8e8fab36e386dca35e5008ffaa27e8ef6
> * qtdeclarative: 7568922fa240e6e9440e9c6e93bf8ec00c06ec17
> * Memory compactness:
> * Don't introduce padding holes
> * Don't increase the size of frequently allocated things (nodes,
> items) "accidentally" without careful consideration
> * Use a lazily-allocated ExtraData for things that aren't needed
> often
> * Consider your data types carefully - don't use a 64 bit int for an
> "on/off" toggle
> * Consider custom data structures & allocation (page allocation of the
> shadow nodes was a big win)
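To make the memory compactness points concrete, here is a small sketch in plain C++. All type and member names are made up for illustration and this is not Qt's actual code. The first part shows how member order can create padding holes, the second shows a lazily allocated ExtraData block for rarely used properties.

```cpp
#include <memory>

// Members ordered carelessly: on common 64-bit ABIs the compiler inserts
// padding after each bool so that the double stays 8-byte aligned.
struct PaddedNode {
    bool visible;
    double opacity;
    bool dirty;
};

// Same members, largest first: the two bools sit next to each other.
struct CompactNode {
    double opacity;
    bool visible;
    bool dirty;
};

static_assert(sizeof(CompactNode) <= sizeof(PaddedNode),
              "reordering members should not make the struct larger");

// Rarely used properties live in a separately allocated block that is
// only created on first write. Items that never touch them pay for one
// pointer instead of the full set of members.
struct Item {
    struct ExtraData {
        double rotation = 0.0;
        double scale = 1.0;
    };

    double x = 0.0;
    double y = 0.0;

    // Read path: no allocation, falls back to the default value.
    double rotation() const { return extra ? extra->rotation : 0.0; }

    // Write path: allocates the extra block on first use only.
    void setRotation(double r) { ensureExtra().rotation = r; }

    bool hasExtra() const { return static_cast<bool>(extra); }

private:
    ExtraData &ensureExtra()
    {
        if (!extra)
            extra = std::make_unique<ExtraData>();
        return *extra;
    }

    std::unique_ptr<ExtraData> extra;
};
```

On a typical x86-64 build the two node structs come out at 24 and 16 bytes, but the exact numbers depend on the ABI, so the sketch only asserts the ordering.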
>
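For the container recommendation, a flat vector with a linear scan is often all a small lookup table needs. The sketch below uses std::vector only. `SmallIntMap` is a hypothetical type for illustration, not a Qt class, and the point at which a hash becomes faster should be measured rather than assumed.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// A tiny map kept as a flat vector of key/value pairs. Lookup is a linear
// scan: O(n), but contiguous in memory and with no per-entry allocation.
class SmallIntMap
{
public:
    // Inserts a new entry or overwrites the value of an existing key.
    void insert(int key, int value)
    {
        auto it = find(key);
        if (it != m_entries.end())
            it->second = value;
        else
            m_entries.emplace_back(key, value);
    }

    // Returns a pointer to the value, or nullptr if the key is absent.
    const int *value(int key) const
    {
        auto it = std::find_if(m_entries.begin(), m_entries.end(),
                               [key](const std::pair<int, int> &e) { return e.first == key; });
        return it != m_entries.end() ? &it->second : nullptr;
    }

    std::size_t size() const { return m_entries.size(); }

private:
    std::vector<std::pair<int, int>>::iterator find(int key)
    {
        return std::find_if(m_entries.begin(), m_entries.end(),
                            [key](const std::pair<int, int> &e) { return e.first == key; });
    }

    std::vector<std::pair<int, int>> m_entries;
};
```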
> # Specific Items?
>
> * Rectangle implementation could be improved quite a bit (Gunnar?)
> * Text node improvements (Eskil?)
> * Hash of shadow nodes in the batched renderer is a large problem
> * https://codereview.qt-project.org/#/c/97708/
> * Delay compilation (or whatever it is) of inline components until
> they are used. Right now, these can have a massive impact on
> performance/memory unless moved to external files, e.g. "Component {
> Dialog { ... foo ... } }" that is only used when a button is pressed. It
> may never be pressed.
> * "Don't use" classes, like SpriteSequence :)
> * In general, small items take up huge amounts of memory. Repeater {
> model: 500; Rectangle { width: 100; height: 100; radius: 10 } } and you
> have 1 MB or something :)
> * QObject -> QQmlData / QQuickItem / QQuickItemPrivate etc all adds up
> on individual items - but what can we do to fix that?
> * The recent introduction of 'padding' in the box model might need a
> second look to make sure it isn't increasing item sizes in common cases.
> 4 extra doubles is quite a large addition to item sizes.
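As a rough illustration of why four extra doubles matter, the arithmetic can be written down directly. This sketch assumes the four values are stored inline in every item, which is the worst case and not necessarily how the padding change is implemented. The names are made up for the example.

```cpp
#include <cstddef>

// Extra bytes per item if four doubles are stored directly in every item.
constexpr std::size_t paddingBytesPerItem = 4 * sizeof(double);

// Total extra bytes across `itemCount` items.
constexpr std::size_t extraBytes(std::size_t itemCount)
{
    return itemCount * paddingBytesPerItem;
}
```

Where a double is 8 bytes, that is 32 bytes per item, so `extraBytes(5000)` comes to 160000 bytes for a scene the size of the image benchmark above.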