[Development] Container benchmark was HEADS UP: Don't use QList, use Q_DECLARE_TYPEINFO

Sun Jul 12 15:05:51 CEST 2015

Hi Marc,
> Better provide a benchmark into which users can easily plug their own types 
> and that plots relative performance of various containers at various sizes. 
> Then they can use that on their platform, on their type, with their access 
> patterns to determine which container to choose.
> 

I tried to write a container benchmark for testing qt , std and boost containers. I did that already a while ago,
but had not collected enough energy to post about it until now . ;-)
This is not including QList because the benchmark is made to test how fast i can lookup something in a container.

It can be found at https://github.com/gunrot/qtcontainerbench <https://github.com/gunrot/qtcontainerbench>
I got irritating results from it , concerning all the praise i read on the qt mailings list for Vector instead of map or hash.

Running my tests on my  Apple MacBook Pro Retina 13  <http://www.notebookcheck.com/Test-Apple-MacBook-Pro-Retina-13-Zoll-Late-2013.104688.0.html>Inch Notebook, having a 2.4 GHZ Core i5 4258U CPU, using OS X 10.10.4 with Xcode 6.4
and using Qt 5.5.0 Release.

i got these results for a double key and double value using 20 items:
	Type				WalltimeMilliseconds
QMap_find 				0.0001678466796875
QMap_constFind  			0.0001869201660156
QHash_find				0.0003662109375
QHash_constFind			0.0002479553222656
QVector_lowerbound		0.0001735687255859
stdvector_lowerbound		0.0001583099365234
stdmap_find				0.0001544952392578
stdunordered_find			0.0002250671386719
qfasthash_find				0.0005035400390625
boostunordered_find		0.0002021789550781

QFastHash is a class implemented by you , which i found at https://codereview.qt-project.org/#/c/105743/ <https://codereview.qt-project.org/#/c/105743/>

So the fastest here is std::map, with std::vector having nearly the same speed.
the slowest by more than 3 times than the fastest is QFastHash.

What is irritating that QMap_constFind is slower than QMap_find. 
The other irritating thing is that Qt classes lose here all the way. Even QVector is losing against std::vector and i used std:lowerbound in both cases 
( qlowerbound was also slower than std::lowerbound when i started making this benchmark)

For 5 Items the test shows these results:
	Type				WalltimeMilliseconds
QMap_find 				3.480911254883e-05
QMap_constFind  			3.910064697266e-05
QHash_find				9.1552734375e-05
QHash_constFind			7.057189941406e-05,
QVector_lowerbound		3.671646118164e-05
stdvector_lowerbound		3.385543823242e-05
stdmap_find				3.576278686523e-05
stdunordered_find			5.722045898438e-05
qfasthash_find				13.16070556641e-05
boostunordered_find		4.863739013672e-05

So for the Qmap anomaly is also apparent, but the winner is now std::vector followed closely by QMap::find and std::map. No reason to prefer it because it has not a actual find api and you need to keep  it sorted etc.

The clear looser by far is QFastHash again.

So lets now have a look at 100 items:

	Type				WalltimeMilliseconds
QMap_find 				0.001144409179688
QMap_constFind  			0.00128173828125
QHash_find				0.001617431640625
QHash_constFind			0.0013427734375
QVector_lowerbound		0.001495361328125
stdvector_lowerbound		0.0010986328125
stdmap_find				0.001083374023438
stdunordered_find			0.001129150390625
qfasthash_find				0.002716064453125
boostunordered_find		0.001007080078125

So here i see that now boost unordered is the fastest, closely follow by the former winner std::map and std::vector, Qmap and std::unorderd is nearby.
QFastHash gain is far worse than anything else. What was the exact reason for this class , Marc?
I also don’t see an advantage in using a vector.

Well, maybe you noticed i used a double as key. So what result do we expect, when we use an int64 as key and as value. I naively thought they should not be much different.
So lets have a look a the results for 20 items using a int64_t as key and value:

	Type				WalltimeMilliseconds
QMap_find 				0.0002021789550781
QMap_constFind  			0.0001430511474609
QHash_find				0.0001869201660156
QHash_constFind			0.0001411437988281
QVector_lowerbound		0,0001659393310547
stdvector_lowerbound		0.000152587890625
stdmap_find				0.0001506805419922
stdunordered_find			0.0002021789550781
qfasthash_find				0.0003166198730469
boostunordered_find		0.0002021789550781

Well this is a change. Now at least constFind it faster than find.
As before QFastHAsh is the slowest one, but QHash ::constFind is king now, but QMap constfind and std::map and std::vector are not that far away.	

So now look at the speed with 5 items:

	Type				WalltimeMilliseconds
QMap_find 				3.957748413086e-05
QMap_constFind  			3.862380981445e-05
QHash_find				4.673004150391e-05
QHash_constFind			4.005432128906e-05
QVector_lowerbound		3.528594970703e-05
stdvector_lowerbound		3.147125244141e-05
stdmap_find				3.862380981445e-05
stdunordered_find			5.531311035156e-05
qfasthash_find				8.678436279297e-05
boostunordered_find		5.531311035156e-05

std::vector wins here closely followed byQVector and QMap::constFind, but do you think the difference is worth choosing vector over map/qhash?
And with 20 items vector falls behind. ( Actually my tests show that vector  with 10 items falls behind map/qhash already)

And for 100 items we have this results:

	Type				WalltimeMilliseconds
QMap_find 				0.001144409179688
QMap_constFind  			0.00128173828125
QHash_find				0.001617431640625
QHash_constFind			0.0013427734375
QVector_lowerbound		0.001495361328125
stdvector_lowerbound		0.0010986328125
stdmap_find				0.001083374023438
stdunordered_find			0.001129150390625
qfasthash_find				0.002716064453125
boostunordered_find		0.001007080078125

Hmm QMap constFind is again slower than find. The winner here is by a tiny amount boost::unordered, on par are std::map and std::vector.

So from this results , i would choose std::map for a container of about 100 items where i need fast lookup. For 1000 and more , QHash is getting faster and faster but not for double as key, as suppose thats because the qHash() for double is not implemented well, as std::unordered and boost::unordered do a lot better job here.

But doing all these measurements with an int key and an int value changes the picture.

5       items QHash_constFind  3,433227539062e-05  stdmap_find	3,194808959961e-05 stdvector_lowerbound 3,147125244141e-05
7   items QHash_constFind  4,863739013672e-05  stdmap_find	5,197525024414e-05 stdvector_lowerbound 4,196166992188e-05
10  items QHash_constFind  6,675720214844e-05  stdmap_find	6,961822509766e-05 stdvector_lowerbound 6,866455078125e-05
20  items QHash_constFind  0.0001335144042969  stdmap_find	0.0001544952392578 stdvector_lowerbound 0.000152587890625
100 items QHash_constFind  0.0006332397460938  stdmap_find	0.001113891601562  stdvector_lowerbound 0.001068115234375

So for a simple int as key QHash is faster starting from 7 Items and gains speed by increasing items, being considerably faster at 100 items.

So what I learn here that there is:
1. is a hight dependency on the key type for a container being slower or faster in lookup time. 
2. Something with fast in its name can actually be lowest one by far.

> If you want to update the docs, as a first approximation, write that QVector 
> is _the_ default seqential (and, if it wasn't so awkward to use, also _the_ 
> default associative) container (even though it's not currently reflected in 
> the Qt API). All other containers (incl. QMap, QSet, ..., but maybe not QHash) 
> should only be used after very careful profiling and QList should not be used 
> by mere mortals at all.
> 

So what does my benchmark measure wrong? Because if you are right in saying QVector should be the default associative container, then my benchmark must be wrong.

Best regards,
Gunnar Roth

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.qt-project.org/pipermail/development/attachments/20150712/88cebfd2/attachment.html>