[Development] Container benchmark (was: HEADS UP: Don't use QList, use Q_DECLARE_TYPEINFO)

Gunnar Roth gunnar.roth at gmx.de
Sun Jul 12 16:57:39 CEST 2015


Hi Milian, hi Marc

> Am 12.07.2015 um 16:06 schrieb Milian Wolff <milian.wolff at kdab.com>:
> 
> On Sunday, July 12, 2015 04:58:01 PM Marc Mutz wrote:
>> On Sunday 12 July 2015 15:05:51 Gunnar Roth wrote:
>>> Hi Marc,
>>> 
>>>> Better provide a benchmark into which users can easily plug their own
>>>> types and that plots relative performance of various containers at
>>>> various sizes. Then they can use that on their platform, on their type,
>>>> with their access patterns to determine which container to choose.
>>> 
>>> I tried to write a container benchmark for testing qt , std and boost
>>> containers. I did that already a while ago, but had not collected enough
>>> energy to post about it until now . ;-) This is not including QList
>>> because the benchmark is made to test how fast i can lookup something in a
>>> container.
>>> 
>>> It can be found at https://github.com/gunrot/qtcontainerbench
>>> <https://github.com/gunrot/qtcontainerbench> I got irritating results from
>>> it , concerning all the praise i read on the qt mailings list for Vector
>>> instead of map or hash.
>> 
>> That's because your benchmark runs entirely in L1.
>> 
>> If you want to test containers of 10, 100, 1000 and 10000 elements each,
>> then make
>> - one run over 1×N containers of 10000 elements each,
>> - one run over 10×N containers of 1000 elements each,
>> - one run over 100×N containers of 100 elements each,
>> - one run over 1000×N containers of 10 elements each.
>> 
>> You need to use a working set size large enough to not hit L1 all the time,
>> because it's just not the usual case that a lookup table is completely in
>> L1. But you can start with a small working set. As you increase the working
>> set size, you should see how each individual test (at constant number of
>> elements) increases in runtime. And you will see that the contiguous
>> containers don't lose as much as the node-based ones.
> 
Ok, I understand your concern about the influence of the L1 cache; my L1 cache is 64 KB. But shouldn't a contiguous container like vector benefit even more from that than a non-contiguous one like map or hash?

The test with 1 container of 10000 items I can easily produce. I added a std::vector push_back-and-sort testcase.

The result is:
"QMap_insert          -- 10000       ","WalltimeMilliseconds",0.625,80,128
"QHash_insert         -- 10000       ","WalltimeMilliseconds",0.099609375,51,512
"stdmap_insert        -- 10000       ","WalltimeMilliseconds",0.4921875,63,128
"stdvector_pb_sort    -- 10000       ","WalltimeMilliseconds",2.4375,78,32
"stdunordered_insert  -- 10000       ","WalltimeMilliseconds",0.2109375,54,256
"qfasthash_insert     -- 10000       ","WalltimeMilliseconds",0.14453125,74,512
"boostunordered_inser -- 10000       ","WalltimeMilliseconds",0.140625,72,512
"QMap_find            -- 10000       ","WalltimeMilliseconds",0.65625,84,128
"QMap_constFind       -- 10000       ","WalltimeMilliseconds",0.5546875,71,128
"QHash_find           -- 10000       ","WalltimeMilliseconds",0.0830078125,85,1024
"QHash_constFind      -- 10000       ","WalltimeMilliseconds",0.05859375,60,1024
"QVector_lowerbound   -- 10000       ","WalltimeMilliseconds",0.5078125,65,128
"stdvector_lowerbound -- 10000       ","WalltimeMilliseconds",0.484375,62,128
"stdmap_find          -- 10000       ","WalltimeMilliseconds",0.5546875,71,128
"stdunordered_find    -- 10000       ","WalltimeMilliseconds",0.111328125,57,512
"qfasthash_find       -- 10000       ","WalltimeMilliseconds",0.162109375,83,512
"boostunordered_find  -- 10000       ","WalltimeMilliseconds",0.103515625,53,512

So here, when inserting 10000 elements, QHash is the fastest and std::vector (push_back + sort) the slowest, by a factor of about 25.

For looking up items, QHash constFind is the fastest, being faster by at least a factor of 2; against std::vector it wins by a factor of about 5.


> <snip>
> 
> In addition to that, you should additionally benchmark the setup cost for the 
> containers. I have often seen code that was slow, not because finding items in 
> it was slow, but rather because the lists where temporary and thus the setup 
> cost never amortized. Then, the cost of malloc & free is significant, often 
> orders of magnitude more expensive than actually finding the elements. There, 
> QVarLengthArray (if possible) or at least QVector/std::vector _really_ shines.
> 

Hmm, std::vector at least does not shine here.

Best regards,
Gunnar  Roth

