[Development] Towards a Qt 5 beta

Fri Apr 13 16:41:20 CEST 2012

> Random question of the day: do you happen to have stats about how
> often those insignificant tests actually fail? That should help to
> figure out which ones are actually working, and therefore should not
> be marked as insignificant.

I'm glad you asked.  I spent some time yesterday and today doing that analysis for the 40-odd insignificant tests that had no Jira task associated.

I used the data from the last eleven CI runs for each module (times the eight configs for which autotests are executed).  That's a small sample, so the results aren't hugely significant statistically, but it's not easy to pull large numbers of build logs out of the CI system manually through the web interface.  (Can any of the CI folks tell me how to extract/search all the stored logs?)

Some of those tests appeared to be passing completely (I have pending commits to remove the insignificant markers for those), some were failing in a stable fashion, and some were partly unstable.  There were also mixtures of the above results for different platforms, e.g. stable results on one platform and unstable results on another.  The information I was able to glean about each test is in the Jira tasks I raised.
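
To make the three groups above concrete, here is a minimal sketch of how one test function's results on one platform could be classified.  The enum and function names are hypothetical and made up for illustration; this is not the tooling I actually used, just the rule of thumb written down as code.

```cpp
#include <vector>

// Hypothetical labels for this sketch; they mirror the groups described above.
enum class Stability { StablePass, StableFail, Unstable, NoData };

// Classify one test function on one platform from its recorded outcomes,
// one entry per CI run (true = passed, false = failed).
Stability classify(const std::vector<bool> &outcomes)
{
    if (outcomes.empty())
        return Stability::NoData;   // nothing recorded, so no conclusion

    bool sawPass = false;
    bool sawFail = false;
    for (bool passed : outcomes) {
        if (passed)
            sawPass = true;
        else
            sawFail = true;
    }

    if (sawPass && sawFail)
        return Stability::Unstable; // mixed results across runs
    return sawPass ? Stability::StablePass : Stability::StableFail;
}
```

Running this once per platform is what exposes the mixed cases, e.g. StablePass on one platform and Unstable on another.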

The same analysis needs to be done for the 70 or so insignificant tests that already had associated bug reports, but I'm not sure when I'll get the time for that as there are other tasks on my radar too.  Any offers of help are welcome.

When raising Jira tasks, I made recommendations about how to re-enable the portions of those tests that pass reliably (e.g. by marking unstable failures with QSKIP and stable failures with QEXPECT_FAIL).  I didn't find any tests so far where all of the test functions were unstable, meaning that none of those tests is completely worthless.  Most of the tests I looked at had <10% of the test cases failing (either stably or unstably), but the insignificant_test markers stop CI from using the >=90% good test cases in those tests to block any new regressions in the tested classes.
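
As a rough sketch of the recommendation above, the mapping from failure type to marker could be written like this.  The enum and both helper functions are hypothetical names invented for this example; only QSKIP and QEXPECT_FAIL are real Qt Test macros.

```cpp
#include <string>

// Hypothetical classification of a single test function.
enum class FailureKind { None, Stable, Unstable };

// Name of the Qt Test macro suggested for a test function:
// stable failures get QEXPECT_FAIL, unstable failures get QSKIP,
// and passing functions need no marker (empty string).
std::string suggestedMarker(FailureKind kind)
{
    switch (kind) {
    case FailureKind::Stable:
        return "QEXPECT_FAIL";
    case FailureKind::Unstable:
        return "QSKIP";
    case FailureKind::None:
        break;
    }
    return "";
}

// Percentage (rounded down) of test cases that would go back to
// blocking regressions once the failing ones are marked individually.
int passingPercent(int totalCases, int failingCases)
{
    if (totalCases <= 0 || failingCases < 0 || failingCases > totalCases)
        return 0;   // invalid input: report nothing rather than guess
    return 100 * (totalCases - failingCases) / totalCases;
}
```

Inside the test function itself, the markers would take the form QSKIP("reason") or QEXPECT_FAIL("", "reason", Continue) placed just before the failing check, assuming the Qt 5 forms of those macros.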

Of course, in the longer term, we should aim to fix the failing and unstable portions of those tests too, but just re-enabling the already-working portions of the insignificant tests would give us a good boost in test coverage.  I'm hopeful that by 5.0.0 (with help from the community) we can eliminate all of the insignificant_test markers in favour of more targeted mechanisms like QSKIP and QEXPECT_FAIL.

--
Jason