[Development] Usage of forcesuccess and qt.tests.insignificant in CI branches

Mon Dec 17 13:17:55 CET 2012

Hi,

I have some concerns about a change that was recently introduced to the
CI testconfigs:

    https://codereview.qt-project.org/42954 
    https://codereview.qt-project.org/42955 

In short, the question is whether forcesuccess and
qt.tests.insignificant should be configured per CI branch (dev, stable,
release) or per CI repo.

Sergio wants them per branch, and I can understand the rationale he
gave. As far as I can see, the main benefit of this approach is that it
is easy to remove these properties branch by branch while making sure CI
in any branch won't be blocked by an "incorrect" testconfig
configuration. An example of an "incorrect" testconfig would be
something like:

tst_bad is fixed in the release branch, and the CI config where it was
broken is made enforcing for the whole repository. If that fix has not
yet been merged into the stable or dev branch, CI in those branches
would fail.

I completely understand this, and the rationale that we don't want to
block "good change-sets" from integrating into dev/stable before the fix
from the release branch has propagated to stable and dev. However, from
a QA point of view I see this as the main problem with branch-based CI
properties.

Let me explain my QA viewpoint:

1. One reason for introducing branches was to make Qt stabilization easier:

For me this also means that bug fixes (and thus also fixes to broken
auto tests) should be *primarily* introduced in the most stable branch,
i.e. release. Of course there can be situations where a fix can be made,
let's say, only in dev due to commit policy, but that should be quite
rare.

2. If something is working in stable, I see no reason why it should
be allowed to fail in the release branch.

Since we want to stabilize Qt for release, I would not like to allow a
different, and especially not a less strict, configuration in the
release branch CI.

But from a QA point of view I think this should also hold in the other
direction, i.e. release -> stable. Why? If we allow less strict CI in
the stable branch than in the release branch, it can happen that a bug
is fixed in the release branch and, before the fix reaches the stable
branch, a new regression is introduced that makes the recently fixed
auto test fail again in stable and dev. This means we need to fix the
problem again for the next release. To avoid this I would like to have a
CI policy where anything that has been made enforcing in release is also
enforcing in stable, and preferably also in dev. This means that once
something is fixed in release, someone needs to merge the fix to stable
and dev as well; otherwise CI is likely to block new changes, because
the fixed test still fails there under the enforcing CI config. If a
test still fails in the stable/dev branch after the merge, a regression
compared to the release branch has been introduced and it needs to be
fixed.
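
The policy above could be checked mechanically. The following is only a
sketch in Python: the flags dictionary, the stage names and the function
less_strict_than_release are made up for illustration and do not reflect
how the CI properties are actually stored:

```python
# Hypothetical snapshot, not the real testconfig layout: for each branch,
# the (stage, property) pairs that are currently set, i.e. non-enforcing.
flags = {
    "release": {("stage_a", "qt.tests.insignificant")},
    "stable": {
        ("stage_a", "qt.tests.insignificant"),
        ("stage_b", "forcesuccess"),
    },
    "dev": {
        ("stage_a", "qt.tests.insignificant"),
        ("stage_b", "forcesuccess"),
        ("stage_c", "qt.tests.insignificant"),
    },
}


def less_strict_than_release(flags):
    """Map each non-release branch to the flags it sets that release does not."""
    release = flags["release"]
    return {
        branch: sorted(branch_flags - release)
        for branch, branch_flags in flags.items()
        if branch != "release" and branch_flags - release
    }


for branch, extra in less_strict_than_release(flags).items():
    print(branch, extra)
```

Every pair reported here is a place where stable or dev is less strict
than release. A branch that is stricter than release sets fewer flags
and is therefore not reported.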

I understand that this kind of CI policy will cause some unnecessary
CI failures, but on the other hand I see it as the only way to increase
Qt quality in the long run.

3. Something is fixed only in the dev branch due to commit policy

In this case I'm ready to make an exception and allow a stricter CI
policy for the dev branch only.

Summary:

I would like to have the forcesuccess and qt.tests.insignificant
properties identical for all branches at the CI repo level, at the cost
of possibly delayed integration of "good change-sets" into the dev and
stable branches, and with the benefit that no regressions covered by
auto tests will exist in subsequent releases. Now someone will probably
suggest that we don't need to make development in the dev and stable
branches harder with this kind of CI policy, since we can fix the
regressions introduced during development in dev and stable when we are
making a release (the CI config for the release branch is not made less
strict between releases). While this would work in theory, I don't
believe it will work in practice. Bugs should always be detected as
early as possible.

Comments?

PS: There are already 126 'forcesuccess' flags in the CI properties
(meaning 126 CI stages across all CI projects are essentially
meaningless), and there are 157 qt.tests.insignificant flags, meaning
that auto tests are essentially meaningless in 157 CI stages. The
purpose of this mail is to suggest a concrete way of reducing these
numbers in the long run.
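
If we want to follow these numbers over time, a small script could
count them. This is only a sketch in Python over an in-memory snapshot;
the stage names and the count_flags helper are made up, and reading the
real CI properties is left out because it depends on the testconfig
layout:

```python
# Hypothetical snapshot: stage name -> set of properties set for that stage.
stages = {
    "project_a/stage_1": {"forcesuccess"},
    "project_a/stage_2": {"qt.tests.insignificant"},
    "project_b/stage_1": {"forcesuccess", "qt.tests.insignificant"},
    "project_b/stage_2": set(),
}

TRACKED = ("forcesuccess", "qt.tests.insignificant")


def count_flags(stages):
    """Count how many stages set each tracked property."""
    counts = {name: 0 for name in TRACKED}
    for properties in stages.values():
        for name in TRACKED:
            if name in properties:
                counts[name] += 1
    return counts


print(count_flags(stages))
```

Running something like this after each testconfig change would show
whether the totals are actually going down.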

--
Janne Anttila
Senior Architect - Digia, Qt
Visit us on: http://qt.digia.com