[Development] On the reliability of CI
Simon Hausmann
simon.hausmann at digia.com
Thu Oct 25 09:57:14 CEST 2012
On Thursday, October 25, 2012 02:32:49 PM Lincoln Ramsay wrote:
> On 25/10/12 13:00, Rohan McGovern wrote:
> > True, there used to be Nokia employees reading every failure report and
> > chasing up apparently unstable tests, either trying to fix the tests, or
> > acknowledge them via bug reports and marking them insignificant.
> > Those people are gone and the test results are likely to be less stable
> > until they're replaced
>
> This.
>
> The QA guys in Brisbane did an awesome job that was perhaps not so
> obvious or visible to people outside of the office. Not only did they
> keep the CI system running and stable, they poked, prodded and tweaked
> the Qt product so that it could pass through the CI system quickly
> (raising bugs as appropriate when tests were broken or flaky).
+1
I'd like to express my support for what Lincoln and Rohan said and also stress
that Rohan (as well as Sergio, Janne, etc.) are doing an absolutely
outstanding job in helping to keep things going (despite management
directives, which I hope are a thing of the past).
I've had the "pleasure" of helping to nurse qt5.git integrations in
the past weeks and I've found that most of the time failing integrations are
the result of bugs in our code. Sometimes sloppiness, sometimes hard-to-find
bugs.
I invite everyone in the community who is annoyed with the CI system to help
improve it. It doesn't require any special network access (trust me, I don't
have such access right now). Rohan has done a great job in making sure that as
much information as possible is publicly accessible, including extensive
build logs and the source code of the scripts that power the entire system.
> I'm pretty sure there's someone at Digia ready to take over maintenance
> of the CI system. However, there isn't (to my knowledge) anyone ready to
> take on the task of keeping Qt in a state that can pass through the CI
> system. If nobody steps up to take on this responsibility then it'll
> fall on everyone to ensure their stuff is getting through CI.
One approach we could consider is what's called "gardening" in WebKit:
introduce a roster of people on duty who can help to nurse things and push
them through the CI system. It's something anyone could help with,
regardless of their employment.
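To make the roster idea concrete, here is a minimal sketch in Python of a
round-robin rotation. It assumes weekly shifts; the function names and the
volunteer names are made up for illustration and say nothing about how WebKit
actually organizes its gardening.

```python
from datetime import date, timedelta
from itertools import cycle


def build_roster(gardeners, start, weeks):
    """Assign one gardener per week, round-robin, beginning on `start`."""
    if not gardeners:
        raise ValueError("need at least one gardener")
    rotation = cycle(gardeners)
    return [(start + timedelta(weeks=i), next(rotation)) for i in range(weeks)]


def on_duty(roster, day):
    """Return whoever is on duty on `day`, or None if `day` is not covered."""
    for week_start, name in roster:
        if week_start <= day < week_start + timedelta(weeks=1):
            return name
    return None


if __name__ == "__main__":
    # Hypothetical volunteers; the names are placeholders.
    roster = build_roster(["alice", "bob", "carol"], date(2012, 10, 29), 4)
    for week_start, name in roster:
        print(week_start.isoformat(), name)
```

Anyone could be added to the list of gardeners; the rotation simply wraps
around once everybody has had a turn.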
Perhaps in an ideal world something like that wouldn't be needed. But I have
the strong feeling that even if we had a super fast CI system that allowed
building and auto-testing the majority of individual commits separately
within, say, 10 minutes, we'd still have a fair amount of those subtle issues
where an innocent change in one module makes a test in another, seemingly
unrelated module flaky, breaking, say, a qt5.git integration. I'm afraid that
with a code base of this size we have to accept a certain amount of this.
However, that should in no way stop us from investing time in improving the CI
system as is, i.e. trying to make it faster and more reliable. I hope that the
transition to Jenkins will make it easier to develop the system itself from
the "outside", i.e. to add experimental nodes to try out new approaches to
incremental builds, etc.
It might even be a fun "hackathon" for the next Qt Contributors' Summit.
Simon