[Development] Qt CI reliability

Sean Harmer sean.harmer at kdab.com
Tue May 3 15:54:50 CEST 2016


On Monday 02 May 2016 11:14:24 Jędrzej Nowacki wrote:
> On Saturday 30 of April 2016 20:26:20 Sean Harmer wrote:
> > Hi,
> 
> Hi,
> 
> > after yet another 5 hour wait just to be greeted with yet another random
> > failure with no build logs I'm getting really tired of the poor
> > reliability of the Qt CI system.
> 
> I'm sorry about that.
> 
> > https://codereview.qt-project.org/#/c/157590/
> > 
> > has been greeted with genuine failures, failures in qtdeclarative,
> > qtxmlpatterns, multiple random failures in qt3d despite being a simple
> > change which I suspect are due to issues on one or more CI nodes.
> 
> I scanned through the failures and it seems that you had very bad luck. I
> know CI should not be about luck and therefore I'm really sorry about that.

No need for you to apologize personally, I'm complaining about the policy to 
not have 24x7 support. I don't blame any individual and I know it's still 
improving on the technical front.

> 
> You tried to stage the change 7 times
> 1. Failed to compile qt3d (broken change
> https://codereview.qt-project.org/#/c/157593/)

Yup, such changes are part of normal development, and are expected. These are 
not an issue as long as the CI fails in a timely manner.

A *very* useful feature would be the ability to trigger a test build of a 
change on a particular configuration, for cases where we don't have ready 
access to that configuration locally.

> 2. Looks like a network connectivity failure, logs were not flushed as they
> should, so you can blame CI
> 3. Blame CI, we failed to acquire a free machine for 5h, I will look at that
> later
> 4. Failed to compile qt3d (broken change
> https://codereview.qt-project.org/#/c/157590/)
> 5. Failed to compile qt3d (broken change
> https://codereview.qt-project.org/#/c/157590/), same as 4
> 6. Looks like a network connectivity failure, logs were not flushed as they
> should, so you can blame CI, same as 2

Right, any time the link to the build logs points to nothing, I'm suspicious 
of the CI.

> 7. Passed
> 
> I will ask IT about the network; it seems that a network interface was
> re-configured during the CI run and DHCP assigned a different IP. It should
> not happen (TM)

Yes, that sort of thing should be done in a specified out-of-hours window, 
after stopping the CI master from dispatching jobs to the nodes.

> 
> Rule of thumb: if the logs show broken compilation, it is a real problem,
> so don't blame CI. There are three main causes that I'm aware of (sorted
> by probability):
> 1. One of changes being integrated broke the compilation

Fine and expected and, with timely failures, not an issue.

> 2. One of module dependencies broke source compatibility

This is very rarely an issue, at least for Qt 3D.

> 3. There was an untested template update (this reason will almost disappear
> soon)

Do you mean a VM template? If so, then yes, that's again something that should 
ideally be verified before deployment.

The other contributing factor is infrastructure: full disks, network outages, 
reconfigurations etc. These should be monitored and acted upon and, where 
possible, processes changed to avoid these situations.

> *. There was huge radiation in Finland, but that you would know from the
> news ;-)

:)

Anyway, thank you for looking into the issues here.

Cheers,

Sean
-- 
Dr Sean Harmer | sean.harmer at kdab.com | Managing Director UK
KDAB (UK) Ltd, a KDAB Group company
Tel. +44 (0)1625 809908; Sweden (HQ) +46-563-540090
Mobile: +44 (0)7545 140604
KDAB - Qt Experts


