[Development] On the reliability of CI

Shawn Rutledge shawn.rutledge at digia.com
Thu Oct 25 11:15:14 CEST 2012

On Thu, Oct 25, 2012 at 01:00:47PM +1000, Rohan McGovern wrote:
> Replying here to some comments on IRC, since I'm rarely online at the
> same time as the others, but I don't want to let all the comments go
> unanswered...
> > steveire> [06:32:44] CI is seriously depressing. For the last 24 hours
> > there has been one successful merge. Many of the others are failing
> > because of something in network.

Personally I think the fundamental thing which CI could do better is to
triage failures.  Often patches get staged in large batches, so when the
whole batch fails it's very easy to take a quick look at the failure,
think "that can't possibly be because of what I did", and leave it to 
someone else who presumably understands his own code better.  But maybe
that person also takes a while to realize that his code was really the cause.

I think when a test fails, the CI system should try to break down the 
patch set in some way.  For example it could divide the patch set in half, 
arbitrarily, and see if half of them will integrate successfully, then
the other half, and continue this recursively until the one bad patch is
found, or at least a smaller subset.
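
A minimal sketch of the halving idea in Python.  The names here are
hypothetical: `integrates` stands in for whatever the CI system does to
build and test a list of patches on top of the current branch, and nothing
below reflects how the real CI is implemented.  It assumes each subset can
be tested on its own.

```python
from typing import Callable, List, Sequence

Patch = str  # hypothetical: a patch is identified by a change ID


def find_bad_subsets(
    patches: Sequence[Patch],
    integrates: Callable[[Sequence[Patch]], bool],
) -> List[List[Patch]]:
    """Return the smallest failing subsets found by recursive halving."""
    if integrates(patches):
        return []  # this subset is fine, nothing to report
    if len(patches) <= 1:
        return [list(patches)]  # cannot split further: this is the bad patch
    mid = len(patches) // 2
    first, second = patches[:mid], patches[mid:]
    bad = find_bad_subsets(first, integrates) + find_bad_subsets(second, integrates)
    # If both halves pass alone, the failure needs patches from both halves,
    # so the best we can report is this subset as a whole.
    return bad or [list(patches)]
```

For a batch of five patches where only "c" breaks the tests, this would
report [["c"]] after testing four subsets.  When the failure only appears
with two patches combined, it falls back to reporting the smallest subset
that still fails, which is the "at least a smaller subset" case.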

There might be cases when a patch set needs to be tested together though,
because they depend on each other, or because reproducing a bug requires
the whole set of patches.  So maybe we need a "keep patch set together"
flag to disable the auto-triaging.
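
One possible reading of such a flag, sketched in Python: patches that carry
the same "keep together" label are collected into a single unit, and any
automatic triaging would then split the batch only between units, never
inside one.  The (patch, label) pairs and the function name are assumptions
made for illustration, not an existing CI feature.

```python
from typing import Dict, Hashable, List, Optional, Sequence, Tuple

# Hypothetical input: (patch ID, keep-together label or None if independent)
StagedPatch = Tuple[str, Optional[Hashable]]


def group_into_units(staged: Sequence[StagedPatch]) -> List[List[str]]:
    """Group patches so that each keep-together set forms one unit."""
    units: List[List[str]] = []
    unit_index: Dict[Hashable, int] = {}  # label -> position in units
    for patch_id, label in staged:
        if label is None:
            units.append([patch_id])  # independent patch: its own unit
        elif label in unit_index:
            units[unit_index[label]].append(patch_id)  # join existing set
        else:
            unit_index[label] = len(units)
            units.append([patch_id])  # first patch of a new set
    return units
```

A unit that fails would be reported as a whole, leaving it to the owners of
that patch set to work out which change is at fault.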

At previous jobs I've seen various more or less unpleasant social regimes
to prevent "breaking the build", but I didn't like any of them, and they
are not amenable to distributed projects anyway.  For example, the 
build master does a build every day, or every Friday, and personally 
nags people if it fails (that was the mid-90s, before continuous
integration).  Later with CI, maybe you have to pay a fine when you break
the build, or you have to wear a rubber chicken around your neck for a day.
Or maybe the rubber chicken is used as a token: you can only integrate a
set of patches if you possess the chicken, you must control which
patches are in the batch (ensure you understand them, or at least
understand that they are definitely independent), and you cannot pass
on the chicken to someone else until all the tests pass.  Probably
most commercial development is done under some such brute-force
regime.  But this is a technical problem, so it seems like it should have
a technical solution.  I can only imagine, for example, that Google
has a better system for internal development; I just don't know what it is.
