[Development] Proposal: New branch model

Mon Jan 28 11:11:58 CET 2019

On 25.01.2019 at 11:08, Lars Knoll wrote:
>> The CI problem comes from the fact that if we have a high rate of
>> stages to qtbase/dev, we at some point get into a deadlock situation,
>> even if we disregard any flakiness in the system. That’s because
>> higher rates imply that more changes are tested together. This in
>> turn increases the risk of rejection of all changes because of one
>> bad change in the set. So the current system doesn’t scale and
>> basically rate limits the amount of changes going into a branch (or
>> worse, we end up getting traffic jams where the rate actually
>> goes down to zero).

My working guess at what the present system does is that it piles up a
staging branch with everything that gets staged while an integration is
running; when one integration completes (with maybe some modest delay if
the staging branch is short), the staging branch gets its turn to attempt
to integrate (possibly via a rebase onto a freshly-integrated branch).
I hope someone who knows the actual process can describe it in this
thread.

If that's reasonably close to true, then we shall indeed get many commits
piling up in each staging branch, increasing the likelihood of failure
in the integration attempt.  We could mitigate that in various ways by
tweaking the process.
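
To put rough numbers on that, here is a toy model in Python.  It assumes
(purely for illustration, this is not a measurement of Coin) that each
staged change independently breaks the integration with the same
probability, and it ignores flakiness; the function name is my own.

```python
def batch_pass_probability(p_bad: float, batch_size: int) -> float:
    """Chance that an integration of batch_size changes passes.

    Assumption: each change independently breaks the integration with
    probability p_bad, and one bad change rejects the whole batch.
    """
    return (1.0 - p_bad) ** batch_size


if __name__ == "__main__":
    # Show how quickly the pass rate falls as the staging branch grows,
    # for a hypothetical 5% chance that any single change is bad.
    for n in (1, 5, 20, 50):
        print(n, round(batch_pass_probability(0.05, n), 3))
```

The point is only the shape of the curve: under these assumptions the
pass rate falls geometrically with the length of the staging branch.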

In particular, we could cap the length of staging branches (perhaps with
a bit of flexibility to let in commits with the same owner as some
already in the branch, so that groups of related changes don't get split
up); once a staging branch hits the cap, we start a fresh staging
branch.  This gives us a queue of staging branches, rather than just
one, each waiting to be integrated.
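
A minimal sketch of such a capped queue, in Python.  All the names here
(Change, StagingBranch, StagingQueue, cap, slack) are hypothetical; I'm
not describing anything Coin or Gerrit actually provides, and "same
owner" stands in crudely for "related changes".

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Change:
    change_id: str
    owner: str


@dataclass
class StagingBranch:
    changes: list = field(default_factory=list)


class StagingQueue:
    """Queue of staging branches, each capped at `cap` changes.

    A change whose owner already has a change in the newest branch may
    overshoot the cap by up to `slack`, so related changes stay together.
    """

    def __init__(self, cap: int, slack: int = 2):
        self.cap = cap
        self.slack = slack
        self.branches = deque([StagingBranch()])

    def stage(self, change: Change) -> None:
        tip = self.branches[-1]
        size = len(tip.changes)
        same_owner = any(c.owner == change.owner for c in tip.changes)
        if size < self.cap or (same_owner and size < self.cap + self.slack):
            tip.changes.append(change)
        else:
            # The newest branch is full: start a fresh one behind it.
            self.branches.append(StagingBranch([change]))

    def next_for_integration(self):
        """Hand the oldest non-empty staging branch to the CI, if any."""
        if self.branches and self.branches[0].changes:
            branch = self.branches.popleft()
            if not self.branches:
                self.branches.append(StagingBranch())
            return branch
        return None
```

Each completed integration would then call next_for_integration() to
pick up the oldest waiting branch, instead of taking everything staged
so far in one lump.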

>> To me this means we need to seriously rethink that part of our CI
>> system, and ideally test changes (or patch series) individually and
>> in parallel. So maybe we should adjust our CI system that way:
>>
>> * test changes (or patch series) individually
>> * If they pass CI merge or cherry-pick them into the branch
>> * reject only if the merge/cherry-pick gives conflicts

We could equally run several integrations in parallel and select one of
those that succeed (probably the one that entered the staging queue
earliest) to be the new tip; all others that succeed, plus any new
staging branches grown in the interval, get rebased onto that and tested
in parallel again.  That'll be "wasteful" of Coin resources in so far as
some branches pass repeatedly before getting accepted, but it'll avoid
the small risk you describe below.  The "speculative" integrations being
run in parallel with the "will win if several succeed" one make the most
of Coin having the capacity to run several in parallel - assuming it does.
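
The selection step at the end of each round could look roughly like
this (Python; the function name and the shape of its arguments are my
invention for illustration, and I've assumed that integrations which
fail simply go back to their owners):

```python
def settle_round(results, newly_staged):
    """Pick the new tip after a round of parallel integrations.

    results maps a staging-branch name to (queue_position, passed), where
    a lower queue_position means it entered the staging queue earlier.
    Returns (winner, to_retest): the winner becomes the new tip; every
    other success, plus branches staged in the interval, must be rebased
    onto it and tested again.  Failed branches are dropped (assumption).
    """
    passed = sorted((pos, name) for name, (pos, ok) in results.items() if ok)
    if not passed:
        return None, list(newly_staged)
    winner = passed[0][1]
    to_retest = [name for _, name in passed[1:]] + list(newly_staged)
    return winner, to_retest
```

The successes that lose out to an earlier one are where the "wasteful"
re-testing comes from: they show up again in to_retest.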

>> This adds a very small risk that two parallel changes don’t conflict
>> during the merge/cherry-pick process, but cause a test regression
>> together. To help with that, we can simply run a regular status check
>> on the repo. If this happens the repo will be blocked for further
>> testing until someone submits a fix or reverts an offending change,
>> which is acceptable.

When those events happen, they're going to gum up the whole works until
they're sorted out; they require manual intervention, Which Is Bad.

Robert Loehning (25 January 2019 17:49) replied:
> Could that be solved by testing the combinations of changes again?
>
> * test changes (or patch series) individually
> * If they pass CI merge or cherry-pick them into some local branch
> * reject if the merge/cherry-pick gives conflicts
> * when time period x has passed or the local branch contains y changes,
> test the local branch
>    good: push the local branch to the public one
>    bad:  repeat step four with a subset of the changes it had before
>
> Assuming that y is significantly greater than 1, the added overhead for
> one more test run seems relatively small to me.

IIUC, you're describing a two-stage integration process; test many
staging branches in parallel; accumulate successes; combine those and
re-test; if good, keep.  There shall be new staging branches coming out
of the sausage machine while the earlier composite is going.  We have to
work out what to do with those.  If the composite fails, these fresh
successes can be combined and tested just as the earlier one was; but,
if the earlier composite passes, we need to rebase the integrations that
have passed while it was tested.  However, all these have passed
previously, so we have fair confidence in them; so we can combine them
all together on the prior composite and set about testing this as a
second-stage composite integration.  So I think that has a good chance
of working well.

A note on merge vs rebase here: when merging several branches that have
passed first-integration, a conflict excludes a whole branch, though
it's probably caused by one or few of the commits in the branch; whereas
rebasing lets you detect which individual commits cause the conflicts,
so that only these get left out.  It also gives you a linear history.
Given that each staging branch is typically a jumble of mostly unrelated
commits (albeit there may be a few related commits in it), these
branches have no "semantic" relevance, so aren't valuable to keep in the
history.  So I'd encourage the use of rebase (discarding individual
commits when they conflict) rather than actual merges (discarding whole
branches on conflict); either way, it's what I mean by "combine" above.
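
To illustrate the difference, a toy comparison in Python.  Commits are
modelled as (name, file, branch) tuples and "conflict" just means "some
commit from another branch already touched this file"; that conflict
rule, and all the names, are assumptions for the sketch, not how git
decides.  It also ignores that dropping one commit can make later
commits in the same branch fail to apply.

```python
def touches_same_file(applied, commit):
    """Toy conflict rule: same file already touched by another branch."""
    _, path, branch = commit
    return any(p == path and b != branch for _, p, b in applied)


def combine_by_rebase(branches, conflicts=touches_same_file):
    """Replay commits one by one, leaving out only those that conflict."""
    applied, dropped = [], []
    for branch in branches:
        for commit in branch:
            if conflicts(applied, commit):
                dropped.append(commit)
            else:
                applied.append(commit)
    return applied, dropped


def combine_by_merge(branches, conflicts=touches_same_file):
    """Take each branch whole, or not at all if any commit conflicts."""
    applied, dropped = [], []
    for branch in branches:
        tentative = list(applied)
        clean = True
        for commit in branch:
            if conflicts(tentative, commit):
                clean = False
                break
            tentative.append(commit)
        if clean:
            applied = tentative
        else:
            dropped.extend(branch)
    return applied, dropped


if __name__ == "__main__":
    a = [("a1", "x.cpp", "A"), ("a2", "y.cpp", "A")]
    b = [("b1", "x.cpp", "B"), ("b2", "z.cpp", "B")]
    print(combine_by_rebase([a, b]))  # only b1 is left out
    print(combine_by_merge([a, b]))   # all of branch B is left out
```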

Suppose Coin is capable of running N+1 integrations in parallel; then
we'll typically be doing one second-integration while (up to) N
first-integrations run in parallel.  If the second-stage one wins, the
(up to N) first-stage successes will be combined together on top of it;
otherwise, they just get combined as they are; either way, they form the
new second-stage integration.  This typically has a good chance of
success (for the same reason that Lars's "small risk" is indeed small);
but, when it fails, we haven't got ourselves into a broken state that
gums up the works until manual intervention fixes the problem.
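
One round of that, sketched in Python.  Branches are plain lists of
commit names, the function name is hypothetical, and simple
concatenation stands in for the "combine" step (rebase, dropping
conflicting commits); what happens to the commits of a failed
second-stage integration is left out.

```python
def next_round(tip, second_stage, second_passed, first_stage_successes):
    """Advance the tip and build the next second-stage candidate.

    tip: commits on the public branch (known good).
    second_stage: tip plus the composite that was just tested.
    first_stage_successes: branches that passed first-stage testing
    while the second-stage integration was running.
    """
    # The tip only ever moves to a state that was tested as a whole.
    new_tip = list(second_stage) if second_passed else list(tip)
    candidate = new_tip + [c for branch in first_stage_successes for c in branch]
    return new_tip, candidate
```

Either way the public tip is never left in an untested state, which is
the property that matters here.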

So, in effect, Robert's model is Lars's, except that one of the N+1
integrations it could have run in parallel is given up in order to run
his "regular status check" immediately on the result of each merge, with
the whole merge reverted if the check fails.  We expect such failures to
be rare, so this costs us one parallel integration in exchange for
forestalling Lars's "small risk" and ensuring the tip is always good
(Which Is Important).

So, modulo using rebase rather than merge to combine the first-stage
successes, I like Robert's model better,

	Eddy.