[Development] Proposal: New branch model

Wed Jan 23 21:42:35 CET 2019

Jedrzej Nowacki wrote:
>>  Advantages:
>>  - no waiting for merges, a fix can be used right away after
>>    integration
>>  - faster propagation of fixes targeting all branches, as there are
>>    no merges of merges

Alex Blasche (23 January 2019 18:09)
> This is pure speculation because you potentially triple (or worse) the
> amount of individual patches requiring a merge in gerrit when you
> consider that you want to at least merge to 5.9, 512, dev and qt6. I
> don't see this prediction come true.

Well, as soon as it hits dev, the patch is cherry-picked to every branch
that its footer says it belongs in.  Those branches where all goes well
see it one integration later.  Each branch that has a conflict needs
that resolved before we get to that second integrtion.  Contrast this
with a 5.12 -> 5.13 -> dev chain of merges, where dev doesn't get the
change that landed in 5.12 (even if that change could land in dev
without conflict) until
 * there's been some delay between their change being accepted in 5.12
   and the next merge-bot run
 * everyone who made change to 5.12 that conflicted on merging to 5.13
   has advised Liang on how to resolve their conflicts
 * we've got the result through integration into 5.13
 * everyone who's made changes to 5.13 or (as possibly just amended in
   merging) 5.12 that conflicts with anything in dev has again advised
   how to resolve their conflicts
 * and we've got the result through a second integration, into dev.

When nothing but the change being considered has a conflict along
the way, that's twice as long; and any change to an upstream branch,
that does have a conflict, introduces delay for all the other changes
that landed in that branch, even if they don't have conflicts.  In the
middle of summer, when lots of folk are away on holiday, getting help
with resolving conflicts isn't easy - the folk who know those commits
won't be back for a month - and all changes, no matter how urgent, get
stuck behind any conflict we can't work out how to resolve.

So no, Jedrzej's claim is not *pure* speculation; it's at least quite a
lot adulterated with reasons to believe that many changes would, under
his scheme, propagate to all branches they're destined for sooner than
happens with our present scheme.

Of course, the originator of the commit has to resolve any conflicts in
all branches it lands in - but can do so immediately, not one today,
another in mumble days time when the next merge happens, etc. - and has
to whack that Stage button repeatedly on each of those target branches
after review until they get accepted - but these happen in parallel.  So
the individual patches requiring a merge, of which you note there may be
thrice as many, can indeed happen; but they happen only to each patch
separately; if the author of that patch or its reviewers just went on
holiday, they don't delay anyone else's patch's progress.  They can also
be handled in parallel - I can fix all my conflict resolutions at the
same time, my favourite reviewer on the other side of the planet can
review them before I come in the next day, and I can play whack-a-mole
with them all in parallel after that.

Each branch is only one step away from dev.  My change can realistically
hit them all the day after it lands in dev; this is in stark contrast to
a change I've just put into 5.12 (only *one* step away from dev at
present) and I don't know how long I'll have to wait until it reaches
dev and I can follow up with the related fixes that are dev-only
material.  So I have to watch merges until that happens.

My main concern is that all those conflict resolutions are going to need
reviewed (which they do at present, so that's not changed) and they
won't be part of a collective rate-limiting step.  Those helping Liang
resolve conflicts are encouraged to do so by the knowledge that the
merge is holding up lots of other things.  Those reviewing an amended
cherry-pick won't have the same urgency.  Then again, this'll happen
right after they +2ed the original that was cherry-picked.  It should
all be fresh in their minds (except if they went on holiday in between;
but then only that one change suffers, not everything).

> BTW how does this work for change files? I hope we don't have to look
> in release tags to find the change log for a particular release.

Change files happen on non-dev branches, for sure.
As Jedrzej's (**) pointed out, branch-specific changes can still happen.
There just won't be any cherry-picking from them.
And I guess we'll still merge 5.x.y, once it releases, into 5.x.

>>  - simpler for new contributors, always push to dev

> Really? Me being the new guy wanting to fix a bug in 5.12 need to
> magically know that I have to push to dev and know about a magic
> cherry-pick logic and a magic tag in the commit log.

Indeed, the proposed model isn't ideal for new contributors.
However, reviewers can explain it to them.
That's one part of the job of review, after all -
indoctrinating the newcomers.

>>  - in most cases a reviewer does no need to know what the current
>>    version number is

> But he does, because you require the right commit tag in the log,
> don't you?

Note that the footer he talked about was expressed in abstract
terms, not in terms of specific branches - dev, stable, LTS, LTS-strict -
so that the script takes care of determining which actual version
numbers match each of those abstract terms.  In particular, given how
long it can take to get anyone to pay any attention to your change, much
less review it, this would put an end to having to repeatedly move one's
changes from branch to branch as they get released without anyone taking
the time to review your changes.  Or don't you have that problem ?
I do.  Regularly.

Anyway, no, you don't require the commit tag in the log.  You need an
abstract descriptor of what branch-types this change should land in.  If
your patch is delayed in review while several versions are released, you
don't have to retarget your change or update the detination-branch
footer.  It won't land in the branch you may have been hoping for, that
got released while no-one was reviewing your change, but it'll land in
the branches that are allowed the level of risk your reviewers believe
your change incurs.

>>  - conflict resolution is distributed and affects mostly the author
>>    of the change

> The merge problems we have today don't disappear. The merge problem is
> shifted from full time developers to once in a life time contributors.

That is true.  Equally, the merge problem is shifted from someone who's
never seen the code with the conflict in it to someone who just made the
change that caused one side of the conflict.  So, you know, swings and
roundabouts.

> The once in a lifetime contributor might not care anymore.

OTOH, this process forces them to get the change into dev; and if they
care about it landing in some other branch, they'll be motivated to (at
least) ensure it gets there.  Contrast that with the present model,
where their change lands where they want it and, if they lose interest
after that, Liang is left holding their baby.

> To me this sounds like a way to shift the hard problem to somebody who
> has the least knowledge, commitment and time.

As above - swings and roundabouts.

They have more knowledge of the code being changed than Liang.  Then
again, Liang has more practice at making sense of conflicts and
resolving them.  He also has colleagues with similar expertise that he
can turn to for help, when needed.

The author of the commit probably has more commitment to that commit
than Liang does.  Liang has more of a commitment to keeping the merges
flowing.

None of us has any time.
So no difference there.

> The loser will be the Qt stable/lts branches.

Again, swings and roundabouts.  At present, I suspect we have things
landing in stabler branches than they really should.  I can't quote
statistics on that, but it's a suspicion I believe others share.
So what stable/LTS may be losing is: instability.

Consider the case:
Is it really so crucial that this change gets into LTS ?

Even if it isn't, under the present system, the author points it there
when starting review.  Maybe a reviewer will object that it shouldn't go
there - in which case they may simply keep it as their own local patch
and we don't get it at all.  A reviewer who thinks about the target
version when first looking at it and decides it's fair enough might have
decided otherwise if that decision had been made two months later, when
the kinks have all been ironed out and it gets the +2 that leads to it
being accepted.  That earlier decision biases the reviewer's judgement
later, when it's closer to the release and a more conservative decision
might have been apt.  Like I say, I can't quantify how much this
happens, but I know humans have cognitive biases like this, that could
well mean we're sometimes letting things into branches they shouldn't
reach.

Under the proposed scheme, it lands in dev.
The footer has said how stable a branch it can get picked to.
That stability assessment is made at the time the change hits dev.
The author might not be committed enough to it to make sure it lands
everywhere else it should; but probably will be committed enough to at
least land it in the branch they want it in - as long as that's one of
those the tags their reviewers accepted still refer to, of course.

> Add the explosion of changes towards the CI.

That I'm also concerned about.
Then again, CI is working better now.
Jedrzej has worked with CI, which I haven't, so
I'll trust his judgement that CI can cope.

> True, the distribution is a plus. can we possibly apply this thought
> to the current approach?  How about we distribute the current merge
> coordination among more experienced developers.

Well, to some degree, we do - Liang's bot Cc's folk into the review that
some scripts think are implicated in the conflicts, so that they can
help resolve the conflicts.  Unfortunately, if one of them is
unavailable, someone who knows the relevant changes less well has to
make an educated guess at how to resolve the conflict.

Distributing the "being the owner of the process" part has its appeal;
but it would still be distributing it among folk who don't know the
changes in which each conflicts have happened - and they'd probably have
less experience with merges than Liang does, and wouldn't know which of
their peers have such experience, to turn to for help.  So we might get
broken merges more often.

> If you combine this with simpler dependency management were we don't
> have to wait for qt5.git to merge, the merge task distribution becomes
> easier.

I suspect you of having just brushed A Hard Problem under the rug.  I
can believe there are ways we could handle dependencies that would make
the job easier.  However, I don't know what they are.  Absent a concrete
plan on that, I'm going to quote you back at yourself:

> This is pure speculation

Back to Jedrzej:
>>  - there is a chance, that some cherry-picked commits may be left
>>    forgotten in gerrit after a failed integration

> I see this as a serious problem and it was one of the biggest if not
> *the* advantage of the current system. And the most experienced devs
> were responsible for ensuring it.

That one is a concern.  OTOH, see above under "does this change really
belong in LTS" ?  If a failed integration (or twenty) put someone off
chasing their change through to some other branch
 * we still at least got it into dev and
 * perhaps their lack of commitment is a sign that it really isn't that
   crucial that it land in LTS, or wherever it didn't reach.

I won't claim to be totally sold on the new plan, but I'm not convinced
against it by the objections raised thus far.

Finally, here's how I find things with change-ID $ID:

  git branch | tr -d '* ' | while read b
  do git log --grep $ID dev..$b
     echo $b
  done

Replace line-breaks with semicolons to taste, as ever.  It could surely
be cleverer, but I'm fine with eyeballing the output to see which
branch-names appear after a git log entry; and I get my commit-IDs for
free in the process.

	Eddy.