RFC: `bors r+ squash` to squash commits in a *specific* PR

Summary: Add bors squash command to squash a specific PR. Unlike the existing use_squash_merge option, this will allow to select squash or merge on the case-by-case basis.

I allow this RFC document to be modified and redistributed under the terms of the Apache-2.0 license, or the CC-BY-NC-SA 3.0 license, at your option.

Motivation

Some projects want to maintain a "clean" commit history on a best-effort basis. For example, for open source projects maintainers typically pay a lot of attention to crafting a good sequence of commits for a pull request, while less experience contributors often end up with "semanticless" merge and fixup commits. For these cases, it would be useful to merge one PR as is, and squash the other down to a single PR. It is possible to ask the contributor to do the required git fu, but this is pretty hard and discouraging, there's no reason why a robot can't do that!

Note that today, using the use_squash_merge option, it is possible to squash all pull requests. This solves the fixup commits problem. However, it prevents expert git users from creating a good multi-commit history for larger PRs.

Guide-level explanation

Add squash modifier to bors r+ / bors merge, such that it also squashes all PR's commits to one.

Reference-level explanation

Grammar: bors r+ [squash]

Drawbacks

  • this adds one more choice the user can/has to make for every PR.

Rationale and alternatives

TBD

Prior art

  • When merging PR via GitHub UI, there's a similar squash and merge option.

Unresolved questions

  • What is the interaction between bors squash and use_squash_merge? Do we need a nosquash modifier to supress squashing in a specific case when use_squash_merge is set to true?
  • Should bors squash be supported as a shorthand notation for bors r+ squash?

Future possibilities

TBD

See also

Tracking issue: https://github.com/bors-ng/bors-ng/issues/1028

1 Like

How does it interact with PRs that aren't squashing during merge? This sounds like it could
really complicate the merging code.

As a user of bors, I'd expect that any PRs that are being squashed wouldn't be able to be batched with other PRs. Either that, or you would end up with a linear branch with one commit per batched PR.

1 Like

I would like something at the opposite.
We use use_squash_merge by default, but for some long feature branch, we would like to "rebase and merge" it.

Do you think this would also solve the use case for Allow controlling squash behaviour by using a bors command ?

Yeah, I believe these two requests are essentially duplicates!

Applying this solution instead of the one I proposed here would also work. I don't mind as long as there is a way to optionally squash or not. I prefer my solution because it it makes it easier that the default is squashing but whatever people think is best would work, I don't mind

That is definitely something that needs to be defined before this RFC is accepted.

I am wondering whether we can make a generic solution for the problem where the default in the config file is a certain strategy and we want to use another strategy for the PR.

1 Like

This thread has been inactive for a bit and am just wondering if its still being worked on? (And whether we are just waiting for someone to try to implement this, possibly an easy to implement version among all the ones discussed but not decided.) There's a request from my company for having this as well.

I don't know if nosquash makes sense given a global use_squash_merge option. But bors r+ squash with no use_square_merge could squash merge the PR into the staging branch.

I am not aware about someone actively working on it. I'd still love to have this feature, but I won't have resources to implement it myself.

I've got a narrow window of time where I could try to implement something, though I'm looking at an initial version that could be improved later. I see the current code for doing squash is a bit involved. I'll see if some of it can be factored out and adapted.

Yeah, I'd probably approve whichever RFC got implemented. Both seem acceptable.

I started implementing this but realized no-one between the two threads has offered designs for how this interacts with batching.

Currently,

  • without squash, a new commit is created with all PR heads from the batch as parent (i.e., multiple parents)
  • with squash, a staging branch is created and all PRs are squash-merged into the staging branch one after the other (i.e., a fishbone structure with chains of single parents).

I'm thinking either

  1. Have a fishbone structure if there's at least one PR that wants squash-merge (but don't squash PRs that don't want squash merge). Or
  2. Merge all PRs that want squash into the staging branch, create a new commit with parents the staging branch and all PRs that don't want squash merge.

A bit ugly in either case I'm afraid. But once the infrastructure is there, we might also be able to switch to something better later on if someone comes up with it.

While trying to understand what the current code does, I've noticed a lot of duplication around squash.

  1. We really only use one of "#{batch.project.staging_branch}.tmp" or "#{batch.project.staging_branch}-squash.tmp". We're doing (or asking Github to do) the work of merging the patching into a tmp branch twice when squashing but the first branch is only used to extract a toml and the base branch's toml seems like it would work just as well (toml changing PRs would be slightly affected.)
  2. :synthesize_commit and :create_commit are the same save for the force_push at the end back to the staging branch.

I'm going to deduplicate a bit before trying to add new changes in.

1 Like