Bug: Wildcard Status Ignores > 1 match

Hi,

My team is using public bors-ng with Cirrus-CI, which posts to the newer GitHub Checks API. I set up my CI task names and confirmed with API calls which names were actually used in the check-run JSON. For example, my bors.toml might say:

status = [
    "cirrus-ci/lint%",
    "cirrus-ci/test%",
]

There's a matrix feature in Cirrus-CI, whereby it will append the distinguishing axis elements to the end of the name. For example, a snip from .cirrus.yml might include:

...
'cirrus-ci/test_task':
    env:
        matrix:
            blah: foo
            blah: bar
...

This would result in two tasks, one named cirrus-ci/test blah=foo and another named cirrus-ci/test blah=bar. These are also the names used in the checks JSON. However, I noticed that bors-ng ignores every wildcard match beyond the first (whichever that happens to be).

For example, (using the tasks from the yaml above) if the foo task was successful (and sorted first in the JSON) but the bar task failed, bors posts:

Build succeeded

  • cirrus-ci/lint
  • cirrus-ci/test blah=foo

This is a really bad false positive, since cirrus-ci/test blah=bar actually failed. I am presuming this is a bug, since it seems very unlikely to be the behavior anyone wants.
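
To make the difference concrete, here is a minimal Python sketch. It is not bors-ng's actual code (bors-ng is written in Elixir); the function names are made up for illustration, and it assumes that "%" simply matches any run of characters. It compares "only the first match counts" with "every match must succeed" on the checks from the example above:

```python
import re

def wildcard_to_regex(pattern):
    # Assumption: "%" matches any run of characters; everything else is literal.
    return re.compile(".*".join(re.escape(part) for part in pattern.split("%")))

def first_match_verdict(patterns, checks):
    """Observed behavior: only the first check matching each pattern counts."""
    for pattern in patterns:
        regex = wildcard_to_regex(pattern)
        first = next((state for name, state in checks if regex.fullmatch(name)), None)
        if first != "success":
            return False
    return True

def all_match_verdict(patterns, checks):
    """Behavior I expected: every check matching each pattern must succeed."""
    for pattern in patterns:
        regex = wildcard_to_regex(pattern)
        states = [state for name, state in checks if regex.fullmatch(name)]
        if not states or any(state != "success" for state in states):
            return False
    return True

# Checks in the order they appear in the JSON: foo sorts first, bar failed.
checks = [
    ("cirrus-ci/lint", "success"),
    ("cirrus-ci/test blah=foo", "success"),
    ("cirrus-ci/test blah=bar", "failure"),
]
patterns = ["cirrus-ci/lint%", "cirrus-ci/test%"]

print(first_match_verdict(patterns, checks))  # True  (the false positive)
print(all_match_verdict(patterns, checks))    # False
```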

In order to make sense of the current behavior, it helps to know why it was added. Your goal is to have bors dynamically figure out your entire build matrix from just a couple of wildcard rules, but the feature's original intent was to have a single matching status line that just happens to have an unpredictable name (because the status report contains the name of the machine that built it, or the commit hash, or the branch name, or something similarly unpredictable).

One of the things bors has to be able to deal with is when CI jobs don't start at all. Imagine that cirrus-ci/test blah=foo went fine, but when Cirrus tried to create a record for cirrus-ci/test blah=bar, their API call timed out. If bors built its list of required statuses from whatever happened to match the wildcard, the first one would succeed, but the second wouldn't exist, so it wouldn't count against the batch. We really don't want network failures to cause a spurious success.
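
As a rough illustration of that failure mode, the Python sketch below uses hypothetical helpers (not bors-ng code) and assumes "%" matches any run of characters. A rule that derives the required set from whatever matched cannot notice a status that was never created, while an explicit list of names can:

```python
import re

def matches(pattern, name):
    # Assumption: "%" means "any run of characters"; the rest is literal.
    regex = ".*".join(re.escape(part) for part in pattern.split("%"))
    return re.fullmatch(regex, name) is not None

def dynamic_verdict(pattern, reported):
    """Hypothetical rule: every status that matches the wildcard must succeed."""
    states = [state for name, state in reported.items() if matches(pattern, name)]
    return bool(states) and all(state == "success" for state in states)

def explicit_verdict(required, reported):
    """Every explicitly named status must exist and be successful."""
    return all(reported.get(name) == "success" for name in required)

# The "blah=bar" record was never created (say, the API call timed out).
reported = {"cirrus-ci/test blah=foo": "success"}
required = ["cirrus-ci/test blah=foo", "cirrus-ci/test blah=bar"]

print(dynamic_verdict("cirrus-ci/test%", reported))  # True  (spurious success)
print(explicit_verdict(required, reported))          # False (missing job is caught)
```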

To make the feature less misleading, we should probably change bors so that it'll produce a warning when multiple status lines match a single wildcard record. This way, it would be less confusing.
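
A sketch of what such a warning could look like, written in Python purely for illustration. The function name and message are hypothetical, "%" is assumed to match any run of characters, and none of this is existing bors-ng behavior:

```python
import re
import warnings

def matches(pattern, name):
    # Assumption: "%" means "any run of characters"; the rest is literal.
    regex = ".*".join(re.escape(part) for part in pattern.split("%"))
    return re.fullmatch(regex, name) is not None

def resolve_wildcard(pattern, names):
    """Return the first status name matching the pattern, warning on ambiguity."""
    hits = [name for name in names if matches(pattern, name)]
    if len(hits) > 1:
        warnings.warn(
            f"{pattern!r} matches {len(hits)} statuses {hits}; "
            "only the first one is considered"
        )
    return hits[0] if hits else None

names = ["cirrus-ci/lint", "cirrus-ci/test blah=foo", "cirrus-ci/test blah=bar"]
print(resolve_wildcard("cirrus-ci/test%", names))  # warns, then prints the foo task
```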


To solve your specific problem, there are basically two solutions:

  • Could you get Cirrus to create a "summary status" that encapsulates all of your builds? Like creating an entry in your pipeline that runs after the rest of the build matrix?

  • The best alternative, which is what some people do, is to just list their entire build matrix out in their bors.toml, without using wildcards.

> current behavior, it helps to know why it was added

Ohh yes, that is helpful, thanks for sharing the link.

> We really don't want network failures to cause a spurious success.

Yes I agree, that's also bad :smiley:

> create a "summary status" that

This is the solution we're currently using and it seems to work well.

The ultimate thing I was trying to solve is that matrices can be quite large, and the names can get quite long as well. So keeping the list of statuses for bors up to date would be burdensome.

To have our cake and eat it too (wildcard names plus a required number of statuses), maybe there could be a setting that specifies the minimum number of successful checks required per wildcard?

That would also enable a perhaps lesser-used workflow, where you have a large number of checks but only ever care if "most" of them pass (non-specifically).
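
Roughly what I have in mind, sketched in Python. The `min_successes` parameter is a made-up name for the proposed setting, not an existing bors option, and I'm assuming "%" matches any run of characters:

```python
import re

def matches(pattern, name):
    # Assumption: "%" means "any run of characters"; the rest is literal.
    regex = ".*".join(re.escape(part) for part in pattern.split("%"))
    return re.fullmatch(regex, name) is not None

def meets_minimum(pattern, reported, min_successes):
    """Pass only if at least `min_successes` matching statuses succeeded."""
    successes = sum(
        1
        for name, state in reported.items()
        if matches(pattern, name) and state == "success"
    )
    return successes >= min_successes

# A check that never started cannot be counted, so it can no longer slip through.
only_foo = {"cirrus-ci/test blah=foo": "success"}
print(meets_minimum("cirrus-ci/test%", only_foo, min_successes=2))  # False

# The "most of them pass" workflow: two out of three is enough here.
mostly_ok = {
    "cirrus-ci/test blah=foo": "success",
    "cirrus-ci/test blah=bar": "failure",
    "cirrus-ci/test blah=baz": "success",
}
print(meets_minimum("cirrus-ci/test%", mostly_ok, min_successes=2))  # True
```

With the minimum set to the full size of the matrix, this behaves like "all of them must pass", and a smaller value gives the looser "most of them" variant.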

Anyway, thanks for the help :smiley: