26 changes: 19 additions & 7 deletions teps/0090-matrix.md
@@ -42,6 +42,7 @@ see-also:
- [Alternatives](#alternatives)
- [Fan Out](#fan-out)
- [Concurrency Control](#concurrency-control)
+- [Failure Strategies](#failure-strategies)
- [Design](#design)
- [Parameters](#parameters)
- [Substituting String Parameters in the Tasks](#substituting-string-parameters-in-the-tasks)
@@ -206,15 +207,13 @@ specified in a `matrix`.

The following are out of scope for this TEP:

-1. Terminating early when one of the `TaskRuns` or `Runs` created in parallel fails. As is currently, running `TaskRuns`
-and `Runs` have to complete execution before termination.
-2. Configuring the `TaskRuns` or `Runs` created in a given `matrix` to execute sequentially. This remains an option
+1. Configuring the `TaskRuns` or `Runs` created in a given `matrix` to execute sequentially. This remains an option
that we can explore later.
-3. Excluding generating a `TaskRun` or `Run` for a specific combination in the `matrix`. This remains an option we can
+2. Excluding generating a `TaskRun` or `Run` for a specific combination in the `matrix`. This remains an option we can
explore later if needed.
-4. Including generating a `TaskRun` or `Run` for a specific combination in the `matrix`. This can be handled by adding
+3. Including generating a `TaskRun` or `Run` for a specific combination in the `matrix`. This can be handled by adding
the items that produce that combination into the `matrix`. This remains an option we can explore later if needed.
-5. Supporting producing `Results` from fanned out `PipelineTasks`. We plan to address this after [TEP-0075][tep-0075]
+4. Supporting producing `Results` from fanned out `PipelineTasks`. We plan to address this after [TEP-0075][tep-0075]
and [TEP-0076][tep-0076] have landed.
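
For readers unfamiliar with what "a specific combination in the `matrix`" refers to in the list above, here is a minimal Python sketch of the fan-out, assuming each combination of parameter values yields one `TaskRun` or `Run`. It is an illustration only, not Tekton's implementation: `fan_out`, `MAX_COMBINATIONS`, and the `platform`/`browser` parameters are hypothetical names, and the limit of 256 is taken from the "up to 256 for now" remark later in this diff.

```python
from itertools import product

# Assumption: mirrors the "up to 256 for now" limit mentioned later in this diff.
MAX_COMBINATIONS = 256


def fan_out(matrix):
    """Return one parameter dict per combination of the matrix values.

    `matrix` maps a parameter name to the list of values it can take.
    """
    names = list(matrix)
    combos = [
        dict(zip(names, values))
        for values in product(*(matrix[name] for name in names))
    ]
    if len(combos) > MAX_COMBINATIONS:
        raise ValueError(
            f"matrix fans out to {len(combos)} runs, limit is {MAX_COMBINATIONS}"
        )
    return combos


# Hypothetical parameters: 3 platforms x 2 browsers gives 6 combinations.
combos = fan_out({
    "platform": ["linux", "mac", "windows"],
    "browser": ["chrome", "firefox"],
})
for combo in combos:
    print(combo)
```

Excluding or including a specific combination, which the list above leaves out of scope, would amount to filtering or extending the list this function returns.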

### Requirements
@@ -718,6 +717,18 @@ If needed, we can also explore providing more granular controls for maximum numb
or `Runs` from `Matrices` - either at `PipelineRun`, `Pipeline` or `PipelineTask` levels - later.
This is an option we can pursue after gathering user feedback - it's out of scope for this TEP.

+### Failure Strategies
+
+In failure scenarios, the `TaskRuns` or `Runs` created from a `Matrix` will fail fast. That is, when
+any `TaskRun` or `Run` from a given fanned-out `PipelineTask` fails or is cancelled then the other
+`TaskRuns` or `Runs` from the same `PipelineTask` will be cancelled.

Member:

Thanks @jerop. Looking for more clarification: how are we failing already-running `TaskRuns` or `Runs` from the same `PipelineTask`? By sending a termination signal?

Can we implement something like a stopping mode by default, similar to the approach we take when a task in a pipeline fails?

Member:

The reason I am asking is that I assume stopping is easier to implement 🤣 and does not result in partial execution ...

Member Author:

@pritidesai yes, similar to how we cancel `TaskRuns`, for example. Stopping mode may be easier to implement, but I'm wondering whether that's the behavior we want for `Matrix`, separate from the implementation. Maybe partial execution could be the reason not to do this?

Member:

Stopping would be consistent with how we address the failure of a non-matrix `PipelineTask`.

As an example: if building an image of one application fails, I do not want that failure to stop building images of 20 other applications.

Member Author:

makes sense - added this to the API WG agenda on Monday so we can decide on the best way forward
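
The two options weighed in this thread can be compared with a small Python sketch. It assumes, as the comments above describe, that fail-fast cancels sibling runs that are still running, while stopping lets them finish. The function `on_failure`, the strategy names, and the state strings are hypothetical and do not correspond to Tekton's API.

```python
def on_failure(states, strategy):
    """Return the next state of each sibling run once any run has failed.

    `states` maps a run name to "running", "succeeded", or "failed".
    `strategy` is "fail-fast" or "stopping" (hypothetical names).
    """
    if strategy not in ("fail-fast", "stopping"):
        raise ValueError(f"unknown strategy: {strategy}")
    if "failed" not in states.values():
        # Nothing has failed, so no strategy applies yet.
        return dict(states)

    next_states = {}
    for name, state in states.items():
        if state == "running" and strategy == "fail-fast":
            # Fail fast: cancel siblings that are still in flight.
            next_states[name] = "cancelled"
        else:
            # Stopping: running siblings are left to complete on their own.
            next_states[name] = state
    return next_states


# Hypothetical runs fanned out from a single PipelineTask.
runs = {"build-app-1": "failed", "build-app-2": "running", "build-app-3": "succeeded"}
fail_fast_result = on_failure(runs, "fail-fast")
stopping_result = on_failure(runs, "stopping")
print(fail_fast_result)
print(stopping_result)
```

Under fail-fast, `build-app-2` is cancelled partway through; under stopping, it keeps running, which matches the image-building example given above.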


+This approach makes it easier to control the execution of the many `TaskRuns` or `Runs` that can be
+created from a `Matrix` (up to 256 for now). Moreover, failing fast is the default behavior of `Matrix`
+in other Continuous Delivery systems - see [GitHub Actions - Handling Failures in Matrix][ghm-failfast].
+
+If needed, we can explore supporting other failure strategies later.

Contributor:

This is the right call - in Jenkins Pipeline, we inherited certain assumptions around failure behavior for parallel executions (both in a matrix and not) which resulted in us having to support making fail-fast configurable, and I don't think it was worth doing that in the end. When we're starting from scratch, like here, we have the opportunity to say "this is how things work" and adjust if sufficient demand comes up in the future, rather than prematurely optimizing. So, yeah. 👍


## Design

In this section, we go into the details of the `Matrix` in relation to:
@@ -1647,4 +1658,5 @@ However, this approach has the following disadvantages:
[git-clone]: https://github.com/tektoncd/catalog/tree/main/task/git-clone/0.5
[when]: https://github.com/tektoncd/pipeline/blob/6cb0f4ccfce095495ca2f0aa20e5db8a791a1afe/docs/pipelines.md#guard-task-execution-using-when-expressions
[retries]: https://github.com/tektoncd/pipeline/blob/6cb0f4ccfce095495ca2f0aa20e5db8a791a1afe/docs/pipelines.md#using-the-retries-parameter
-[timeouts]: https://github.com/tektoncd/pipeline/blob/6cb0f4ccfce095495ca2f0aa20e5db8a791a1afe/docs/pipelines.md#configuring-the-failure-timeout
+[timeouts]: https://github.com/tektoncd/pipeline/blob/6cb0f4ccfce095495ca2f0aa20e5db8a791a1afe/docs/pipelines.md#configuring-the-failure-timeout
+[ghm-failfast]: https://docs.github.com/en/actions/using-jobs/using-a-matrix-for-your-jobs#handling-failures