Description
Expected Behavior
Background: While debugging an issue that caused duplicate reconcile events (root cause here), I noticed that if a pod was deemed
non-existent at the time of the list, it may already have been created by the time we went to create it, and the
pod create then failed, as expected, with an AlreadyExists error. However, I was a bit surprised that this led to the
TaskRun failing permanently, and with an error that is slightly confusing at first (the part about a missing or invalid Task):
status:
  completionTime: "2022-04-04T16:22:28Z"
  conditions:
  - lastTransitionTime: "2022-04-04T16:22:28Z"
    message: 'failed to create task run pod "sbom-syft-2xvwm": pods "sbom-syft-2xvwm-pod"
      already exists. Maybe missing or invalid Task syft-sboms/sbom-syft'
    reason: CouldntGetTask
    status: "False"
    type: Succeeded
From here
So I played around with this a bit: I made that error non-fatal and, upon getting the create error, also
tried to fetch the pod from the informer cache (related to #4740). Things worked just fine (despite there being
another bug, linked above, in Knative pkg).
Anyway, I wanted to see whether we might consider treating this pod creation failure as transient instead
of permanent.
Just wanted to see how folks feel about it.
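To make the proposal concrete, here is a minimal sketch of the "create, and on AlreadyExists fall back to a lookup" idea. It is not the actual Tekton reconciler code: `pod`, `podStore`, `errAlreadyExists`, and `createOrGetPod` are all hypothetical stand-ins so the example runs with only the standard library. In the real controller the check would presumably be `k8serrors.IsAlreadyExists(err)` from `k8s.io/apimachinery/pkg/api/errors`, and the lookup would go through the pod lister.

```go
package main

import (
	"errors"
	"fmt"
)

// errAlreadyExists is a hypothetical stand-in for the Kubernetes
// AlreadyExists API error.
var errAlreadyExists = errors.New("already exists")

// pod is a hypothetical, minimal stand-in for a Kubernetes Pod object.
type pod struct{ name string }

// podStore is a hypothetical in-memory stand-in for both the API server
// (create) and the informer cache (get).
type podStore struct{ pods map[string]*pod }

// create fails with errAlreadyExists if a pod with this name is already
// present, mimicking a second reconcile racing with the first.
func (s *podStore) create(name string) (*pod, error) {
	if _, ok := s.pods[name]; ok {
		return nil, fmt.Errorf("pods %q: %w", name, errAlreadyExists)
	}
	p := &pod{name: name}
	s.pods[name] = p
	return p, nil
}

// get looks a pod up by name, as a lister backed by the cache would.
func (s *podStore) get(name string) (*pod, bool) {
	p, ok := s.pods[name]
	return p, ok
}

// createOrGetPod tries to create the pod. If creation fails only because
// the pod already exists, it returns the existing pod instead of an error.
// Any other error, or a lookup miss, is returned so the caller can decide
// whether to retry.
func createOrGetPod(s *podStore, name string) (*pod, error) {
	p, err := s.create(name)
	if err == nil {
		return p, nil
	}
	if errors.Is(err, errAlreadyExists) {
		if existing, ok := s.get(name); ok {
			return existing, nil
		}
	}
	return nil, err
}

func main() {
	s := &podStore{pods: map[string]*pod{}}
	first, _ := createOrGetPod(s, "sbom-syft-2xvwm-pod")
	// A duplicate reconcile attempts the same create again.
	second, err := createOrGetPod(s, "sbom-syft-2xvwm-pod")
	fmt.Println(first == second, err)
}
```

With this shape, a duplicate reconcile ends up holding the same pod as the first one rather than marking the TaskRun as failed. If the lookup misses (for example because the cache has not caught up yet), the error is still returned, and the caller could treat it as transient and requeue.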
Actual Behavior
Steps to Reproduce the Problem
Additional Info
- Kubernetes version:
  Output of kubectl version: (paste your output here)
- Tekton Pipeline version:
  Output of tkn version or kubectl get pods -n tekton-pipelines -l app=tekton-pipelines-controller -o=jsonpath='{.items[0].metadata.labels.version}'