-
Notifications
You must be signed in to change notification settings - Fork 1.8k
Emit events for all TaskRun lifecycle events #2329
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,39 @@ | ||
| <!-- | ||
| --- | ||
| linkTitle: "Events" | ||
| weight: 2 | ||
| --- | ||
| --> | ||
| # Events | ||
|
|
||
| Tekton runtime resources, specifically `TaskRuns` and `PipelineRuns`, | ||
| emit events when they are executed, so that users can monitor their lifecycle | ||
| and react to it. Tekton emits [kubernetes events](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.18/#event-v1-core), that can be retrieve from the resource via | ||
| `kubectl describe [resource]`. | ||
|
|
||
| No events are emitted for `Conditions` today (https://github.com/tektoncd/pipeline/issues/2461). | ||
|
|
||
| ## TaskRuns | ||
|
|
||
| `TaskRun` events are generated for the following `Reasons`: | ||
| - `Started`: this is triggered the first time the `TaskRun` is picked by the | ||
| reconciler from its work queue, so it only happens if web-hook validation was | ||
| successful. Note that this event does not imply that a step started executing, | ||
| as several conditions must be met first: | ||
| - task and bound resource validation must be successful | ||
| - attached conditions must run successfully | ||
| - the `Pod` associated to the `TaskRun` must be successfully scheduled | ||
afrittoli marked this conversation as resolved.
Show resolved
Hide resolved
|
||
| - `Succeeded`: this is triggered once all steps in the `TaskRun` are executed | ||
| successfully, including post-steps injected by Tekton. | ||
| - `Failed`: this is triggered if the `TaskRun` is completed, but not successfully. | ||
| Causes of failure may be: one the steps failed, the `TaskRun` was cancelled or | ||
| the `TaskRun` timed out. | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. what if it was skipped? (maybe to be added?) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. A TaskRun is something that may only happen when a Pipeline is involved, so I did not look into that on this patch, where I focuses on events generated by the TaskRun controller.
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. ah right, there's no taskrun at all, makes sense. those 2 skips are totally valid also. like you said, conditions aren't emitting events anyway, maybe we can revisit this then. thanks!! |
||
|
|
||
| ## PipelineRuns | ||
|
|
||
| `PipelineRun` events are generated for the following `Reasons`: | ||
| - `Succeeded`: this is triggered once all `Tasks` reachable via the DAG are | ||
| executed successfully. | ||
| - `Failed`: this is triggered if the `PipelineRun` is completed, but not | ||
| successfully. Causes of failure may be: one the `Tasks` failed or the | ||
| `PipelineRun` was cancelled. | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -21,38 +21,45 @@ import ( | |
| "time" | ||
|
|
||
| corev1 "k8s.io/api/core/v1" | ||
| metav1 "k8s.io/apimachinery/pkg/apis/meta/v1" | ||
| "k8s.io/client-go/tools/record" | ||
| "knative.dev/pkg/apis" | ||
| ) | ||
|
|
||
| func TestEmitEvent(t *testing.T) { | ||
| testcases := []struct { | ||
| name string | ||
| before *apis.Condition | ||
| after *apis.Condition | ||
| expectEvent bool | ||
| name string | ||
| before *apis.Condition | ||
| after *apis.Condition | ||
| expectEvent bool | ||
| expectedEvent string | ||
| }{{ | ||
| name: "unknown to true", | ||
| name: "unknown to true with message", | ||
| before: &apis.Condition{ | ||
| Type: apis.ConditionSucceeded, | ||
| Status: corev1.ConditionUnknown, | ||
| }, | ||
| after: &apis.Condition{ | ||
| Type: apis.ConditionSucceeded, | ||
| Status: corev1.ConditionTrue, | ||
| Type: apis.ConditionSucceeded, | ||
| Status: corev1.ConditionTrue, | ||
| Message: "all done", | ||
| }, | ||
| expectEvent: true, | ||
| expectEvent: true, | ||
| expectedEvent: "Normal Succeeded all done", | ||
| }, { | ||
| name: "true to true", | ||
| before: &apis.Condition{ | ||
| Type: apis.ConditionSucceeded, | ||
| Status: corev1.ConditionTrue, | ||
| Type: apis.ConditionSucceeded, | ||
| Status: corev1.ConditionTrue, | ||
| LastTransitionTime: apis.VolatileTime{Inner: metav1.NewTime(time.Now())}, | ||
| }, | ||
| after: &apis.Condition{ | ||
| Type: apis.ConditionSucceeded, | ||
| Status: corev1.ConditionTrue, | ||
| Type: apis.ConditionSucceeded, | ||
| Status: corev1.ConditionTrue, | ||
| LastTransitionTime: apis.VolatileTime{Inner: metav1.NewTime(time.Now().Add(5 * time.Minute))}, | ||
| }, | ||
| expectEvent: false, | ||
| expectEvent: false, | ||
| expectedEvent: "", | ||
| }, { | ||
| name: "false to false", | ||
| before: &apis.Condition{ | ||
|
|
@@ -63,23 +70,74 @@ func TestEmitEvent(t *testing.T) { | |
| Type: apis.ConditionSucceeded, | ||
| Status: corev1.ConditionFalse, | ||
| }, | ||
| expectEvent: false, | ||
| expectEvent: false, | ||
| expectedEvent: "", | ||
| }, { | ||
| name: "unknown to unknown", | ||
| before: &apis.Condition{ | ||
| Type: apis.ConditionSucceeded, | ||
| Status: corev1.ConditionUnknown, | ||
| Reason: "", | ||
| Message: "", | ||
| }, | ||
| after: &apis.Condition{ | ||
| Type: apis.ConditionSucceeded, | ||
| Status: corev1.ConditionUnknown, | ||
| Reason: "foo", | ||
| Message: "bar", | ||
| }, | ||
| expectEvent: true, | ||
| expectedEvent: "Normal foo bar", | ||
| }, { | ||
| name: "true to nil", | ||
| after: nil, | ||
| before: &apis.Condition{ | ||
| Type: apis.ConditionSucceeded, | ||
| Status: corev1.ConditionTrue, | ||
| }, | ||
| expectEvent: true, | ||
| expectEvent: false, | ||
| expectedEvent: "", | ||
| }, { | ||
| name: "nil to true", | ||
| before: nil, | ||
| after: &apis.Condition{ | ||
| Type: apis.ConditionSucceeded, | ||
| Status: corev1.ConditionTrue, | ||
| }, | ||
| expectEvent: true, | ||
| expectEvent: true, | ||
| expectedEvent: "Normal Succeeded ", | ||
| }, { | ||
| name: "nil to unknown with message", | ||
| before: nil, | ||
| after: &apis.Condition{ | ||
| Type: apis.ConditionSucceeded, | ||
| Status: corev1.ConditionUnknown, | ||
| Message: "just starting", | ||
| }, | ||
| expectEvent: true, | ||
| expectedEvent: "Normal Started ", | ||
| }, { | ||
| name: "unknown to false with message", | ||
| before: &apis.Condition{ | ||
| Type: apis.ConditionSucceeded, | ||
| Status: corev1.ConditionUnknown, | ||
| }, | ||
| after: &apis.Condition{ | ||
| Type: apis.ConditionSucceeded, | ||
| Status: corev1.ConditionFalse, | ||
| Message: "really bad", | ||
| }, | ||
| expectEvent: true, | ||
| expectedEvent: "Warning Failed really bad", | ||
| }, { | ||
| name: "nil to false", | ||
| before: nil, | ||
| after: &apis.Condition{ | ||
| Type: apis.ConditionSucceeded, | ||
| Status: corev1.ConditionFalse, | ||
| }, | ||
| expectEvent: true, | ||
| expectedEvent: "Warning Failed ", | ||
| }} | ||
|
|
||
| for _, ts := range testcases { | ||
|
|
@@ -90,12 +148,21 @@ func TestEmitEvent(t *testing.T) { | |
|
|
||
| select { | ||
| case event := <-fr.Events: | ||
| if ts.expectEvent && event == "" { | ||
| t.Errorf("Expected event but got empty for %s", ts.name) | ||
| if event == "" { | ||
| // The fake recorder reported empty, it should not happen | ||
| t.Fatalf("Expected event but got empty for %s", ts.name) | ||
| } | ||
| case <-timer.C: | ||
| if !ts.expectEvent { | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. i'm a bit confused by the cases that don't require an event, do you think it might make sense to have an event every time the condition changes? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The only cases where we do not emit an event are:
For the former I think we do not want an event, it may be we wrote the condition again and updated LastTransitionTime, but nothing really happened to the Condition, so no event. |
||
| t.Errorf("Unexpected event but got for %s", ts.name) | ||
| // The fake recorder reported an event which we did not expect | ||
| t.Errorf("Unxpected event \"%s\" but got one for %s", event, ts.name) | ||
| } | ||
| if !(event == ts.expectedEvent) { | ||
| t.Errorf("Expected event \"%s\" but got \"%s\" instead for %s", ts.expectedEvent, event, ts.name) | ||
| } | ||
| case <-timer.C: | ||
| if ts.expectEvent { | ||
| // The fake recorder did not report, the timer timeout expired | ||
| t.Errorf("Expected event but got none for %s", ts.name) | ||
| } | ||
| } | ||
| } | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -44,6 +44,7 @@ const ( | |
| resyncPeriod = 10 * time.Hour | ||
| ) | ||
|
|
||
| // NewController instantiates a new controller.Impl from knative.dev/pkg/controller | ||
afrittoli marked this conversation as resolved.
Show resolved
Hide resolved
|
||
| func NewController(images pipeline.Images) func(context.Context, configmap.Watcher) *controller.Impl { | ||
| return func(ctx context.Context, cmw configmap.Watcher) *controller.Impl { | ||
| logger := logging.FromContext(ctx) | ||
|
|
@@ -66,6 +67,7 @@ func NewController(images pipeline.Images) func(context.Context, configmap.Watch | |
| ConfigMapWatcher: cmw, | ||
| ResyncPeriod: resyncPeriod, | ||
| Logger: logger, | ||
| Recorder: controller.GetEventRecorder(ctx), | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. is this just for testing? that's a bit unfortunate if so :( probably worth it tho? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, injecting fakes in the context is the same mechanism that we use to inject fake k8s client-go and other fakes. |
||
| } | ||
|
|
||
| entrypointCache, err := pod.NewEntrypointCache(kubeclientset) | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -103,11 +103,16 @@ func (c *Reconciler) Reconcile(ctx context.Context, key string) error { | |
|
|
||
| // If the TaskRun is just starting, this will also set the starttime, | ||
| // from which the timeout will immediately begin counting down. | ||
| tr.Status.InitializeConditions() | ||
| // In case node time was not synchronized, when controller has been scheduled to other nodes. | ||
| if tr.Status.StartTime.Sub(tr.CreationTimestamp.Time) < 0 { | ||
| c.Logger.Warnf("TaskRun %s createTimestamp %s is after the taskRun started %s", tr.GetRunKey(), tr.CreationTimestamp, tr.Status.StartTime) | ||
| tr.Status.StartTime = &tr.CreationTimestamp | ||
| if !tr.HasStarted() { | ||
| tr.Status.InitializeConditions() | ||
| // In case node time was not synchronized, when controller has been scheduled to other nodes. | ||
bobcatfish marked this conversation as resolved.
Show resolved
Hide resolved
|
||
| if tr.Status.StartTime.Sub(tr.CreationTimestamp.Time) < 0 { | ||
| c.Logger.Warnf("TaskRun %s createTimestamp %s is after the taskRun started %s", tr.GetRunKey(), tr.CreationTimestamp, tr.Status.StartTime) | ||
| tr.Status.StartTime = &tr.CreationTimestamp | ||
| } | ||
| // Emit events | ||
| afterCondition := tr.Status.GetCondition(apis.ConditionSucceeded) | ||
| reconciler.EmitEvent(c.Recorder, nil, afterCondition, tr) | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. i'm a bit confused why this call to is it possible we can get away with one? i do like the idea of only having to call this logic once instead of being aware of 2 places it is called or if it is required maybe a comment explaining why? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. You need to emit the event every time the condition changes, because - as things are today - you could have more than one condition change in one reconcile cycle. That could change if we added extra conditions in future, i.e. instead of only using Succeeded (unknown/true/false) we could use Started and Running too. In any case, I think it's worth sending the event as early as possible. This reconcile cycle will go on to create a pod and run several API calls, so we may be losing a few hundreds milliseconds if we wait. If something is looking for the event to update the status of a check in github, half a second later makes a difference. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
sounds like that's only in the started case? going from started to running - and shouldn't be expected from any other states? anyway makes sense! i do think this is worth a comment explaining why the 2 calls but also wouldnt block on this There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah, it's only in that case. I promise I'll add a comment in a follow-up 👼 |
||
| } | ||
|
|
||
| // If the TaskRun is complete, run some post run fixtures when applicable | ||
|
|
@@ -200,7 +205,6 @@ func (c *Reconciler) Reconcile(ctx context.Context, key string) error { | |
| if err = c.reconcile(ctx, tr, taskSpec, rtr); err != nil { | ||
| c.Logger.Errorf("Reconcile error: %v", err.Error()) | ||
| } | ||
|
|
||
| // Emit events (only when ConditionSucceeded was changed) | ||
| after := tr.Status.GetCondition(apis.ConditionSucceeded) | ||
| reconciler.EmitEvent(c.Recorder, before, after, tr) | ||
|
|
@@ -337,6 +341,9 @@ func (c *Reconciler) reconcile(ctx context.Context, tr *v1alpha1.TaskRun, | |
| if k8serrors.IsNotFound(err) { | ||
| // Keep going, this will result in the Pod being created below. | ||
| } else if err != nil { | ||
| // This is considered a transient error, so we return error, do not update | ||
| // the task run condition, and return an error which will cause this key to | ||
| // be requeued for reconcile. | ||
afrittoli marked this conversation as resolved.
Show resolved
Hide resolved
|
||
| c.Logger.Errorf("Error getting pod %q: %v", tr.Status.PodName, err) | ||
| return err | ||
| } | ||
|
|
@@ -401,7 +408,6 @@ func (c *Reconciler) reconcile(ctx context.Context, tr *v1alpha1.TaskRun, | |
| } | ||
|
|
||
| c.Logger.Infof("Successfully reconciled taskrun %s/%s with status: %#v", tr.Name, tr.Namespace, tr.Status.GetCondition(apis.ConditionSucceeded)) | ||
|
|
||
| return nil | ||
| } | ||
|
|
||
|
|
||
Uh oh!
There was an error while loading. Please reload this page.