-
Notifications
You must be signed in to change notification settings - Fork 4.9k
Http request instrumentation with Activity (correlation part2) #16393
Changes from all commits
76dabc6
707fa0d
8dcc776
573530c
dcf6409
0916191
9815a34
1ee9ac6
0706536
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,94 @@ | ||
| # HttpClient Diagnostic Instrumentation Users Guide | ||
|
|
||
| This document describes `HttpClientHandler` instrumentation with [DiagnosticSource](https://github.com/dotnet/corefx/blob/master/src/System.Diagnostics.DiagnosticSource/src/DiagnosticSourceUsersGuide.md). | ||
|
|
||
| # Overview | ||
| Applications typically log outgoing HTTP requests; typically it's done with DelegatingHandler implementation that logs every request. However in existing system it may require extensive code changes, since DelegatingHandler needs to be added to HttpClient pipeline every time when new client is created. | ||
| DiagnosticListener instrumentation allows to enable outgoing request tracing with a few lines of code; it also provides context necessary to correlate logs. | ||
|
|
||
| Context is represented as `System.Diagnostics.Activity` class. Activity may be started as a child of another Activity and the whole operation is represented with a tree of Activities. You can find more details in [Activity User Guide](https://github.com/dotnet/corefx/blob/master/src/System.Diagnostics.DiagnosticSource/src/ActivityUserGuide.md). | ||
|
|
||
| Activity carries useful properties for logging such as Id, start time, Tags, Baggage and parent information. | ||
|
|
||
| Instrumentation ensures `Activity.Current` represents current outgoing request in Write event callbacks (if request is instrumented). Consumers **should not** assume `Activity.Current` is accurate in IsEnabled callbacks. | ||
|
|
||
| In microservice environment some context should flow with outgoing requests to correlate telemtry from all services involved in operation processing. | ||
| Instrumentation adds context into the request headers: | ||
| * Request-Id header with `Activity.Id` value | ||
| * Correlation-Context header with `Activity.Baggage` key-value pair list in `k1=v1, k2=v2` format | ||
|
|
||
| See [Http Protocol proposal](https://github.com/lmolkova/correlation/blob/master/http_protocol_proposal_v1.md) for more details | ||
| [comment]: TODO: Update link once it's moved | ||
|
|
||
| ## Subscription | ||
| Instrumentation is off by default; to enable it, consumer first needs to subscribe to DiagnosticListener called "HttpHandlerDiagnosticListener". | ||
|
|
||
| ```C# | ||
| var subscription = DiagnosticListener.AllListeners.Subscribe(delegate (DiagnosticListener listener) | ||
| { | ||
| if (listener.Name == "HttpHandlerDiagnosticListener") | ||
| { | ||
| listener.Subscribe(delegate (KeyValuePair<string, object> value) | ||
| { | ||
| Console.WriteLine($"Event: {value.Key} ActivityName: {Activity.Current.OperationName} Id: {Activity.Current.Id} "); | ||
| }); | ||
| } | ||
| } ); | ||
| ``` | ||
|
|
||
| ## Events | ||
| If there is **at least one consumer**, subscribed for "HttpHandlerDiagnosticListener" events, `HttpClientHandler` instruments outgoing request depending on subscription and request properties. | ||
|
|
||
| Avoid having more than one consumer for DiagnosticListener. Note that DiagnosticListener best practice is to guard every `Write` method with `IsEnabled` check. In case there is more than one consumer, **each consumer** will receive Write event if **at least one** consumer returned true for `IsEnabled`. | ||
|
|
||
| #### IsEnabled: System.Net.Http.Activity | ||
| If there is a consumer, instrumentation calls `DiagnosticListener.IsEnabled("System.Net.Http.Activity", request)` to check if particular request needs to be instrumented. | ||
| Consumer may optionally provide predicate to DiagnosticListener to prevent some requests from being instrumented: e.g. if logging system has HTTP interface, it could be necessary to filter out requests to logging system itself. | ||
|
|
||
| ```C# | ||
| var predicate = (name, r, _) => | ||
| { | ||
| var request = r as HttpRequestMessage; | ||
| if (request != null) | ||
| { | ||
| return !request.RequestUri.Equals(ElasticSearchUri); | ||
| } | ||
| return true; | ||
| } | ||
| listener.Subscribe(observer, predicate); | ||
| ``` | ||
| ### System.Net.Http.Activity.Start | ||
| After initial instrumentation preconditions are met and `DiagnosticListener.IsEnabled("System.Net.Http.Activity", request)` check is passed, instrumentation starts a new Activity to represent outgoing request. | ||
| If **"System.Net.Http.Activity.Start"** is enabled, instrumentation writes it. Event payload has Request property with `HttpRequestMessage` object representing request. | ||
|
|
||
| ### System.Net.Http.Activity.Stop | ||
| When request is completed (faulted with exception, cancelled or successfully completed), instrumentation stops activity and writes **"System.Net.Http.Activity.Stop"** event. | ||
|
|
||
| Event payload has following properties: | ||
| * **Response** with `HttpResponseMessage` object representing response, which could be null if request was failed or cancelled. | ||
| * **RequestTaskStatus** with `TaskStatus` enum value that describes status of the request task. | ||
|
|
||
| This event is sent under the same conditions as "System.Net.Http.Activity.Start" event. | ||
|
|
||
| #### IsEnabled: System.Net.Http.Exception | ||
| If request processing causes an exception, instrumentation first checks if consumer wants to receive Exception event. | ||
|
|
||
| ### System.Net.Http.Exception | ||
| If request throws an exception, instrumentation sends **"System.Net.Http.Exception"** event with payload containing `Exception` and `Request` properties. | ||
|
|
||
| If current outgoing request is instrumented, `Activity.Current` represents it's context. | ||
| Otherwise, `Activity.Current` represent some 'parent' activity (presumably incoming request). | ||
|
|
||
| # Events Flow and Order | ||
|
|
||
| 1. `DiagnosticListener.IsEnabled()` - determines if there is a consumer | ||
| 2. `DiagnosticListener.IsEnabled("System.Net.Http.Activity", request)` - determines if this particular request should be instrumented | ||
| 3. `DiagnosticListener.IsEnabled("System.Net.Http.Activity.Start")` - determines if Start event should be written | ||
| 4. `DiagnosticListener.Write("System.Net.Http.Activity.Start", new {Request})` - notifies that activity (outgoing request) was started | ||
| 5. `DiagnosticListener.IsEnabled("System.Net.Http.Exception")` - determines if exception event (if thrown) should be written | ||
| 6. `DiagnosticListener.Write("System.Net.Http.Activity.Exception", new {Exception, Request})` - notifies about exception during request processing (if thrown) | ||
| 7. `DiagnosticListener.Write("System.Net.Http.Activity.Stop", new {Response, RequestTaskStatus})` - notifies that activity (outgoing request) is stopping | ||
|
|
||
| # Non-Activity events (deprecated) | ||
| If there is a subscriber to "HttpHandlerDiagnosticListener", but Activity events are disabled (`DiagnosticListener.IsEnabled("System.Net.Http.Activity", request)` returns false), instrumentation attempts to send legacy "System.Net.Http.Request" and "System.Net.Http.Response" events if they are enabled. | ||
| Consumers should consider migrating to Activity events instead of Request/Response events. |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -2,7 +2,9 @@ | |
| // The .NET Foundation licenses this file to you under the MIT license. | ||
| // See the LICENSE file in the project root for more information. | ||
|
|
||
| using System.Collections.Generic; | ||
| using System.Diagnostics; | ||
| using System.Net.Http.Headers; | ||
| using System.Threading; | ||
| using System.Threading.Tasks; | ||
|
|
||
|
|
@@ -28,36 +30,37 @@ internal static bool IsEnabled() | |
| return s_diagnosticListener.IsEnabled(); | ||
| } | ||
|
|
||
| protected internal override Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, | ||
| protected internal override async Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, | ||
| CancellationToken cancellationToken) | ||
| { | ||
| //HttpClientHandler is responsible to call DiagnosticsHandler.IsEnabled() before forwarding request here. | ||
| //This code will not be reached if no one listens to 'HttpHandlerDiagnosticListener', unless consumer unsubscribes | ||
| //from DiagnosticListener right after the check. So some requests happening right after subscription starts | ||
| //might not be instrumented. Similarly, when consumer unsubscribes, extra requests might be instumented | ||
|
|
||
| Guid loggingRequestId = Guid.NewGuid(); | ||
| LogHttpRequest(request, loggingRequestId); | ||
| Activity activity = null; | ||
| Guid loggingRequestId = Guid.Empty; | ||
|
|
||
| Task<HttpResponseMessage> responseTask = base.SendAsync(request, cancellationToken); | ||
|
|
||
| LogHttpResponse(responseTask, loggingRequestId); | ||
| return responseTask; | ||
| } | ||
|
|
||
| #region private | ||
|
|
||
| private static readonly DiagnosticListener s_diagnosticListener = | ||
| new DiagnosticListener(DiagnosticsHandlerLoggingStrings.DiagnosticListenerName); | ||
|
|
||
| private static void LogHttpRequest(HttpRequestMessage request, Guid loggingRequestId) | ||
| { | ||
| if (s_diagnosticListener.IsEnabled(DiagnosticsHandlerLoggingStrings.RequestWriteName)) | ||
| // If System.Net.Http.Activity is on see if we should log the start (or just log the activity) | ||
| if (s_diagnosticListener.IsEnabled(DiagnosticsHandlerLoggingStrings.ActivityName, request)) | ||
| { | ||
| activity = new Activity(DiagnosticsHandlerLoggingStrings.ActivityName); | ||
| //Only send start event to users who subscribed for it, but start activity anyway | ||
| if (s_diagnosticListener.IsEnabled(DiagnosticsHandlerLoggingStrings.ActivityStartName)) | ||
| { | ||
| s_diagnosticListener.StartActivity(activity, new { Request = request }); | ||
| } | ||
| else | ||
| { | ||
| activity.Start(); | ||
| } | ||
| } | ||
| //if Activity events are disabled, try to write System.Net.Http.Request event (deprecated) | ||
| else if (s_diagnosticListener.IsEnabled(DiagnosticsHandlerLoggingStrings.RequestWriteNameDeprecated)) | ||
| { | ||
| long timestamp = Stopwatch.GetTimestamp(); | ||
|
|
||
| s_diagnosticListener.Write( | ||
| DiagnosticsHandlerLoggingStrings.RequestWriteName, | ||
| loggingRequestId = Guid.NewGuid(); | ||
| s_diagnosticListener.Write(DiagnosticsHandlerLoggingStrings.RequestWriteNameDeprecated, | ||
| new | ||
| { | ||
| Request = request, | ||
|
|
@@ -66,44 +69,85 @@ private static void LogHttpRequest(HttpRequestMessage request, Guid loggingReque | |
| } | ||
| ); | ||
| } | ||
| } | ||
|
|
||
| private static void LogHttpResponse(Task<HttpResponseMessage> responseTask, Guid loggingRequestId) | ||
| { | ||
| responseTask.ContinueWith( | ||
| (t, s) => | ||
| // If we are on at all, we propagate any activity information. | ||
| Activity currentActivity = Activity.Current; | ||
| if (currentActivity != null) | ||
| { | ||
| request.Headers.Add(DiagnosticsHandlerLoggingStrings.RequestIdHeaderName, currentActivity.Id); | ||
| //we expect baggage to be empty or contain a few items | ||
| using (IEnumerator<KeyValuePair<string, string>> e = currentActivity.Baggage.GetEnumerator()) | ||
| { | ||
| long timestamp = Stopwatch.GetTimestamp(); | ||
|
|
||
| if (t.IsFaulted && s_diagnosticListener.IsEnabled(DiagnosticsHandlerLoggingStrings.ExceptionWriteName)) | ||
| if (e.MoveNext()) | ||
| { | ||
| s_diagnosticListener.Write( | ||
| DiagnosticsHandlerLoggingStrings.ExceptionWriteName, | ||
| new | ||
| { | ||
| LoggingRequestId = (Guid) s, | ||
| Timestamp = timestamp, | ||
| Exception = t.Exception, | ||
| } | ||
| ); | ||
| var baggage = new List<string>(); | ||
| do | ||
| { | ||
| KeyValuePair<string, string> item = e.Current; | ||
| baggage.Add(new NameValueHeaderValue(item.Key, item.Value).ToString()); | ||
| } | ||
| while (e.MoveNext()); | ||
| request.Headers.Add(DiagnosticsHandlerLoggingStrings.CorrelationContextHeaderName, baggage); | ||
| } | ||
| } | ||
| } | ||
|
|
||
| if (s_diagnosticListener.IsEnabled(DiagnosticsHandlerLoggingStrings.ResponseWriteName)) | ||
| Task<HttpResponseMessage> responseTask = base.SendAsync(request, cancellationToken); | ||
| try | ||
| { | ||
| await responseTask.ConfigureAwait(false); | ||
| } | ||
| catch (TaskCanceledException) | ||
| { | ||
| //we'll report task status in Activity.Stop | ||
| throw; | ||
| } | ||
| catch (Exception ex) | ||
| { | ||
| if (s_diagnosticListener.IsEnabled(DiagnosticsHandlerLoggingStrings.ExceptionEventName)) | ||
| { | ||
| //If request was initialy instrumented, Activity.Current has all necessary context for logging | ||
| //Request is passed to provide some context if instrumentation was disabled and to avoid | ||
| //extensive Activity.Tags usage to tunnel request properties | ||
| s_diagnosticListener.Write(DiagnosticsHandlerLoggingStrings.ExceptionEventName, new { Exception = ex, Request = request }); | ||
| } | ||
| throw; | ||
| } | ||
| finally | ||
| { | ||
| //always stop activity if it was started | ||
| if (activity != null) | ||
| { | ||
| activity.SetEndTime(DateTime.UtcNow); | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is the 16ms resolution of UtcNow a concern here? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. CC: @pharring who just looked into this issue in AI SDK
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. UtcNow is precise now: dotnet/coreclr#9736
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Fantastic, thx. Ship it!
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. As already said in #16491, I don't like that users have to call SetEndTime but not SetStartTime - especially because Why not fix the Activity API (the StopActivity part) in a way that uses DateTime.UtcNow for Start and Stop by default and also sends the duration to the DiagnosticSource listeners?
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I agree with you point, let's discuss it in #16491 and fix in another PR. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @pharring - I suspect ASP.NET used both UtcNow and Stopwatch because up until about a week ago UtcNow was NOT very precise (16 msec). It is now upgraded to about 1usec, which should be good enough for HTTP request and response durations. That should be enough. |
||
| s_diagnosticListener.StopActivity(activity, new | ||
| { | ||
| s_diagnosticListener.Write( | ||
| DiagnosticsHandlerLoggingStrings.ResponseWriteName, | ||
| new | ||
| { | ||
| Response = t.Status == TaskStatus.RanToCompletion ? t.Result : null, | ||
| LoggingRequestId = (Guid) s, | ||
| TimeStamp = timestamp, | ||
| RequestTaskStatus = t.Status | ||
| } | ||
| ); | ||
| } | ||
| }, loggingRequestId, CancellationToken.None, TaskContinuationOptions.None, TaskScheduler.Default); | ||
| Response = responseTask.Status == TaskStatus.RanToCompletion ? responseTask.Result : null, | ||
| RequestTaskStatus = responseTask.Status | ||
| }); | ||
| } | ||
| //if Activity events are disabled, try to write System.Net.Http.Response event (deprecated) | ||
| else if (s_diagnosticListener.IsEnabled(DiagnosticsHandlerLoggingStrings.ResponseWriteNameDeprecated)) | ||
| { | ||
| long timestamp = Stopwatch.GetTimestamp(); | ||
| s_diagnosticListener.Write(DiagnosticsHandlerLoggingStrings.ResponseWriteNameDeprecated, | ||
| new | ||
| { | ||
| Response = responseTask.Status == TaskStatus.RanToCompletion ? responseTask.Result : null, | ||
| LoggingRequestId = loggingRequestId, | ||
| TimeStamp = timestamp, | ||
| RequestTaskStatus = responseTask.Status | ||
| } | ||
| ); | ||
| } | ||
| } | ||
| return responseTask.Result; | ||
| } | ||
|
|
||
| #region private | ||
|
|
||
| private static readonly DiagnosticListener s_diagnosticListener = | ||
| new DiagnosticListener(DiagnosticsHandlerLoggingStrings.DiagnosticListenerName); | ||
|
|
||
| #endregion | ||
| } | ||
| } | ||
| } | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this require some encoding? What if a baggage value contains a ","? Are there any other characters that are not allowed as header values?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a good point. We are trying to keep the efficiency high and the complexity low, and frankly having arbitrary strings in Baggage is not really an interesting scenario (they are control values, not arbitrary strings). This argues that we push the problem to the clients of Baggage (either fail if you try to set what we will define as illegal baggage values (and possibly give them an API that does the escaping). Thus the change will be in System.Diagnostics.Activity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I took a look at NameValueHeaderValue (I didn't know about this class before) and it seems like it is checked there. It will throw an exception if I understood it correctly.