More refined categorization of tracing events? #113717

@RudolfKurkaMs

Hello, we're from the PNH team (the push notification API behind Teams, Skype, etc.).
We recently implemented a custom .NET trace-message collector via the EventListener class to help us diagnose some recurring live-site networking problems. This means listening to, collecting, and storing events from various trace sources such as:

  • Microsoft.AspNetCore.Hosting
  • System.Net.Http
  • Private.InternalDiagnostics.System.Net.Http
  • Private.InternalDiagnostics.System.Net.Security
  • Private.InternalDiagnostics.System.Net.Sockets
  • System.Net.Security
  • System.Net.Sockets
  • System.Net.NameResolution

The information contained in these traces has proved very valuable to us, but we face some obstacles with the system in general, and I was hoping you could share some insight on this.

We're struggling to prevent PII from leaking into our logs. For example, some of the traces write out all outgoing request headers (and response headers as well). Outgoing request headers often include things like API keys, usernames, or IDs that are considered PII and need to be scrubbed explicitly.

In the end we did it the hard way: we implemented a regex parsing system that extracts some information from certain events and discards the rest, plus various allow/deny lists based on keywords. This state of affairs is unfortunate, as it limits the potential here and makes the data hard to consume programmatically. This probably can't be helped much, but I wonder: would it at least be possible to categorize the events in some more meaningful way, so that consumers of these traces can better filter the content?
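To make the scrubbing approach concrete, here is a minimal sketch of allow-list header redaction over a free-form trace message. It is written in Python for brevity and is not the PNH collector's actual code. The `SAFE_HEADERS` set, the `scrub_headers` name, and the assumption that headers appear as `Name: value` lines inside the message are all hypothetical.

```python
import re

# Hypothetical allow list: header names whose values are considered safe to store.
SAFE_HEADERS = {"content-type", "content-length", "accept", "user-agent"}

# Matches a "Name: value" header line inside a multi-line trace message.
HEADER_LINE = re.compile(
    r"^(?P<name>[A-Za-z0-9-]+):[ \t]*(?P<value>.*)$", re.MULTILINE
)


def scrub_headers(message: str) -> str:
    """Redact the value of every header that is not on the allow list."""

    def redact(match: re.Match) -> str:
        name = match.group("name")
        if name.lower() in SAFE_HEADERS:
            return match.group(0)  # keep allow-listed headers untouched
        return f"{name}: [REDACTED]"  # fail closed for everything else

    return HEADER_LINE.sub(redact, message)


if __name__ == "__main__":
    sample = (
        "Sending request\n"
        "Authorization: Bearer abc123\n"
        "Content-Type: application/json"
    )
    print(scrub_headers(sample))
```

The sketch fails closed: any header-shaped line whose name is not on the allow list is redacted. This only works when the message format is predictable, which is exactly what the variance between events makes difficult.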

What I mean is that, for example, the Private.InternalDiagnostics.System.Net.Http event source seems to log most of its "good stuff" under one banner, i.e. under the event name HandlerMessage and event ID 8. This groups together many divergent messages emitted at the various internal stages of an HTTP request (especially with HTTP/2). These may or may not contain PII in various places, and the only truly safe options are either to parse everything via regex (infeasible given the variance in the messages) or to discard the entire event source (and lose valuable telemetry).

Would it make sense to mark these traces with some granular enough data so that consumers are able to better (programmatically) tell different flows apart?

For example, could the event IDs be made more varied? The event source mentioned above emits the same event name and event ID when various members are invoked, such as CopyFromBufferAsync, or SendAsync (we have to handle this one separately via regex since it contains plain-text headers, which means we also need to be able to tell it apart from the rest of the events), or Http2Connection's WriteIndexedHeader, which also prints the header value explicitly. All these vastly different events share the same event ID (8) and event name (HandlerMessage).
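To illustrate what a finer-grained discriminator would enable, here is a minimal sketch of per-flow routing, again in Python. It assumes each collected record is a dictionary with a `member` field naming the emitting member and a `message` field. Both field names, the `POLICY` table, and the choice of which members count as safe are hypothetical.

```python
from typing import Callable, Dict, Optional

Record = Dict[str, object]


def keep(record: Record) -> Record:
    """Store the record unchanged."""
    return record


def redact_message(record: Record) -> Record:
    """Keep the record's metadata but drop the free-form message text."""
    return {**record, "message": "[REDACTED]"}


# Hypothetical per-member policy; members not listed here are dropped.
POLICY: Dict[str, Callable[[Record], Record]] = {
    "CopyFromBufferAsync": keep,
    "SendAsync": redact_message,
    "WriteIndexedHeader": redact_message,
}


def sanitize(record: Record) -> Optional[Record]:
    """Apply the policy for the record's member, or drop unknown members."""
    handler = POLICY.get(str(record.get("member")))
    if handler is None:
        return None  # unknown flow: fail closed
    return handler(record)
```

With distinct event IDs or names per flow, the lookup key could be the event ID itself instead of a string recovered from the payload or the message text.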

Alternatively, could you recommend some better way to programmatically consume .NET traces so that it's easier to sanitize and store them in databases? Maybe something we missed?

Labels: area-System.Net.Http, enhancement (Product code improvement that does NOT require public API changes/additions)
