WIP feat(AI): Microsoft.Extensions.AI instrumentation
#4657
Conversation
Codecov Report ❌

Additional details and impacted files:

```
@@            Coverage Diff             @@
##             main    #4657      +/-   ##
==========================================
+ Coverage   73.49%   73.52%   +0.03%
==========================================
  Files         483      488       +5
  Lines       17692    17890     +198
  Branches     3492     3545      +53
==========================================
+ Hits        13002    13154     +152
- Misses       3799     3829      +30
- Partials      891      907      +16
```

View full report in Codecov by Sentry.
```csharp
    CancellationToken cancellationToken = new())
{
    var chatMessages = messages as ChatMessage[] ?? messages.ToArray();
    var keyMessage = chatMessages[0];
```
I thought this would be fine because every single overload of GetResponseAsync either takes in a ChatMessage or creates one, and it doesn't really make sense to call GetResponseAsync without any messages.
But after thinking about it, I should add the check anyway, because who knows what user code will be doing.
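A minimal sketch of the guard being discussed, based on the excerpt above. The fall-through to the inner client is just one option and is an assumption here, not necessarily what this PR will do:

```csharp
// Hypothetical guard before indexing chatMessages[0]; names mirror the diff above.
var chatMessages = messages as ChatMessage[] ?? messages.ToArray();
if (chatMessages.Length == 0)
{
    // Skip instrumentation rather than throwing from inside it; let the inner
    // client decide how to handle an empty message list.
    return await InnerClient.GetResponseAsync(chatMessages, options, cancellationToken)
        .ConfigureAwait(false);
}
var keyMessage = chatMessages[0];
```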
```csharp
    [EnumeratorCancellation] CancellationToken cancellationToken = new())
{
    var chatMessages = messages as ChatMessage[] ?? messages.ToArray();
    var keyMessage = chatMessages[0];
```
```csharp
private static ISpan CreateChatSpan(ISpan outerSpan, ChatOptions? options)
{
    const string chatOperation = "gen_ai.chat";
    var chatSpanName = options is null || string.IsNullOrEmpty(options.ModelId)
        ? "chat unknown model"
        : $"chat {options.ModelId}";
    return outerSpan.StartChild(chatOperation, chatSpanName);
}

internal static ConcurrentDictionary<ChatMessage, ISpan> GetMessageToSpanDict(ChatOptions? options = null)
{
    if (options?.AdditionalProperties?.TryGetValue<ConcurrentDictionary<ChatMessage, ISpan>>(
        SentryAIConstants.OptionsAdditionalAttributeAgentSpanName, out var agentSpanDict) == true)
    {
        return agentSpanDict;
    }

    // If we couldn't find the dictionary, we just initiate it now
    agentSpanDict = new ConcurrentDictionary<ChatMessage, ISpan>();
    if (options == null)
    {
        return agentSpanDict;
    }

    options.AdditionalProperties ??= new AdditionalPropertiesDictionary();
    options.AdditionalProperties.TryAdd(SentryAIConstants.OptionsAdditionalAttributeAgentSpanName, agentSpanDict);
```
Using ChatMessage objects as ConcurrentDictionary keys relies on reference equality. This is fragile because if a ChatMessage is cloned, serialized/deserialized, or recreated elsewhere in the call stack, the dictionary lookup will fail silently, causing spans not to be properly tracked. Consider using a stable identifier (like a unique ID or message index) instead of object reference equality. Alternatively, document this assumption prominently and add logging to detect when expected spans are not found.
Severity: HIGH
Location: src/Sentry.Extensions.AI/SentryChatClient.cs#L155-L180
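One hedged sketch of the stable-identifier approach the comment suggests: tag each message with an ID stored in its `AdditionalProperties` (which Microsoft.Extensions.AI exposes on `ChatMessage`), and key the span dictionary by that string instead of by object reference. The key name is hypothetical, and this still does not survive serialization round-trips that drop `AdditionalProperties`:

```csharp
// Hypothetical key name, not from this PR.
private const string SentryMessageIdKey = "sentry.message_id";

// Returns a stable ID for the message, minting one on first use. Callers would
// then use ConcurrentDictionary<string, ISpan> keyed by this ID rather than
// relying on ChatMessage reference equality.
private static string GetOrCreateMessageId(ChatMessage message)
{
    message.AdditionalProperties ??= new AdditionalPropertiesDictionary();
    if (message.AdditionalProperties.TryGetValue<string>(SentryMessageIdKey, out var id))
    {
        return id;
    }
    id = Guid.NewGuid().ToString("N");
    message.AdditionalProperties[SentryMessageIdKey] = id;
    return id;
}
```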
```csharp
    FormatAsJson(tools, tool => new { name = tool.Name, description = tool.Description });

private static string FormatRequestMessage(ChatMessage[] messages) =>
    FormatAsJson(messages, message => new { role = message.Role, content = message.Text });

private static string FormatAsJson<T>(IEnumerable<T> items, Func<T, object> selector) =>
```
The FormatAsJson method calls JsonSerializer.Serialize without error handling. If serialization fails for any reason (circular references, unsupported types, etc.), the exception will propagate and potentially fail the entire chat operation. Add try-catch around serialization calls to log failures and gracefully fall back to null or a simplified representation. This prevents instrumentation code from breaking business logic.
Severity: HIGH
Location: src/Sentry.Extensions.AI/SentryAISpanEnricher.cs#L151-L156
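A defensive variant along the lines suggested — a sketch, not the PR's actual code. Serialization failures are swallowed so the span attribute is simply omitted instead of failing the chat call (assumes `System.Text.Json`):

```csharp
// Sketch: never let attribute serialization break the instrumented operation.
private static string? FormatAsJson<T>(IEnumerable<T> items, Func<T, object> selector)
{
    try
    {
        return JsonSerializer.Serialize(items.Select(selector).ToArray());
    }
    catch (Exception ex) when (ex is JsonException or NotSupportedException)
    {
        // Circular references, unsupported types, etc.: drop the attribute
        // (returning null) rather than propagating out of instrumentation code.
        return null;
    }
}
```

Callers would need to tolerate a `null` return by skipping the corresponding `SetData` call.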
```csharp
internal static ConcurrentDictionary<ChatMessage, ISpan> GetMessageToSpanDict(ChatOptions? options = null)
{
    if (options?.AdditionalProperties?.TryGetValue<ConcurrentDictionary<ChatMessage, ISpan>>(
        SentryAIConstants.OptionsAdditionalAttributeAgentSpanName, out var agentSpanDict) == true)
    {
        return agentSpanDict;
    }

    // If we couldn't find the dictionary, we just initiate it now
    agentSpanDict = new ConcurrentDictionary<ChatMessage, ISpan>();
    if (options == null)
    {
        return agentSpanDict;
    }

    options.AdditionalProperties ??= new AdditionalPropertiesDictionary();
```
The GetMessageToSpanDict method has a subtle pattern: it checks for an existing dictionary, and if not found, creates a new one. While the TryAdd is thread-safe, the overall flow could benefit from a comment explaining that this dictionary persists across multiple GetResponseAsync/GetStreamingResponseAsync calls for the same ChatMessage to support tool-calling loops. Document why the dictionary is stored in ChatOptions.AdditionalProperties instead of using a class-level concurrent collection.
Severity: MEDIUM
Location: src/Sentry.Extensions.AI/SentryChatClient.cs#L163-L179
```csharp
{
    arguments.Remove(SentryAIConstants.KeyMessageFunctionArgumentDictKey);
}
```
The RemoveSentryArgs method modifies the caller's AIFunctionArguments dictionary in-place by removing the Sentry-injected key. While necessary to hide implementation details, this mutates the arguments after the user has populated them but before the actual function call. Add a comment explaining why this removal is necessary and why it's safe to modify arguments at this point. Consider whether this could cause issues if the user inspects their arguments dictionary.
Severity: MEDIUM
Location: src/Sentry.Extensions.AI/SentryInstrumentedFunction.cs#L65-L67
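An annotated sketch of what such a comment might say; the method shape is inferred from the excerpt above and is not this PR's verbatim code:

```csharp
// Sketch only. Strips the key Sentry injected before dispatch so the user's
// function never observes internal arguments.
private static void RemoveSentryArgs(AIFunctionArguments arguments)
{
    // The key exists solely to correlate this invocation with its parent span;
    // no downstream consumer reads it, so removing it here is safe. Note this
    // mutates the caller's dictionary in place, which would be visible to a
    // user who captured a reference to it before the call.
    arguments.Remove(SentryAIConstants.KeyMessageFunctionArgumentDictKey);
}
```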
```csharp
/// We create an entry in _spans concurrent dictionary to keep track of
/// what root span to use in consequent calls of <see cref="GetResponseAsync"/> or <see cref="GetStreamingResponseAsync"/>
/// </summary>
/// <param name="message"></param>
/// <param name="options"></param>
/// <returns></returns>
private ISpan CreateOrGetRootSpan(ChatMessage message, ChatOptions? options)
{
    var spanDict = GetMessageToSpanDict(options);
    if (!spanDict.TryGetValue(message, out var rootSpan))
    {
        var invokeSpanName = $"invoke_agent {InnerClient.GetType().Name}";
        const string invokeOperation = "gen_ai.invoke_agent";
        rootSpan = _hub.StartSpan(invokeOperation, invokeSpanName);
        rootSpan.SetData("gen_ai.agent.name", $"{InnerClient.GetType().Name}");
    }
```
The span lifecycle management logic is complex: outer spans persist across multiple chat calls when FinishReason != Stop, while inner spans are finished immediately. Add more detailed XML documentation comments to the CreateOrGetRootSpan method and the overall span management strategy to help future maintainers understand why spans are created/finished at these specific points. This complexity is error-prone if modifications are made.
Severity: MEDIUM
Location: src/Sentry.Extensions.AI/SentryChatClient.cs#L135-L150
```csharp
#nullable enable
using Microsoft.Extensions.AI;

namespace Sentry.Extensions.AI.Tests;

public class SentryChatClientTests
{
    [Fact]
    public async Task CompleteAsync_CallsInnerClient()
    {
        var inner = Substitute.For<IChatClient>();
        var message = new ChatMessage(ChatRole.Assistant, "ok");
        var chatResponse = new ChatResponse(message);
        inner.GetResponseAsync(Arg.Any<IList<ChatMessage>>(), Arg.Any<ChatOptions>(), Arg.Any<CancellationToken>())
            .Returns(Task.FromResult(chatResponse));

        var sentryChatClient = new SentryChatClient(inner);

        var res = await sentryChatClient.GetResponseAsync([new ChatMessage(ChatRole.User, "hi")], null);

        Assert.Equal([message], res.Messages);
        await inner.Received(1).GetResponseAsync(Arg.Any<IList<ChatMessage>>(), Arg.Any<ChatOptions>(),
            Arg.Any<CancellationToken>());
    }

    [Fact]
    public async Task CompleteStreamingAsync_CallsInnerClient()
    {
        var inner = Substitute.For<IChatClient>();

        inner.GetStreamingResponseAsync(Arg.Any<IList<ChatMessage>>(), Arg.Any<ChatOptions>(),
                Arg.Any<CancellationToken>())
            .Returns(CreateTestStreamingUpdatesAsync());

        var client = new SentryChatClient(inner);

        var results = new List<ChatResponseUpdate>();
        await foreach (var update in client.GetStreamingResponseAsync([new ChatMessage(ChatRole.User, "hi")], null))
        {
            results.Add(update);
        }

        Assert.Equal(2, results.Count);
        Assert.Equal("Hello", results[0].Text);
        Assert.Equal(" World!", results[1].Text);

        inner.Received(1).GetStreamingResponseAsync(Arg.Any<IList<ChatMessage>>(), Arg.Any<ChatOptions>(),
            Arg.Any<CancellationToken>());
    }

    private static async IAsyncEnumerable<ChatResponseUpdate> CreateTestStreamingUpdatesAsync()
    {
        yield return new ChatResponseUpdate(ChatRole.System, "Hello");
        await Task.Yield(); // Make it async
        yield return new ChatResponseUpdate(ChatRole.System, " World!");
```
The tests only verify that the inner client is called, but don't validate that spans are created, enriched, and finished correctly. Add integration tests that verify: (1) span names match expected patterns, (2) request/response data is properly enriched, (3) spans are finished with correct status, (4) exception handling properly finishes spans with error status. These tests would catch regressions in the core instrumentation logic.
Severity: MEDIUM
Location: test/Sentry.Extensions.AI.Tests/SentryChatClientTests.cs#L1-L55
WIP!!
For now, here's a quick screenshot of what it looks like right now