feat(spans): initial MongoDB description scrubbing support #3912
| Diff line change |
|---|
| @@ -99,6 +99,13 @@ pub enum Feature { |
| #[serde(rename = "organizations:indexed-spans-extraction")] |
| ExtractSpansFromEvent, |
| |
| /// Enables description scrubbing for MongoDB spans (and consequently, their presence in the |
| /// Queries module inside Sentry). |
| /// |
| /// Serialized as `organizations:performance-queries-mongodb-extraction`. |
| #[serde(rename = "organizations:performance-queries-mongodb-extraction")] |
| ScrubMongoDBDescriptions, |
| const DISABLED_DATABASES: &[&str] = &[ | |
| "*clickhouse*", | |
| "*compile*", | |
| "*mongodb*", | |
| "*redis*", | |
| "db.orm", | |
| ]; |
relay/relay-dynamic-config/src/defaults.rs
Lines 123 to 126 in 70c15da
| let is_db = RuleCondition::eq("span.sentry_tags.category", "db") | |
| & !(RuleCondition::eq("span.system", "mongodb") | |
| | RuleCondition::glob("span.op", DISABLED_DATABASES) | |
| | RuleCondition::glob("span.description", MONGODB_QUERIES)); |
`is_db` is used to control the presence of certain metrics and tags in `hardcoded_span_metrics`.
This PR removes that check, which means we will produce more metrics. However, note that:
- a number of SDKs (including v8 of the JS SDK) don't actually have `mongodb` as part of the span `op`, and were never subject to this check
- high cardinality of the resulting metrics comes from the `span.description` tag, which is protected against separately (see following)
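As a rough illustration of the point above, here is a plain-boolean model of the `span.op` part of `is_db`. It assumes the check being removed is the `*mongodb*` entry in `DISABLED_DATABASES`; the `span.system` and `span.description` clauses from the snippet are left out, and `simple_glob`, `matches_any`, and this `is_db` function are hypothetical stand-ins, not Relay's `RuleCondition` API.

```rust
/// Span ops excluded from `is_db`, copied from the snippet above.
const DISABLED_DATABASES: &[&str] = &[
    "*clickhouse*",
    "*compile*",
    "*mongodb*",
    "*redis*",
    "db.orm",
];

/// Hypothetical, simplified glob: `*x*` means "contains x",
/// any other pattern is treated as an exact match.
fn simple_glob(pattern: &str, value: &str) -> bool {
    match pattern.strip_prefix('*').and_then(|p| p.strip_suffix('*')) {
        Some(inner) => value.contains(inner),
        None => pattern == value,
    }
}

/// True if any pattern in the list matches the value.
fn matches_any(patterns: &[&str], value: &str) -> bool {
    patterns.iter().any(|p| simple_glob(p, value))
}

/// Simplified model of the rule: category is `db` and the op is not disabled.
fn is_db(category: &str, op: &str, disabled: &[&str]) -> bool {
    category == "db" && !matches_any(disabled, op)
}

fn main() {
    // The same list with the `*mongodb*` entry dropped (assumed effect of this PR).
    let relaxed: Vec<&str> = DISABLED_DATABASES
        .iter()
        .copied()
        .filter(|p| *p != "*mongodb*")
        .collect();

    for op in ["db", "db.mongodb.find", "db.redis"] {
        println!(
            "op={op}: before={}, after={}",
            is_db("db", op, DISABLED_DATABASES),
            is_db("db", op, &relaxed)
        );
    }
}
```

In this model, a span whose op is just `db` counts as `is_db` under both lists, which matches the observation that SDKs not putting `mongodb` in the op were never subject to the check.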
The second existing special case is in the span description scrubbing.
relay/relay-event-normalization/src/normalize/span/description/mod.rs
Lines 63 to 70 in 70c15da
| ("db", sub) => { | |
| if sub.contains("clickhouse") | |
| || sub.contains("mongodb") | |
| || sub.contains("redis") | |
| || is_legacy_activerecord(sub, db_system) | |
| || is_sql_mongodb(description, db_system) | |
| { | |
| None |
We check again for `mongodb` in the op (which, as I mentioned, is often already bypassed), but the `is_sql_mongodb` function catches them all at this point by checking for a `db.system` of `mongodb` or a JSON-looking description. This is the handling we fall through to when the feature is off.
So:
- This PR will change the situation from "some MongoDB spans are counted as `is_db`" to "all MongoDB spans are counted as `is_db`", but
- There will be no increase in cardinality, as all MongoDB span descriptions will continue to be `None` in cases where the flag is not set.
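The flag-off fall-through described above can be sketched roughly as follows. This is a simplified model, not the code in `description/mod.rs`: "JSON-looking" is assumed to mean "starts with `{`", the `is_legacy_activerecord` clause is omitted, and `scrub_db_description` is a hypothetical name whose non-`None` branch just echoes the input where the real SQL scrubber would run.

```rust
/// Assumed shape of the check: `db.system` is `mongodb`, or the
/// description looks like JSON (modeled here as "starts with `{`").
fn is_sql_mongodb(description: &str, db_system: Option<&str>) -> bool {
    db_system == Some("mongodb") || description.trim_start().starts_with('{')
}

/// Hypothetical stand-in for the `("db", sub)` match arm.
/// Returns `None` when the span gets no scrubbed description.
fn scrub_db_description(
    sub_op: &str,
    description: &str,
    db_system: Option<&str>,
) -> Option<String> {
    if sub_op.contains("clickhouse")
        || sub_op.contains("mongodb")
        || sub_op.contains("redis")
        || is_sql_mongodb(description, db_system)
    {
        return None;
    }
    // Placeholder: the real code would run the SQL scrubber here.
    Some(description.to_owned())
}

fn main() {
    // No `mongodb` in the op, but the JSON-looking description is still caught.
    println!("{:?}", scrub_db_description("", r#"{"find": "users"}"#, None));
    // Caught through `db.system` alone.
    println!("{:?}", scrub_db_description("", "users.find", Some("mongodb")));
    // An ordinary SQL span passes through to the scrubber placeholder.
    println!("{:?}", scrub_db_description("sql.query", "SELECT 1", Some("postgresql")));
}
```

Under these assumptions, every MongoDB span ends up with a `None` description regardless of whether its op mentions `mongodb`, which is why no cardinality increase is expected while the flag is off.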
In an ideal world, I would have kept the behaviour exactly the same and only conditionally changed `DISABLED_DATABASES`. But, looking at how `hardcoded_span_metrics` works, I can't see any way to get feature flags down to it. So I reasoned that the outcomes above were probably acceptable, but I could definitely be wrong! Please let me know what you think. Thanks so much 🙏
> a number of SDKs (including v8 of the JS SDK) don't actually have `mongodb` as part of the span `op`, and were never subject to this check

I'd be worried about the OTel integrations here, not necessarily the SDKs.

> high cardinality of the resulting metrics comes from the `span.description` tag, which is protected against separately (see following)

If we have some confidence in the scrubbing (which seems pretty sound to me; we can even start with a lower 'recursion limit'), I'd not use the feature flag at all. You will only find the outliers after it's too late anyway. And the feature flag just adds more conditionals and boilerplate while only toggling scrubbing, not the full extraction.
That's the same approach that was taken for Redis.
nit:
| - ScrubMongoDBDescriptions, |
| + ScrubMongoDbDescriptions, |
| Diff line change |
|---|
| @@ -30,6 +30,7 @@ use uuid::Uuid; |
| |
| use crate::normalize::request; |
| use crate::span::ai::normalize_ai_measurements; |
| use crate::span::description::ScrubMongoDescription; |
| use crate::span::tag_extraction::extract_span_tags_from_event; |
| use crate::utils::{self, get_event_user_tag, MAX_DURATION_MOBILE_MS}; |
| use crate::{ |
| @@ -158,6 +159,9 @@ pub struct NormalizationConfig<'a> { |
| /// Controls list of hosts to be excluded from scrubbing |
| pub span_allowed_hosts: &'a [Host], |
| |
| /// Controls whether or not MongoDB span descriptions will be scrubbed. |
| pub scrub_mongo_description: ScrubMongoDescription, |
| } |
| |
| impl<'a> Default for NormalizationConfig<'a> { |
| @@ -190,6 +194,7 @@ impl<'a> Default for NormalizationConfig<'a> { |
| normalize_spans: true, |
| replay_id: Default::default(), |
| span_allowed_hosts: Default::default(), |
| scrub_mongo_description: ScrubMongoDescription::Disabled, |
| } |
| } |
| } |
| @@ -332,6 +337,7 @@ fn normalize(event: &mut Event, meta: &mut Meta, config: &NormalizationConfig) { |
| event, |
| config.max_tag_value_length, |
| config.span_allowed_hosts, |
| config.scrub_mongo_description.clone(), |
| ); |
| } |
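To make the wiring in this diff easier to follow, here is a minimal, self-contained sketch of how such a setting could be modeled. Only `ScrubMongoDescription::Disabled` and the `scrub_mongo_description` field appear in the diff; the `Enabled` variant, the `scrub_setting` helper, and the trimmed-down `NormalizationConfig` below are assumptions made for illustration.

```rust
/// Only `Disabled` is visible in the diff; `Enabled` is an assumed counterpart.
#[derive(Clone, Debug, Default, PartialEq, Eq)]
enum ScrubMongoDescription {
    Enabled,
    #[default]
    Disabled,
}

/// Hypothetical helper: map a feature-flag lookup to the config value.
fn scrub_setting(feature_enabled: bool) -> ScrubMongoDescription {
    if feature_enabled {
        ScrubMongoDescription::Enabled
    } else {
        ScrubMongoDescription::Disabled
    }
}

/// Trimmed-down stand-in for the real config struct (other fields omitted).
#[derive(Debug, Default)]
struct NormalizationConfig {
    scrub_mongo_description: ScrubMongoDescription,
}

fn main() {
    // Default config keeps scrubbing off, mirroring the `Default` impl in the diff.
    let default_config = NormalizationConfig::default();
    println!("{:?}", default_config.scrub_mongo_description);

    // A project with the feature flag set would opt in.
    let config = NormalizationConfig {
        scrub_mongo_description: scrub_setting(true),
    };
    println!("{:?}", config.scrub_mongo_description);
}
```

A two-variant enum reads more clearly than a bare `bool` at call sites such as the `normalize` function, at the cost of some extra boilerplate.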
Took me a while to parse this condition, might benefit from a comment along the lines of "we disallow mongodb, unless `span.system` is set to `mongodb`".