Labels
A-kv: Anything in KV that doesn't belong in a more specific category.
A-kv-rangefeed: Rangefeed infrastructure, server+client
C-bug: Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
O-support: Would prevent or help troubleshoot a customer escalation - bugs, missing observability/tooling, docs
O-testcluster: Issues found or occurred on a test cluster, i.e. a long-running internal cluster
P-1: Issues/test failures with a fix SLA of 1 month
T-kv: KV Team
branch-master: Failures and bugs on the master branch.
target-release-26.1.0
Description
On the drt-large cluster, node 12 was killed due to an Out of Memory (OOM) error. Prior to the OOM event, there was a noticeable increase in cgo memory usage. At the time of this event, 100 changefeeds were running and experiencing lag.

Memory and CPU profiles captured around the time of the OOM are attached.
memory-analysis.zip
Related discussion in Slack thread.
Jira issue: CRDB-43801