@esensar esensar commented Oct 25, 2025

Summary

This prevents late messages from pushing utilization outside its expected bounds by clamping every late message to the start of the reporting period. Messages arrive in order, so clamping should not break the expected state (waiting or not waiting). The utilization metric can still be slightly inaccurate when messages are late, but it was never meant to be completely accurate.
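
The clamping described above can be sketched roughly as follows. This is a simplified illustration, not Vector's actual implementation: `UtilizationSketch`, its fields, and the `clamp`, `stop_wait` and `report` methods are hypothetical names chosen for the example.

```rust
use std::time::{Duration, Instant};

/// Hypothetical, simplified tracker; not Vector's real utilization timer.
struct UtilizationSketch {
    /// Start of the current reporting period.
    period_start: Instant,
    /// Start of the current waiting or working span.
    span_start: Instant,
    /// Whether the component is currently waiting for input.
    waiting: bool,
    /// Wait time accumulated within the current reporting period.
    total_wait: Duration,
}

impl UtilizationSketch {
    fn new(now: Instant) -> Self {
        Self {
            period_start: now,
            span_start: now,
            waiting: false,
            total_wait: Duration::ZERO,
        }
    }

    /// Clamp a possibly late timestamp so it never precedes the period start.
    fn clamp(&self, at: Instant) -> Instant {
        at.max(self.period_start)
    }

    fn start_wait(&mut self, at: Instant) {
        if !self.waiting {
            let at = self.clamp(at);
            self.end_span(at);
            self.waiting = true;
        }
    }

    fn stop_wait(&mut self, at: Instant) {
        if self.waiting {
            let at = self.clamp(at);
            self.end_span(at);
            self.waiting = false;
        }
    }

    /// Close the current span at `at`, counting it as wait time if waiting.
    fn end_span(&mut self, at: Instant) {
        if self.waiting {
            // `duration_since` yields zero if `at` is before `span_start`.
            self.total_wait += at.duration_since(self.span_start);
        }
        self.span_start = at;
    }

    /// Close the reporting period at `now` and return the utilization ratio.
    fn report(&mut self, now: Instant) -> f64 {
        self.end_span(now);
        let elapsed = now.duration_since(self.period_start);
        let wait_ratio = if elapsed.is_zero() {
            0.0
        } else {
            self.total_wait.as_secs_f64() / elapsed.as_secs_f64()
        };
        // Start a fresh reporting period.
        self.period_start = now;
        self.span_start = now;
        self.total_wait = Duration::ZERO;
        1.0 - wait_ratio
    }
}
```

Because every timestamp is clamped to `period_start`, no span in this sketch can contribute wait time from before the current period, so the accumulated wait cannot exceed the period length as long as timestamps do not lie in the future.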

How did you test this PR?

Tested with the included unit tests.

Change Type

  • Bug fix
  • New feature
  • Non-functional (chore, refactoring, docs)
  • Performance

Is this a breaking change?

  • Yes
  • No

Does this PR include user facing changes?

  • Yes. Please add a changelog fragment based on our guidelines.
  • No. A maintainer will apply the no-changelog label to this PR.

References

Notes

  • Please read our Vector contributor resources.
  • Do not hesitate to use @vectordotdev/vector to reach out to us regarding this PR.
  • Some CI checks run only after we manually approve them.
    • We recommend adding a pre-push hook, please see this template.
    • Alternatively, we recommend running the following locally before pushing to the remote branch:
      • make fmt
      • make check-clippy (if there are failures it's possible some of them can be fixed with make clippy-fix)
      • make test
  • After a review is requested, please avoid force pushes to help us review incrementally.
    • Feel free to push as many commits as you want. They will be squashed into one before merging.
    • For example, you can run git merge origin master and git push.
  • If this PR introduces changes to Vector dependencies (modifies Cargo.lock), please
    run make build-licenses to regenerate the license inventory and commit the changes (if any). More details here.

Sponsored by Quad9

Fixes: vectordotdev#24060
@esensar esensar requested a review from a team as a code owner October 25, 2025 07:52
Comment on lines +129 to +130
let now = Instant::now();
self.end_span(now);

I assume these return values for start_wait, end_wait and end_span were left over from a previous iteration of this change. I have removed them for clarity.

Comment on lines +159 to +161
// At can be before span start here, the result will be clamped to 0
// because `duration_since` returns zero if at is before span start
self.total_wait += at.duration_since(self.span_start);

This is equivalent to using Sub, but I thought duration_since would make the clamping clearer.
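
A minimal sketch of the saturating behaviour this comment relies on, assuming Rust 1.60 or later, where both `Instant::duration_since` and the `Sub` operator return zero instead of panicking when the argument is later than `self`. The standard library documentation notes that the panic may be reintroduced in future versions. `wait_increment` is a hypothetical helper, not a function from the PR.

```rust
use std::time::{Duration, Instant};

/// Hypothetical helper: the wait time to add for a span ending at `at`.
/// Returns zero when `at` is earlier than `span_start`.
fn wait_increment(span_start: Instant, at: Instant) -> Duration {
    at.duration_since(span_start)
}

/// Same computation with the saturation spelled out explicitly.
fn wait_increment_explicit(span_start: Instant, at: Instant) -> Duration {
    at.saturating_duration_since(span_start)
}
```

If a late timestamp should be detected rather than silently clamped, `Instant::checked_duration_since` returns `None` in that case.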

Successfully merging this pull request may close these issues.

Vector utilization metric emits negative values after upgrade to v0.49.0
