-
Notifications
You must be signed in to change notification settings - Fork 1.9k
chore(vrl): use new value crate for type checking in VRL
#11296
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
chore(vrl): use new value crate for type checking in VRL
#11296
Conversation
|
✔️ Deploy Preview for vector-project canceled. 🔨 Explore the source changes: 49d6618 🔍 Inspect the deploy log: https://app.netlify.com/sites/vector-project/deploys/6204e5aabeec8a0007ef510f |
|
I'm still investigating the root-cause of this breaking change. I've got this minimal reproducible example to pass the tests on this branch, now I'm seeing if it would fail on master: On first sight, this should not be allowed to compile, since we cannot be sure that |
|
Verified that this "passes" on master, but then subsequently fails at runtime: |
|
I'm not sure what changes I made triggered this to be resolved, so I'll have to dig into that, to know the extent to which this addresses an issue in VRL. It might be limited in scope to the |
|
Actually, I now realize this is not a problem in master, but is a problem in my branch. I forgot that a fallible assignment does not set the I'll investigate further why this regression was introduced, and why it wasn't caught in any of the existing test cases. |
|
Verified that the issue has to do with the In master, the type definition for Debugging further to find the actual root cause of the type def returning |
|
Long story short, this happened because the new This enum was always something I regretted introducing because (1) it made for an awkward API to use, and (2) it gave a false sense of type-safety. Specifically, "unknown" was basically the same as "known, but this value can resolve to all types", and there was nothing in the API stopping from expressing an unknown state as the latter, but the internals would behave differently whether they got an The new API does away with this distinction, the type state is merely a sum of all potential types a value can be, and what was previously "unknown" is now merely "known, and all type states can apply". The hidden bug here had to do with the This was one of the places you could cause unintended behaviour because if you merged two known type definitions, but one was in the "known, but all states apply" state, then it wouldn't discard that one, but would do an actual merge of the two type definitions, resulting in a different outcome. The new I went over all places in the code that used As for this not being caught in tests. The problem here is that we don't have a sufficiently good unit-testing framework to test type definition outcomes for individual expressions. All of our tests rely on inspecting the runtime outcome. The one part that does surface this, is our VRL test harness (in I'd love for there to be a better way to test this, but there currently isn't. As for the |
Signed-off-by: Jean Mertz <[email protected]>
Soak Test ResultsBaseline: 1841505 ExplanationA soak test is an integrated performance test for vector in a repeatable rig, with varying configuration for vector. What follows is a statistical summary of a brief vector run for each configuration across SHAs given above. The goal of these tests are to determine, quickly, if vector performance is changed and to what degree by a pull request. Where appropriate units are scaled per-core. The table below, if present, lists those experiments that have experienced a statistically significant change in their throughput performance between baseline and comparision SHAs, with 90.0% confidence OR have been detected as newly erratic. Negative values mean that baseline is faster, positive comparison. Results that do not exhibit more than a ±8.87% change in mean throughput are discarded. An experiment is erratic if its coefficient of variation is greater than 0.3. The abbreviated table will be omitted if no interesting changes are observed. No interesting changes with confidence ≥ 90.00%: Fine details of change detection per experiment.
Fine details of each soak run.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good 👍
We don't expect this to land in 0.20.0, do we? Given that this PR contains significant complexity and no user-facing change, I'd like to use the next release cycle to iron out any potential issues left.
If we merge, I'd like to make sure that 0.20.0 is cut before this commit. I'm surprised this hasn't been the case already.
| // NOTE: technically we shouldn't do this, but to keep backward compatibility with the | ||
| // "old" `Kind` implementation, we consider a type that is marked as "any" equal to the old | ||
| // implementation's `unknown`, which, when merging, would discard that type. | ||
| // | ||
| // see: https://github.com/vectordotdev/vector/blob/18415050b60b08197e8135b7659390256995e844/lib/vrl/compiler/src/type_def.rs#L428-L429 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not clear to me which "right" way you envision here or does this comment not imply that we should change any of this logic in the future?
If we intend to change it, I'd suggest opening an issue with the proposed logic and a migration plan.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll likely introduce a new merge Strategy option that allows ignoring any values. It's not "bad" per sé, but it is confusing behaviour without explicitly requesting it as the caller of a type definition merge.
For now, that strategy doesn't exist in the value crate, so it lives here in VRL, I'll think on what the best next steps would be to either remove the need for this entirely, or move it to a more "official" place as a strategy in the value crate.
Reverts #11287