-
Notifications
You must be signed in to change notification settings - Fork 5.5k
feat(testing): Add nondeterministic verifier retry #26696
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Reviewer's GuideAdds a new verifier config flag and behavior to automatically resubmit nondeterministic query verifications instead of marking them as skipped, plus tests and config wiring. Sequence diagram for nondeterministic verification resubmission flowsequenceDiagram
actor VerifierUser
participant VerifierCli
participant VerificationManager
participant AbstractVerification
participant DeterminismAnalyzer
VerifierUser->>VerifierCli: run_verifier_with_optional_flag(resubmit_nondeterministic_queries)
VerifierCli->>VerificationManager: start_verification_jobs()
VerificationManager->>AbstractVerification: verify()
AbstractVerification->>AbstractVerification: run_control_and_test_queries()
AbstractVerification->>AbstractVerification: compare_results()
AbstractVerification-->>VerificationManager: MatchResult mismatch_possibly_caused_by_nondeterminism
VerificationManager->>AbstractVerification: analyze_determinism()
AbstractVerification->>DeterminismAnalyzer: analyzeDeterminism(control_query_result, test_query_result, match_result)
DeterminismAnalyzer-->>AbstractVerification: DeterminismAnalysisDetails
alt resubmitNondeterministicQueries_enabled_and_limit_not_reached
AbstractVerification->>AbstractVerification: check resubmitNondeterministicQueries
AbstractVerification->>AbstractVerification: check verificationContext.getResubmissionCount() < verificationResubmissionLimit
AbstractVerification-->>VerificationManager: VerificationResult(resubmissionRequested = true)
VerificationManager->>VerificationManager: increment_resubmission_count()
VerificationManager->>AbstractVerification: verify() %% resubmitted verification
else flag_disabled_or_limit_reached
AbstractVerification-->>VerificationManager: VerificationResult(resubmissionRequested = false)
VerificationManager-->>VerifierCli: report_nondeterministic_failure()
VerifierCli-->>VerifierUser: display_nondeterministic_failure()
end
Updated class diagram for verifier nondeterministic resubmission configurationclassDiagram
class VerifierConfig {
-boolean skipChecksum
-boolean runDeterminismAnalysisOnTest
-boolean concurrentControlAndTest
-boolean resubmitNondeterministicQueries
+boolean isSkipChecksum()
+boolean isRunDeterminismAnalysisOnTest()
+boolean isConcurrentControlAndTest()
+boolean isResubmitNondeterministicQueries()
+VerifierConfig setConcurrentControlAndTest(boolean concurrentControlAndTest)
+VerifierConfig setResubmitNondeterministicQueries(boolean resubmitNondeterministicQueries)
}
class AbstractVerification {
#boolean isExplain
#boolean isRunDeterminismAnalysisOnTest
-boolean concurrentControlAndTest
-boolean resubmitNondeterministicQueries
-int verificationResubmissionLimit
+AbstractVerification(VerifierConfig verifierConfig, QueryBundleFactory queryBundleFactory, VerificationContext verificationContext, int verificationResubmissionLimit)
+VerificationResult verify()
-DeterminismAnalysisDetails analyzeDeterminism(QueryBundle control, QueryBundle test, MatchResult matchResult)
}
class VerificationContext {
+int getResubmissionCount()
}
class VerificationResult {
+VerificationResult(AbstractVerification verification, boolean resubmissionRequested, Optional determinismAnalysisDetails)
}
class MatchResult {
+boolean isMismatchPossiblyCausedByNonDeterminism()
}
class DeterminismAnalysisDetails {
+DeterminismAnalysis getDeterminismAnalysis()
}
class DeterminismAnalysis {
+boolean isNonDeterministic()
}
class QueryBundleFactory
class QueryBundle
AbstractVerification --> VerifierConfig : reads_configuration
AbstractVerification --> VerificationContext : uses_resubmission_count
AbstractVerification --> VerificationResult : creates
AbstractVerification --> MatchResult : evaluates
AbstractVerification --> DeterminismAnalysisDetails : computes
DeterminismAnalysisDetails --> DeterminismAnalysis : contains
AbstractVerification --> QueryBundleFactory : creates_bundles
AbstractVerification --> QueryBundle : uses_in_verification
VerifierConfig o-- AbstractVerification : provides_flags_and_limits
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey there - I've reviewed your changes - here's some feedback:
- In AbstractVerification, the boolean flag in
new VerificationResult(this, true, Optional.empty())is not self-explanatory in the context of a resubmission path; consider using a factory or named helper to make this code path’s intent (resubmission vs. normal success/failure) clearer to future readers. - The
@ConfigDescriptionforresubmit-nondeterministic-queriescould be improved by explicitly documenting its interaction withverificationResubmissionLimitso that operators understand when resubmissions will actually occur and when they will stop.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In AbstractVerification, the boolean flag in `new VerificationResult(this, true, Optional.empty())` is not self-explanatory in the context of a resubmission path; consider using a factory or named helper to make this code path’s intent (resubmission vs. normal success/failure) clearer to future readers.
- The `@ConfigDescription` for `resubmit-nondeterministic-queries` could be improved by explicitly documenting its interaction with `verificationResubmissionLimit` so that operators understand when resubmissions will actually occur and when they will stop.
## Individual Comments
### Comment 1
<location> `presto-verifier/src/test/java/com/facebook/presto/verifier/framework/TestDataVerification.java:275-284` </location>
<code_context>
assertDeterminismAnalysisRun(runs.get(0), true);
}
+ @Test
+ public void testNondeterministicResubmission()
+ {
+ VerificationSettings settingsWithFlag = new VerificationSettings();
+ settingsWithFlag.resubmitNondeterministicQueries = Optional.of(true);
+
+ Optional<VerifierQueryEvent> eventWithFlag = runVerification("SELECT rand()", "SELECT 2.0", settingsWithFlag);
+ assertFalse(eventWithFlag.isPresent());
+
+ Optional<VerifierQueryEvent> eventWithoutFlag = runVerification("SELECT rand()", "SELECT 2.0");
+ assertTrue(eventWithoutFlag.isPresent());
+ assertEquals(eventWithoutFlag.get().getStatus(), SKIPPED.name());
+ assertEquals(eventWithoutFlag.get().getSkippedReason(), NON_DETERMINISTIC.name());
+ }
+
</code_context>
<issue_to_address>
**issue (testing):** Add tests for resubmission limit and deterministic mismatch cases when resubmitNondeterministicQueries is enabled
Please also cover two important edge cases:
- With resubmitNondeterministicQueries = true but a deterministic mismatch, confirm we do not resubmit and still produce the appropriate SKIPPED/FAILED event.
- With resubmitNondeterministicQueries = true and a nondeterministic mismatch, but the resubmission count already at verificationResubmissionLimit, confirm we do not resubmit again and instead emit a SKIPPED/NON_DETERMINISTIC event.
A small focused test using a low resubmission limit (e.g., 0 or 1) that forces deterministic vs nondeterministic mismatches would validate the new branching in AbstractVerification across these cases.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| @Test | ||
| public void testNondeterministicResubmission() | ||
| { | ||
| VerificationSettings settingsWithFlag = new VerificationSettings(); | ||
| settingsWithFlag.resubmitNondeterministicQueries = Optional.of(true); | ||
|
|
||
| Optional<VerifierQueryEvent> eventWithFlag = runVerification("SELECT rand()", "SELECT 2.0", settingsWithFlag); | ||
| assertFalse(eventWithFlag.isPresent()); | ||
|
|
||
| Optional<VerifierQueryEvent> eventWithoutFlag = runVerification("SELECT rand()", "SELECT 2.0"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue (testing): Add tests for resubmission limit and deterministic mismatch cases when resubmitNondeterministicQueries is enabled
Please also cover two important edge cases:
- With resubmitNondeterministicQueries = true but a deterministic mismatch, confirm we do not resubmit and still produce the appropriate SKIPPED/FAILED event.
- With resubmitNondeterministicQueries = true and a nondeterministic mismatch, but the resubmission count already at verificationResubmissionLimit, confirm we do not resubmit again and instead emit a SKIPPED/NON_DETERMINISTIC event.
A small focused test using a low resubmission limit (e.g., 0 or 1) that forces deterministic vs nondeterministic mismatches would validate the new branching in AbstractVerification across these cases.
NikhilCollooru
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
Summary: We have a verifier usecase where partitions may still be landing when the verifier is running. If the partition becomes available in between when the control and test query runs, the test is marked nondeterministic but still a failure. We actually aren't interested in this case and don't want to spend manual effort resolving it, we would rather re-run the test immediately in hopes that the nondeterminism goes away (e.g. that partitions don't continue to land). Adding a flag to that effect There is not currently a mechanism to have an inline retry, so in this diff I am adding an additional CLI option to resubmit queries if determinism analysis has proven them to be nondeterministic e.g., if fresh data has arrived and altered the results after the control query has run. Reviewed By: NikhilCollooru Differential Revision: D87831484
055cfb3 to
e763688
Compare
Description
Adding a verifier option to retry any queries which were shown to be nondeterministic during determinism analysis.
Motivation and Context
In some cases, fresh data may be landing on the source tables as the verifier is running. e.g., in our case, the source tables have a partition landing between when the control and test query is running. This is not a meaningful difference between source and test query, so we would like to retry those automatically and immediately without performing a retest.
Impact
Verifier runs can now automatically retry queries which failed due to transient nondeterminism.
Test Plan
Added test coverage to TestVerifierConfig + the E2E DataVerificationTest
Contributor checklist
Release Notes
Differential Revision: D87831484