-
Notifications
You must be signed in to change notification settings - Fork 4k
serverutils: probabilistic shared process virtual cluster #112822
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
serverutils: probabilistic shared process virtual cluster #112822
Conversation
d2cddbd to
3181940
Compare
2c67f9a to
53ceef5
Compare
ae31890 to
8796cd9
Compare
b5b461b to
a03af6d
Compare
a03af6d to
30f8feb
Compare
herkolategan
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @adityamaru, @DarrylWong, @dhartunian, @jayshrivastava, @srosenberg, and @yuzefovich)
pkg/server/testserver.go line 743 at r3 (raw file):
Previously, herkolategan (Herko Lategan) wrote…
I agree and I had the same thought, my only rotational was that I didn't fully comprehend the differences in capabilities between external and shared. But it's clear now that it's the same, except for I think
can debug processworks on external without explicitly setting it here. Busy merging these two, as well as taking the comment around selective capabilities into account.
Merged the two, but there is still a bit of work left I intend to complete in a follow-up PR: #107695, #113187
pkg/server/testserver.go line 763 at r3 (raw file):
Previously, herkolategan (Herko Lategan) wrote…
Will be done in addition to removing the duplication of process capabilities. I'll just ask for a second look once I'm done.
I've added the check you have mentioned as well.
yuzefovich
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a few more comments, but it It might still be worth it to wait for @stevendanna to take a look at least at the testserver changes.
Reviewed 48 of 48 files at r6, 4 of 4 files at r7, 1 of 1 files at r8, 25 of 25 files at r9, 21 of 21 files at r10, 1 of 1 files at r11, 2 of 2 files at r12, 2 of 2 files at r13, all commit messages.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @adityamaru, @DarrylWong, @dhartunian, @herkolategan, @jayshrivastava, and @srosenberg)
pkg/server/testserver.go line 762 at r12 (raw file):
} // Waiting for capabilities can time To avoid paying this cost in all
nit: s/can time/can time./.
pkg/testutils/serverutils/test_server_shim.go line 105 at r6 (raw file):
shared = false default: shared = util.ConstantWithMetamorphicTestBoolWithoutLogging("test-tenant-shared-process", false)
nit: what's the rationale for using metamorphic helper here? I'd have expected that here we'd give each mode 50% probability, and the current way will pick the external process more frequently.
pkg/testutils/serverutils/test_server_shim.go line 115 at r6 (raw file):
// Obey the env override if present. if str, present := envutil.EnvString("COCKROACH_TEST_TENANT", 0); present {
It might be nice to have a way to force the default test tenant with a particular process mode. We could either adjust the existing env var to take in a string or something, or introduce another bool env var that controls whether the shared process is picked (I'd probably prefer the latter option slightly), or perhaps in some other fashion. This doesn't have to be in this PR, but please leave a TODO for this if it's not done.
pkg/sql/tests/show_commit_timestamp_test.go line 80 at r9 (raw file):
testutils.RunTrueAndFalse(t, "pgx batch; simple", func(t *testing.T, simple bool) { resetTable(t) pgURL, cleanup := s.PGUrl(t)
nit: add .ApplicationLayer() here and below.
|
Your pull request contains more than 1000 changes. It is strongly encouraged to split big PRs into smaller chunks. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. |
herkolategan
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @adityamaru, @DarrylWong, @jayshrivastava, @srosenberg, and @yuzefovich)
pkg/server/testserver.go line 762 at r12 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit:
s/can time/can time./.
done.
pkg/testutils/serverutils/test_server_shim.go line 105 at r6 (raw file):
metamorphicBuildProbability = 0.8
Ah right, metamorphic eligible builds don't always run metamorphic, so the 50% bool selection is slanted ~40% for the default case.
I'll update this to use the randutil.
pkg/testutils/serverutils/test_server_shim.go line 115 at r6 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
It might be nice to have a way to force the default test tenant with a particular process mode. We could either adjust the existing env var to take in a string or something, or introduce another bool env var that controls whether the shared process is picked (I'd probably prefer the latter option slightly), or perhaps in some other fashion. This doesn't have to be in this PR, but please leave a TODO for this if it's not done.
Good idea; adding a TODO.
pkg/sql/tests/show_commit_timestamp_test.go line 80 at r9 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: add
.ApplicationLayer()here and below.
good catch, done.
Previously probabilistic test behaviour args did not discern if a virtual cluster should be started as an external or shared process. This change updates `testBehaviour` to account for permutations of the various possible scenarios. It has been updated to a bitmask to allow probabilistic enablement and probabilistic process modes. Any bit groups not set are assumed to be probabilistic. Explicit modes will have both an enablement bit (`ttEnabled`, `ttDisabled`) and process mode (`ttSharedProcess`, `ttExternalProcess`) bit set. If the process mode bit is left unset it will be selected probabilistically in `ShouldStartDefaultTestTenant`. Additionally, functions have been added to disable probabilistically selecting a shared process virtual cluster for tests that do not work with a shared process virtual cluster yet. Epic: CRDB-31933 Release note: None
Previously some tests used `sqlutils.PGUrl` as a means to form a connection. This worked well before the introduction of probabilistically testing with shared virtual clusters. The URL for a shared process requires a cluster option to be passed in this case. In order to avoid branching logic in tests when a test virtual cluster is present, this change adds a `PGUrl` method to the `ApplicationLayerInterface` so that a decision can be made to add the cluster option to the URL without the test having to specify it. A new option is added to be able to specify the dir prefix for the certs directory as well. Epic: CRDB-31933 Release note: None
Previously `maybeStartDefaultTestTenant` was only able to start an external process virtual cluster. This change updates the method to be able to start a shared process virtual cluster if the test params specify it. The shared process test virtual cluster is granted default capabilities, similar to the external process test virtual cluster. Epic: CRDB-31933 Release note: None
Tests using `sqlutils.PGUrl` should be updated to rather use `PGUrl` or `SQLConn` on the `ApplicationLayerInterface` to take advantage of the URL automatically including the cluster option in the event the connection is to a shared process virtual cluster. Previously these tests would have failed if a shared process virtual cluster is probabilistically started, since the URL would not contain the required cluster option, when trying to connect. Epic: CRDB-31933 Release note: None
This change disables probabilistically selecting a shared process test virtual cluster for tests which fail when running against a shared process virtual cluster. See also: cockroachdb#112857 Epic: CRDB-26687 Release note: None
Previously a CCL license was required for the default test tenant to be started. This requirement no longer holds true, hence the comment has been updated. See also: cockroachdb#108762 Epic: CRDB-31933 Release note: None
Previously shared and external process tenants had separate sections for tenant capabilities and cluster settings set during initialization. This change combines the code to avoid duplication. Epic: CRDB-31933 Release note: None
…est tenant Disable tests which require specific capabilities for the default shared process test tenant. The tests disabled in this change specifically require the `can_debug_process` capability that is available with an external process tenant, but not with a shared process tenant. The shared process default test tenant can be re-enabled after the issue mentioned below is addressed. See also: cockroachdb#113187 Epic: CRDB-31933 Release note: None
Previously `StartSharedProcessTenant` did not have any default cluster settings or capabilities. It now does, hence we need to remove this expectation from the test as it uses the same logic to start a shared process virtual cluster. See also: cockroachdb#113187 Epic: CRDB-31933 Release note: None
Use `randutil` to select a process mode probabilistically instead of a metamorphic constant, as the latter will not yield an even selection when the test virtual cluster is enabled. Additionally, a comment has been added to note that an env var should be added to be able to override the process mode if required. See also: cockroachdb#113294 Epic: CRDB-31933 Release note: None
Epic: CRDB-31933 Release note: None
Epic: CRDB-31933 Release note: None
1e4a21a to
f765976
Compare
7dfdbe0 to
f765976
Compare
stevendanna
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really like where this landed. Thanks for pushing this forward and working to integrate it nicely into the existing test-server setup.
|
TFTRs! bors r=yuzefovich,stevendanna |
|
Build succeeded: |
This PR contain several commits that ultimately enables the probabilistic selection of a shared process test virtual cluster that tests run against, iff the test also probabilistically decided to run with a test tenant. Previously only an external process virtual cluster would be available as an option when deciding to run with a test virtual cluster.
Additionally some API changes have been made to serverutils to encourage the use of the test interface layers when working with a test server or cluster. Tests that use other utility methods to connect or create connection URLs may not be aware of a test virtual cluster and lack required information when interacting with the cluster resulting in flakiness and failures.
Finally this change also disables selecting a shared process virtual cluster instead of an external process virtual cluster for tests which fail with a shared process virtual cluster. An issue has been created to track these tests: #112857
Issues linked to this issue will contain more detail relating to the failure. And the issue has been referenced in the code itself when disabling a shared process virtual cluster for a specific test.
Resolves: #111134