- 
                Notifications
    You must be signed in to change notification settings 
- Fork 4k
changefeedccl: changefeed-specific option for num of sink workers #156260
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
| Thank you for contributing to CockroachDB. Please ensure you have followed the guidelines for creating a PR. My owl senses detect your PR is good for review. Please keep an eye out for any test failures in CI. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for submitting this! Looks good, left a few comments.
| } else if clusterWorkers > 0 { | ||
| numWorkers = clusterWorkers | ||
| } else { | ||
| numWorkers = 0 | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like it would set numWorkers to 0, even if both the cluster setting and changefeed value were negative.
(Aside: Seems to me that we may not be currently respecting the negative value as documented which I can look at/file an issue for.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, the current implementation does not adhere to the documentation:
"the number of workers used by changefeeds when sending requests to the sink "+
"(currently the batching versions of webhook, pubsub, and kafka sinks that are "+
"enabled by changefeed.new__sink_enabled only): <0 disables, 0 assigns "+
"a reasonable default, >0 assigns the setting value",
When a value <0 is provided, a default will be set, see current implementation:
numWorkers := changefeedbase.SinkIOWorkers.Get(&cfg.Settings.SV)
if numWorkers > 0 {
    return int(numWorkers)
}
idealNumber := runtime.GOMAXPROCS(0)
if idealNumber < 1 {
	return 1
}
if idealNumber > 32 {
	return 32
}
return idealNumberWould you like to tackle this as part of this PR or create a separate issue for it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't want to micromanage here. But I think the max that you are now using here could use a comment. But for this case where one of changefeedWorkers and clusterWorkers is zero and one is negative or both are negative, it may become useful to have the local numWorkers variable maintain that information if the current implementation doesn't respect it. I'd like to see tests for these cases also e.g. what happens if one value is negative and the other positive,
P.S. Out of curiousity, are you using AI to generate this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't want to micromanage here.
This is not how interpret this or other comments - and after all, you're the code owner and will maintain this code. Let's make sure that it lives up to your standards.
But could you help me here on what you'd expect? My struggle with comments has been like a red thread through this PR, so I'd appreciate some guidance.
But for this case where one of changefeedWorkers and clusterWorkers is zero and one is negative or both are negative, it may become useful to have the local numWorkers variable maintain that information if the current implementation doesn't respect it.
I'd like to see tests for these cases also e.g. what happens if one value is negative and the other positive,
I added cases for
- one value being negative, the other positive -> asserting that the positive value is applied
- both values being negative -> asserting that a runtime.GOMAXPROCSbased default is applied
Let me know that is according to your expectations - I have a hunch that I'm not really grasping what your intention in the first paragraph of the quote is.
P.S. Out of curiousity, are you using AI to generate this PR?
I used Claude Code to suggest where changes would be needed and where e.g. test cases should be added as I'm largely unfamiliar with the codebase. I had it then add some scaffolding in the form of function definitions and filled in the rest.
Also out of curiosity: what made you ask? Did you react to anything in the code or our conversation that had a "this is AI" flair to it?
3a810a8    to
    7213958      
    Compare
  
    | Thank you for updating your pull request. My owl senses detect your PR is good for review. Please keep an eye out for any test failures in CI. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the swift review! I addressed the comments and amended my initial commit as described in the Submitting your contribution section of the contribution guidelines.
Let me know if you'd like to see any other changes.
| sqlDB.Exec(t, `CREATE TABLE foo (a INT PRIMARY KEY)`) | ||
| sqlDB.Exec(t, `INSERT INTO foo VALUES (1)`) | ||
|  | ||
| // Test 1: Per-changefeed = 5, cluster setting = 0 -> should use 5. | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: These comments also don't adhere to the style guide. There are other comments in the test files (including inline comments) that also do not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed the comments. As these tests are now implemented as subtests, the subtest names replaced the comments. Let me know if you'd like the subtest names to be changed.
| require.NoError(t, foo1.Close()) | ||
|  | ||
| // Test 2: Cluster setting = 10, per-changefeed = 3 → should use 3 (smaller). | ||
| sqlDB.Exec(t, `SET CLUSTER SETTING changefeed.sink_io_workers = 10`) | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Please also implement these tests as separate subtests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
| func numSinkIOWorkers(cfg *execinfra.ServerConfig, opts changefeedbase.StatementOptions) int { | ||
| changefeedWorkers, err := opts.GetNumSinkWorkers() | ||
| if err != nil { | ||
| // Log error but continue with cluster setting. | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When do we expect errors here? I don't think this is the right behavior to trudge forward with 0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
GetNumSinkWorkers() returns an error in case the user sets num_sink_workers to a value that cannot be parsed as an integer.
I opted for this approach as a value of 0 would lead to a sensible default number of workers while still creating the sink. But you have more context, so please let me know what you'd rather expect to happen.
Would you rather propagate the error and not create the sink?
| } else if clusterWorkers > 0 { | ||
| numWorkers = clusterWorkers | ||
| } else { | ||
| numWorkers = 0 | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't want to micromanage here. But I think the max that you are now using here could use a comment. But for this case where one of changefeedWorkers and clusterWorkers is zero and one is negative or both are negative, it may become useful to have the local numWorkers variable maintain that information if the current implementation doesn't respect it. I'd like to see tests for these cases also e.g. what happens if one value is negative and the other positive,
P.S. Out of curiousity, are you using AI to generate this PR?
        
          
                pkg/ccl/changefeedccl/sink_test.go
              
                Outdated
          
        
      |  | ||
| result := numSinkIOWorkers(&execCfg.DistSQLSrv.ServerConfig, opts) | ||
|  | ||
| if test.expectUseDefault { | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case we should check that we're using the GOMAXPROCS-based default value exactly, not just that the result falls in a range from 1 to 32, since most of the test values fall in that range anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated.
| require.NoError(t, foo4.Close()) | ||
| } | ||
|  | ||
| // Test with sinks that support parallel IO. | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is pubsub the only sink that supports parallel IO?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Afaiu kafka, webhook and pubsub support parallel IO (my assumption is based on numSinkIOWorkers being called during the creation of these three types of sink).
Let me know what the intention of your question is:
For now, I updated the comment to reflect that we're testing with "...a sink that supports...".
Alternatively, it can of course be tested with all sinks that support parallel IO if you think that this provides additional value.
        
          
                pkg/ccl/changefeedccl/sink.go
              
                Outdated
          
        
      | clusterWorkers := changefeedbase.SinkIOWorkers.Get(&cfg.Settings.SV) | ||
|  | ||
| // Apply precedence logic: | ||
| // 1. If both are positive, use the smaller value (cluster can cap) | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: To be clear, I think this comment is also reading the code out loud. If you are to include, please write in full sentences (it took me a sec to parse "cluster can cap". With periods and all that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed the comment.
| @fabschn please also take a look at the extended CI failure. It may be unrelated but if it is, please verify that and leave a comment here. Thanks so much! | 
Previously, the number of SinkIOWorkers could only be defined in cluster-wide setting. This patch allows to set the number of SinkIOWorkers per changefeed. If both (cluster-wide, per-changefeed) values are given, the min of the two will be used. Release note (sql change): The CREATE CHANGEFEED statement was extended with an optional `num_sink_workers` setting that can be used to set the number of SinkIOWorkers per changefeed. Note that the number of workers is capped by the cluster-wide setting if present. Fixes: cockroachdb#154546
7213958    to
    d035bed      
    Compare
  
    | 
 I can't see details of the Bazel Exended CI job as I don't have access to TeamCity. Let me know if you can and notice that my changes have introduced an issue. The Claude Code PR Review job fails with the following error: 
 | 
Previously, the number of SinkIOWorkers could only be defined in cluster-wide setting.
This patch allows to set the number of SinkIOWorkers per changefeed.
If both (cluster-wide, per-changefeed) values are given, the min of the two will be used.
Release note (sql change): The CREATE CHANGEFEED statement was extended with an optional
num_sink_workerssetting that can be used to set the number of SinkIOWorkers per changefeed. Note that the number of workers is capped by the cluster-wide setting if present.Fixes: #154546