Skip to content

Conversation

@kevinAlbs
Copy link
Contributor

@kevinAlbs kevinAlbs commented Oct 7, 2024

Summary

Specify behavior with unacknowledged writes for MongoClient.bulkWrite:

  • Return error if verbose results are requested.
  • Return error if ordered writes are requested.
  • Apply w:0 to each batch.

Existing spec and prose tests using MongoClient.bulkWrite with unacknowledged write concern have been updated to use ordered: false. The default (ordered: true) is now expected to result in a client-side error.

Background & Motivation

See DRIVERS-2993 for motivation.

This PR expects the OP_MSG moreToCome bit is always set for all batches. This agrees with OP_MSG:

Clients doing unacknowledged writes MUST set the moreToCome flag, and MUST set the writeConcern to w=0.

Prose Test

A CRUD prose test is added to insert 100,001 documents with w:0 write concern to test multi-batch unacknowledged write.

The resulting collection count is checked to ensure the unacknowledged writes completes and avoid interfering with later tests. This was similarly done in DRIVERS-2877.

The prose test includes a step to create the collection with the create command as a workaround for SERVER-95537.


Please complete the following before merging:

  • Update changelog.
  • Make sure there are generated JSON files from the YAML test files.
  • Test changes in at least one language driver.
  • Test these changes against all server versions and topologies (including standalone, replica set, sharded
    clusters, and serverless).

Tests verified with the C driver in mongodb/mongo-c-driver#1752

Call `countDocuments` to ensure unacknowledged writes complete to avoid interfering with later tests.

Require single mongos to ensure connection is reused between `bulkWrite` and `countDocuments`.

Require collection be explicitly created to work around SERVER-95537.
@kevinAlbs kevinAlbs marked this pull request as ready for review October 7, 2024 18:09
@kevinAlbs kevinAlbs requested review from a team as code owners October 7, 2024 18:09
@kevinAlbs kevinAlbs requested review from dariakp, isabelatkinson, nbbeeken and stIncMale and removed request for a team, dariakp and nbbeeken October 7, 2024 18:09
}
```

Construct a list of write models (referred to as `models`) with `model` repeated `maxWriteBatchSize + 1` times.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would this test execute faster if it used 5 16MB documents, rather than 100,001 small ones?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea. Testing the C driver with a sharded 8.0 cluster locally shows the median of 3 runs:

With many small docs to exceed `maxWriteBatchSize`  : 0.65s
With few large docs to exceed `maxMessageSizeBytes` : 0.35s

Updated prose test to use few large docs.

Use a few large documents to force a batch split, rather than many small documents. This appears to be faster in local tests (0.65s to 0.35s)
@kevinAlbs kevinAlbs requested a review from jyemin October 7, 2024 19:50
`BulkWriteOptions`. This is required to avoid inconsistencies between server and driver behavior if the server default
changes in the future.

Drivers MUST return a client-side error if `ordered` is true (including when default is applied) with an unacknowledged
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What should drivers that already released this api do? Should we continue supporting it until the next major version? Drop it in a minor version? We usually try to never do the latter but I'd be willing to make an expection here since unack'd writes are so niche and the fix is simple, just add ordered=False.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What should drivers that already released this api do?

I think this only currently applies to Python and C (Rust does not implement unacknowledged writes). I was not planning to wait for a major release to fix in C.

Arguably, the previous behavior of allowing ordered+w:0 is a bug, since the behavior does not match the description:

writes will stop executing if an individual write fails

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to clarify, Python does not have that bug. We send w:1 so we can stop if a batch fails.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ShaneHarvey: what is the bug you're referring to? I'm a bit confused how PyMongo sending w:1 relates to the user specifying w:0 with ordered:true. Does that mean PyMongo is already disregarding the user's original write concern in order to catch failed batches?

Copy link
Member

@ShaneHarvey ShaneHarvey Oct 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes that is what drivers are supposed to do. With ordered:true + w:0, all but the last batch use w:1 and only the final batch uses w:0. I'm not sure if this is specified anywhere but that's how pymongo's collection.bulk_write works.

That said, I'm fine with dropping support for this in our next release (for client.bulk_write only).

@jyemin jyemin removed their request for review October 11, 2024 11:42
@kevinAlbs kevinAlbs merged commit c84488c into mongodb:master Oct 11, 2024
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants