diff --git a/config/redirects b/config/redirects index 244e2d0979c..57201f23d13 100644 --- a/config/redirects +++ b/config/redirects @@ -1811,6 +1811,13 @@ raw: /manual/core/wildcard -> ${base}/manual/core/index-wildcard/ [v5.0-*]: /${version}/tutorial/model-time-data -> ${base}/${version}/tutorial/model-iot-data/ [v3.6-v4.4]: /${version}/tutorial/model-iot-data -> ${base}/${version}/tutorial/model-time-data/ +[v3.6-v4.4]: /${version}/core/sharding-shard-a-collection.txt -> ${base}/${version}/core/sharding-shard-key.txt +[v3.6-v4.4]: /${version}/core/sharding-choose-a-shard-key.txt -> ${base}/${version}/core/sharding-shard-key.txt +[v3.6-v4.4]: /${version}/core/sharding-refine-a-shard-key.txt -> ${base}/${version}/core/sharding-shard-key.txt +[v3.6-v4.4]: /${version}/core/sharding-change-shard-key-value.txt -> ${base}/${version}/core/sharding-shard-key.txt +[v3.6-v4.4]: /${version}/core/sharding-set-missing-shard-key-fields.txt -> ${base}/${version}/core/sharding-shard-key.txt +[v3.6-v4.4]: /${version}/core/sharding-find-shard-key.txt -> ${base}/${version}/core/sharding-shard-key.txt + # # Upgrade / Downgrade redirects # diff --git a/snooty.toml b/snooty.toml index f23d4b56ffd..312aa63c2e0 100644 --- a/snooty.toml +++ b/snooty.toml @@ -70,8 +70,10 @@ toc_landing_pages = [ "/core/security-users", "/core/security-x.509", "/core/sharded-cluster-components", + "/core/sharding-change-a-shard-key", "/core/sharding-balancer-administration", "/core/sharding-data-partitioning", + "/core/sharding-shard-key", "/core/storage-engines", "/core/timeseries-collections", "/core/transactions", diff --git a/source/administration/production-checklist-development.txt b/source/administration/production-checklist-development.txt index 4c120612df4..a2f1c7654ad 100644 --- a/source/administration/production-checklist-development.txt +++ b/source/administration/production-checklist-development.txt @@ -44,15 +44,14 @@ structures. See :doc:`/core/data-models` for more information. indexes required to support your queries. With the exception of the ``_id`` index, you must create all indexes explicitly: MongoDB does not automatically create any indexes other than ``_id``. - + - Ensure that your schema design supports your deployment type: if - you planning to use :term:`sharded clusters ` for - horizontal scaling, design your schema to include a strong shard - key. The shard key affects read and write performance by - determining how MongoDB partitions data. See: :doc:`Impacts of - Shard Keys on Cluster Operations ` - for information about what qualities a shard key should possess. - You cannot change the shard key once it is set. + you are planning to use :term:`sharded clusters ` + for horizontal scaling, design your schema to include a strong + shard key. While you can :ref:`change your shard key + ` later, it is important to carefully consider + your :ref:`shard key choice ` to + avoid scalability and perfomance issues. - Ensure that your schema design does not rely on indexed arrays that grow in length without bound. Typically, best performance can diff --git a/source/core/data-model-operations.txt b/source/core/data-model-operations.txt index cb492846160..6cc37b43797 100644 --- a/source/core/data-model-operations.txt +++ b/source/core/data-model-operations.txt @@ -66,8 +66,9 @@ To distribute data and application traffic in a sharded collection, MongoDB uses the :ref:`shard key `. Selecting the proper :ref:`shard key ` has significant implications for performance, and can enable or prevent query isolation and increased -write capacity. It is important to consider carefully the field or -fields to use as the shard key. +write capacity. While you can :ref:`change your shard key +` later, it is important to carefully consider your +shard key choice. See :doc:`/sharding` and :doc:`/core/sharding-shard-key` for more information. diff --git a/source/core/distributed-queries.txt b/source/core/distributed-queries.txt index b4917eae2c6..c2a0d0ecbd7 100644 --- a/source/core/distributed-queries.txt +++ b/source/core/distributed-queries.txt @@ -68,6 +68,8 @@ For more information on replica sets and write operations, see :doc:`/reference/write-concern`. +.. _read-operations-sharded-clusters: + Read Operations to Sharded Clusters ----------------------------------- diff --git a/source/core/hashed-sharding.txt b/source/core/hashed-sharding.txt index 2f034c6cfd4..02c23ad0415 100644 --- a/source/core/hashed-sharding.txt +++ b/source/core/hashed-sharding.txt @@ -10,7 +10,7 @@ Hashed Sharding Hashed sharding uses either a :ref:`single field hashed index ` or a :ref:`compound hashed index ` (*New in 4.4*) as the shard key to -partition data across your cluster. +partition data across your cluster. Sharding on a Single Field Hashed Index Hashed sharding provides a more even data distribution across the @@ -35,7 +35,7 @@ Sharding on a Compound Hashed Index Compound hashed index compute the hash value of a single field in the compound index; this value is used along with the other fields in the - index as your shard key. + index as your shard key. Compound hashed sharding supports features like :ref:`zone sharding `, where the prefix (i.e. first) non-hashed field or @@ -103,7 +103,7 @@ to use as the :term:`shard key`. sh.shardCollection( "database.collection", { : "hashed" } ) -To shard a collection on a +To shard a collection on a :ref:`compound hashed index `, specify the full namespace of the collection and the target compound hashed index to use as the :term:`shard key`: @@ -111,23 +111,20 @@ hashed index to use as the :term:`shard key`: .. code-block:: javascript sh.shardCollection( - "database.collection", + "database.collection", { "fieldA" : 1, "fieldB" : 1, "fieldC" : "hashed" } ) .. important:: - - Starting in MongoDB 4.4, you can refine a collection's shard key - by adding a suffix field or fields to the existing key. In earlier - versions, once you shard a collection, the selection of the shard - key is immutable; i.e. you cannot select a different shard key for - that collection. For details on refining a shard key, see - :ref:`shard-key-refine`. + - Starting in MongoDB 5.0, you can :ref:`reshard a collection + ` by changing a collection's shard key. + - Starting in MongoDB 4.4, you can :ref:`refine a shard key + ` by adding a suffix field or fields to the + existing shard key. + - In MongoDB 4.2 and earlier, the choice of shard key cannot + be changed after sharding. - - .. include:: /includes/limits-sharding-shardkey-document-immutable.rst - - For details on updating the shard key, see :ref:`update-shard-key`. - Shard a Populated Collection ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ diff --git a/source/core/ranged-sharding.txt b/source/core/ranged-sharding.txt index d54a08c4856..a4238014b12 100644 --- a/source/core/ranged-sharding.txt +++ b/source/core/ranged-sharding.txt @@ -52,16 +52,14 @@ to use as the :term:`shard key`. .. important:: - - Starting in MongoDB 4.4, you can refine a collection's shard key - by adding a suffix field or fields to the existing key. In earlier - versions, once you shard a collection, the selection of the shard - key is immutable; i.e. you cannot select a different shard key for - that collection. For details on refining a shard key, see - :ref:`shard-key-refine`. - - - .. include:: /includes/limits-sharding-shardkey-document-immutable.rst - - For details on updating the shard key, see :ref:`update-shard-key`. + - Starting in MongoDB 5.0, you can :ref:`reshard a collection + ` by changing a collection's shard key. + - Starting in MongoDB 4.4, you can :ref:`refine a shard key + ` by adding a suffix field or fields to the existing + shard key. + - In MongoDB 4.2 and earlier, the choice of shard key cannot + be changed after sharding. + Shard a Populated Collection ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ diff --git a/source/core/sharded-cluster-config-servers.txt b/source/core/sharded-cluster-config-servers.txt index bedc52f667e..2eb37164bcf 100644 --- a/source/core/sharded-cluster-config-servers.txt +++ b/source/core/sharded-cluster-config-servers.txt @@ -62,8 +62,6 @@ Replica Set Config Servers .. include:: /includes/fact-config-server-replica-set-restrictions.rst -, config - .. _config-server-read-write-ops: Read and Write Operations on Config Servers @@ -143,7 +141,9 @@ Sharded Cluster Metadata Config servers store metadata in the :doc:`/reference/config-database`. -.. important:: Always back up the ``config`` database before doing any +.. important:: + + Always back up the ``config`` database before doing any maintenance on the config server. To access the ``config`` database, issue the following command from the diff --git a/source/core/sharding-change-a-shard-key.txt b/source/core/sharding-change-a-shard-key.txt new file mode 100644 index 00000000000..d3718df3b5d --- /dev/null +++ b/source/core/sharding-change-a-shard-key.txt @@ -0,0 +1,43 @@ +.. _change-a-shard-key: + +================== +Change a Shard Key +================== + +.. default-domain:: mongodb + +.. contents:: On this page + :local: + :backlinks: none + :depth: 2 + :class: singlecol + +The ideal shard key allows MongoDB to distribute documents evenly +throughout the cluster while facilitating common query patterns. A +suboptimal shard key can lead to uneven data distribution and the +following problems: + +- :ref:`Jumbo chunks ` +- :ref:`Uneven load distribution + ` +- :ref:`Decreased query performance over time + ` + +To address these issues, MongoDB allows you to change your shard key: + +- Starting in MongoDB 5.0, you can :ref:`reshard a collection + ` by changing a collection's shard key. +- Starting in MongoDB 4.4, you can :ref:`refine a shard key + ` by adding a suffix field or fields to the existing + shard key. + +In MongoDB 4.2 and earlier, a document's shard key is immutable. + +For more information on common performance and scaling issues and advice +on how to fix them, read :ref:`shardkey-troubleshoot-shard-keys`. + +.. toctree:: + :titlesonly: + + /core/sharding-refine-a-shard-key.txt + /core/sharding-reshard-a-collection.txt diff --git a/source/core/sharding-change-shard-key-value.txt b/source/core/sharding-change-shard-key-value.txt new file mode 100644 index 00000000000..4bd517d7bf0 --- /dev/null +++ b/source/core/sharding-change-shard-key-value.txt @@ -0,0 +1,98 @@ +.. _update-shard-key: + +=================================== +Change a Document's Shard Key Value +=================================== + +.. default-domain:: mongodb + +.. contents:: On this page + :local: + :backlinks: none + :depth: 2 + :class: singlecol + + +Starting in MongoDB 4.2, you can update a document's shard key value +unless the shard key field is the immutable ``_id`` field. + +.. important:: When updating the shard key value + + - You **must** be on a :binary:`~bin.mongos`. Do **not** issue the + operation directly on the shard. + + - You **must** run either in a :doc:`transaction + ` or as a :doc:`retryable write + `. + + - You **must** include an equality condition on the full shard + key in the query filter. For example, consider a ``messages`` + collection that uses ``{ activityid: 1, userid : 1 }`` as the + shard key. To update the shard key value for a document, you must + include ``activityid: , userid: `` in the query + filter. You can include additional fields in the query as + appropriate. + + See also the specific write command/methods for additional + operation-specific requirements when run against a sharded + collection. + +To update a shard key value, use the following operations: + +.. list-table:: + :header-rows: 1 + :widths: 40 60 + + * - Command + - Method + + * - :ref:`update ` with ``multi: false`` + - | :ref:`db.collection.replaceOne() ` + | :ref:`db.collection.updateOne() ` + + To set to a non-``null`` value, the update :red:`must` be + performed either inside a transaction or as a retryable write. + + * - :ref:`findAndModify ` + - | :ref:`db.collection.findOneAndReplace() ` + | :ref:`db.collection.findOneAndUpdate() ` + | :ref:`db.collection.findAndModify() ` + + To set to a non-``null`` value, the update :red:`must` be + performed either inside a transaction or as a retryable write. + + * - + - | :method:`db.collection.bulkWrite()` + | :method:`Bulk.find.updateOne()` + + If the shard key modification results in moving the document to + another shard, you cannot specify more than one shard key + modification in the bulk operation; the batch size has to be 1. + + If the shard key modification does not result in moving the + document to another shard, you can specify multiple shard + key modification in the bulk operation. + + To set to a non-``null`` value, the operation :red:`must` be + performed either inside a transaction or as a retryable write. + +.. include:: /includes/shard-key-modification-warning.rst + +Example +------- + +Consider a ``sales`` collection which is sharded on the ``location`` +field. The collection contains a document with the ``_id`` +``12345`` and the ``location`` ``""``. To update the field value for +this document, you can run the following command: + +.. code-block:: javascript + + db.sales.updateOne( + { _id: 12345, location: "" }, + { $set: { location: "New York"} } + ) + +.. seealso:: + + :ref:`shard-key-missing-set` diff --git a/source/core/sharding-choose-a-shard-key.txt b/source/core/sharding-choose-a-shard-key.txt new file mode 100644 index 00000000000..4253bb9ca3c --- /dev/null +++ b/source/core/sharding-choose-a-shard-key.txt @@ -0,0 +1,187 @@ +.. _shard-key-selection-divisible: +.. _sharding-internals-operations-and-reliability: +.. _sharding-shard-key-selection: +.. _sharding-internals-choose-shard-key: +.. _sharding-shard-key-requirements: + +================== +Choose a Shard Key +================== + +.. default-domain:: mongodb + +.. contents:: On this page + :local: + :backlinks: none + :depth: 2 + :class: singlecol + +The choice of shard key affects the creation and :ref:`distribution of +chunks ` across the available :term:`shards +`. The distribution of data affects the efficiency and +performance of operations within the sharded cluster. + +The ideal shard key allows MongoDB to distribute documents evenly +throughout the cluster while also facilitating common query patterns. + +When you choose your shard key, consider: + +- the :ref:`cardinality` of the shard key +- the :ref:`frequency` with which shard key values + occur +- whether a potential shard key grows :ref:`monotonically + ` +- :ref:`sharding-query-patterns` +- :ref:`limits-shard-keys` + +.. note:: + + - Starting in MongoDB 5.0, you can :ref:`change your shard key + ` and redistribute your data using the + :dbcommand:`reshardCollection` command. + - Starting in MongoDB 4.4, you can use the + :dbcommand:`refineCollectionShardKey` command to refine a + collection's shard key. The :dbcommand:`refineCollectionShardKey` + command adds a suffix field or fields to the existing key to create + the new shard key. + - In MongoDB 4.2 and earlier, once you shard a collection, the + selection of the shard key is immutable. + - Starting in MongoDB 4.2, you can update a document's shard key + value unless the shard key field is the immutable ``_id`` field. + +.. _shard-key-range: +.. _sharding-shard-key-cardinality: +.. _shard-key-cardinality: + +Shard Key Cardinality +--------------------- + +The :term:`cardinality` of a shard key determines the maximum number of +chunks the balancer can create. Where possible, choose a shard key with +high cardinality. A shard key with low cardinality reduces the +effectiveness of horizontal scaling in the cluster. + +Each unique shard key value can exist on no more than a single chunk at +any given time. Consider a dataset that contains user data with a +``continent`` field. If you chose to shard on ``continent``, the shard +key would have a cardinality of ``7``. A cardinality of ``7`` means +there can be no more than ``7`` chunks within the sharded cluster, each +storing one unique shard key value. This constrains the number of +effective shards in the cluster to ``7`` as well - adding more than +seven shards would not provide any benefit. + +The following image illustrates a sharded cluster using the field ``X`` +as the shard key. If ``X`` has low cardinality, the distribution of +inserts may look similar to the following: + +.. include:: /images/sharded-cluster-ranged-distribution-low-cardinal.rst + +If your data model requires sharding on a key that has low cardinality, +consider using an indexed :term:`compound ` of fields to +increase cardinality. + +A shard key with high cardinality does not, on its own, guarantee even +distribution of data across the sharded cluster. The :ref:`frequency +` of the shard key and the potential for +:ref:`monotonically changing shard key values ` +also contribute to the distribution of the data. + +.. _shard-key-frequency: + +Shard Key Frequency +------------------- + +The ``frequency`` of the shard key represents how often a given shard +key value occurs in the data. If the majority of documents contain only +a subset of the possible shard key values, then the chunks storing the +documents with those values can become a bottleneck within the cluster. +Furthermore, as those chunks grow, they may become :ref:`indivisible +chunks ` as they cannot be split any further. This reduces +the effectiveness of horizontal scaling within the cluster. + +The following image illustrates a sharded cluster using the field ``X`` as the +shard key. If a subset of values for ``X`` occur with high frequency, the +distribution of inserts may look similar to the following: + +.. include:: /images/sharded-cluster-ranged-distribution-frequency.rst + +If your data model requires sharding on a key that has high frequency +values, consider using a :term:`compound index` using a unique or +low frequency value. + +A shard key with low frequency does not, on its own, guarantee even +distribution of data across the sharded cluster. The :ref:`cardinality +` of the shard key and the potential for +:ref:`monotonically changing shard key values ` +also contribute to the distribution of the data. + +.. _shard-key-monotonic: + +Monotonically Changing Shard Keys +--------------------------------- + +A shard key on a value that increases or decreases monotonically is more +likely to distribute inserts to a single chunk within the cluster. + +This occurs because every cluster has a chunk that captures a range with +an upper bound of :doc:`maxKey`. ``maxKey`` +always compares as higher than all other values. Similarly, there is a +chunk that captures a range with a lower bound of +:doc:`minKey`. ``minKey`` always compares as +lower than all other values. + +If the shard key value is always increasing, all new inserts are routed +to the chunk with ``maxKey`` as the upper bound. If the shard key value +is always decreasing, all new inserts are routed to the chunk with +``minKey`` as the lower bound. The shard containing that chunk becomes +the bottleneck for write operations. + +To optimize data distribution, the chunks that contain the global +``maxKey`` (or ``minKey``) do not stay on the same shard. When a chunk +is split, the new chunk with the ``maxKey`` (or ``minKey``) chunk is +located on a different shard. + +The following image illustrates a sharded cluster using the field ``X`` +as the shard key. If the values for ``X`` are monotonically increasing, the +distribution of inserts may look similar to the following: + +.. include:: /images/sharded-cluster-monotonic-distribution.rst + +If the shard key value was monotonically decreasing, then all inserts +would route to ``Chunk A`` instead. + +If your data model requires sharding on a key that changes +monotonically, consider using :doc:`/core/hashed-sharding`. + +A shard key that does not change monotonically does not, on its own, +guarantee even distribution of data across the sharded cluster. The +:ref:`cardinality` and +:ref:`frequency` of the shard key also contribute +to the distribution of the data. + +.. _sharding-query-patterns: + +Sharding Query Patterns +----------------------- + +The ideal shard key distributes data evenly across the sharded cluster +while also facilitating common query patterns. When you choose a shard +key, consider your most common query patterns and whether a given shard +key covers them. + +In a sharded cluster, the :binary:`~bin.mongos` routes queries to only +the shards that contain the relevant data if the queries contain the +shard key. When the queries do not contain the shard key, the queries +are broadcast to all shards for evaluation. These types of queries are +called scatter-gather queries. Queries that involve multiple shards for +each request are less efficient and do not scale linearly when more +shards are added to the cluster. + +This does not apply for aggregation queries that operate on a large +amount of data. In these cases, scatter-gather can be a useful approach +that allows the query to run in parallel on all shards. + +.. seealso:: + + :ref:`read-operations-sharded-clusters` + :ref:`sharding-query-router-broadcast-targeted` diff --git a/source/core/sharding-data-partitioning.txt b/source/core/sharding-data-partitioning.txt index f60a6626dff..0f9a4bb4720 100644 --- a/source/core/sharding-data-partitioning.txt +++ b/source/core/sharding-data-partitioning.txt @@ -217,7 +217,10 @@ becoming a **jumbo** chunk. These **jumbo** chunks can become a performance bott as they continue to grow, especially if the shard key value occurs with high :ref:`frequency`. -Starting in version 4.4, MongoDB provides the +Starting in MongoDB 5.0, you can :ref:`reshard a collection +` by changing a document's shard key. + +Starting in MongoDB 4.4, MongoDB provides the :dbcommand:`refineCollectionShardKey` command. Refining a collection's shard key allows for a more fine-grained data distribution and can address situations where the existing key insufficient cardinality @@ -225,12 +228,12 @@ leads to jumbo chunks. For more information, see: -- :dbcommand:`refineCollectionShardKey` +- :ref:`change-a-shard-key` - :doc:`/tutorial/clear-jumbo-flag` - :ref:`migration-chunk-size-limit` - + .. _moveChunk-directory: ``moveChunk`` directory diff --git a/source/core/sharding-find-shard-key.txt b/source/core/sharding-find-shard-key.txt new file mode 100644 index 00000000000..ec0e72a1295 --- /dev/null +++ b/source/core/sharding-find-shard-key.txt @@ -0,0 +1,23 @@ +.. _sharding-find-shard-key: + +================ +Find a Shard Key +================ + +.. default-domain:: mongodb + +.. contents:: On this page + :local: + :backlinks: none + :depth: 2 + :class: singlecol + +To find the shard key of an existing sharded collection, use the +:method:`db.printShardingStatus()` method: + +.. code-block:: javascript + + db.printShardingStatus() + +For details on the :method:`db.printShardingStatus()` output, see +:method:`sh.status()`. \ No newline at end of file diff --git a/source/core/sharding-refine-a-shard-key.txt b/source/core/sharding-refine-a-shard-key.txt new file mode 100644 index 00000000000..cf6b0c0886d --- /dev/null +++ b/source/core/sharding-refine-a-shard-key.txt @@ -0,0 +1,43 @@ +.. _shard-key-refine: + +================== +Refine a Shard Key +================== + +.. default-domain:: mongodb + +.. contents:: On this page + :local: + :backlinks: none + :depth: 3 + :class: singlecol + +.. versionadded:: 4.4 + +Refining a collection's shard key allows for a more fine-grained +data distribution and can address situations where the existing key +has led to :ref:`jumbo chunks ` due to insufficient +:ref:`cardinality `. + +.. note:: + + Starting in MongoDB 5.0, you can also :ref:`reshard your collection + ` by providing a new shard key for the + collection. + +To refine a collection's shard key, use the +:dbcommand:`refineCollectionShardKey` command. The +:dbcommand:`refineCollectionShardKey` adds a suffix field +or fields to the existing key to create the new shard key. + +For example, you may have an existing ``orders`` collection in a +``test`` database with the shard key ``{ customer_id: 1 }``. You can +use the :dbcommand:`refineCollectionShardKey` command to change the +shard key to the new shard key ``{ customer_id: 1, order_id: 1 }``: + +.. code-block:: javascript + + db.adminCommand( { + refineCollectionShardKey: "test.orders", + key: { customer_id: 1, order_id: 1 } + } ) diff --git a/source/core/sharding-reshard-a-collection.txt b/source/core/sharding-reshard-a-collection.txt new file mode 100644 index 00000000000..455984fd4c6 --- /dev/null +++ b/source/core/sharding-reshard-a-collection.txt @@ -0,0 +1,213 @@ +.. _sharding-resharding: + +==================== +Reshard a Collection +==================== + +.. default-domain:: mongodb + +.. contents:: On this page + :local: + :backlinks: none + :depth: 3 + :class: singlecol + +.. versionadded:: 5.0 + +The ideal shard key allows MongoDB to distribute documents evenly +throughout the cluster while facilitating common query patterns. A +suboptimal shard key can lead to performance or scaling issues due to +uneven data distribution. Starting in MongoDB 5.0, you can change the +shard key for a collection to change the distribution of your data +across a cluster. + +.. note:: + + Before resharding your collection, read + :ref:`shardkey-troubleshoot-shard-keys` for information on common + performance and scaling issues and advice on how to fix them. + +Requirements +-------------- + +Before you reshard your collection, ensure that you meet the following +requirements: + +- Your application can tolerate a period of **two seconds** where the + collection that is being resharded blocks writes. During the time + period where writes are blocked your application experiences an + increase in latency. If your workload cannot tolerate this + requirement, consider :ref:`refining your shard key + ` instead. +- Your database meets these resource requirements: + + - Available storage space: Ensure that your available storage space + is at least 1.2x the size of the collection that you want to + reshard. For example, if the size of the collection you want to + reshard is 1 TB, you should have at least 1.2 TB of free storage + when starting the sharding operation. + - I/O: Ensure that your I/O capacity is below 50%. + - CPU load: Ensure your CPU load is below 80%. + + .. important:: + + These requirements are not enforced by the database. A failure to + allocate enough resources can result in: + + - the database running out of space and shutting down + - decreased performance + - the resharding operation taking longer than expected + + If your application has time periods with less traffic, reshard your + collection during that time if possible. + +- You have rewritten your application code to update your queries to use + **both** the current shard key and the new shard key. + + The following queries return an error if the query filter does not + include **both** the current shard key or a unique field (like + ``_id``): + + - :method:`~db.collection.deleteOne()` + - :method:`~db.collection.findAndModify()` + - :method:`~db.collection.findOneAndDelete()` + - :method:`~db.collection.findOneAndReplace()` + - :method:`~db.collection.findOneAndUpdate()` + - :method:`~db.collection.replaceOne()` + - :method:`~db.collection.updateOne()` + + For optimal performance, we recommend that you also rewrite other + queries to include the new shard key. + + Once the resharding operation completes, you can remove the old shard + key from the queries. + +- No index builds are in progress. Use ``db.currentOp()`` to + check for any running index builds: + + .. code-block:: javascript + + db.adminCommand( + { + currentOp: true, + $or: [ + { op: "command", "command.createIndexes": { $exists: true } }, + { op: "none", "msg" : /^Index Build/ } + ] + } + ) + + In the result document, if the ``inprog`` field value is an empty + array, there are no index builds in progress: + + .. code-block:: javascript + + { + inprog: [], + ok: 1, + '$clusterTime': { ... }, + operationTime: + } + +.. warning:: + + We strongly recommend that you check the + :ref:`resharding-limitations` and read the :ref:`resharding + process ` section in full before resharding your + collection. + +.. _resharding-limitations: + +Limitations +----------- + +- Only one collection can be resharded at a time. +- :rsconf:`writeConcernMajorityJournalDefault` must be ``true``. +- Resharding a collection that has a + :doc:`uniqueness ` constraint is not supported. +- The new shard key cannot have a :doc:`uniqueness ` + constraint. +- The following commands and corresponding shell methods are not + supported on the collection that is being resharded while the + resharding operation is in progress: + + - :dbcommand:`collMod` + - :dbcommand:`convertToCapped` + - :dbcommand:`createIndexes` + - :method:`~db.collection.createIndex()` + - :dbcommand:`drop` + - :method:`~db.collection.drop()` + - :dbcommand:`dropIndexes` + - :method:`~db.collection.dropIndex()` + - :dbcommand:`renameCollection` + - :method:`~db.collection.renameCollection()` + +- The following commands and methods are not supported on the cluster + while the resharding operation is in progress: + + - :dbcommand:`addShard` + - :dbcommand:`removeShard` + - :method:`db.createCollection()` + - :dbcommand:`dropDatabase` + + .. warning:: + + Using any of the preceding commands during a resharding + operation causes the resharding operation to fail. + +- If the collection to be resharded uses :atlas:`Atlas Search + `, the search index will become unavailable when the + resharding operation completes. You need to manually rebuild the + search index once the resharding operation completes. + +.. _resharding_process: + +Resharding Process +------------------ + +.. include:: /includes/steps/reshard-a-collection.rst + +Behavior +-------- + +Minimum Duration of a Resharding Operation +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The minimum duration of a resharding operation is always 5 minutes. + +Retryable Writes +~~~~~~~~~~~~~~~~ + +:ref:`Retryable writes ` initiated before or during +resharding can be retried during and after the collection has been +resharded for up to 5 minutes. After 5 minutes you may be unable to find +the definitive result of the write and subsequent attempts to retry the +write fail with an ``IncompleteTransactionHistory`` error. + +Error Cases +----------- + +Primary Failovers +~~~~~~~~~~~~~~~~~ + +If a primary failover on a replica set shard or config server occurs, +the resharding operation aborts. + +If a resharding operation aborts due to a primary failover, run the +:dbcommand:`cleanupReshardCollection` command before starting a new +resharding operation: + +.. code-block:: javascript + + db.runCommand({ + cleanupReshardCollection: "." + }) + +Duplicate ``_id`` Values +~~~~~~~~~~~~~~~~~~~~~~~~ + +The resharding operation fails if ``_id`` values are not globally unique +to avoid corrupting collection data. Duplicate ``_id`` values can also +prevent successful chunk migration. If you have documents with duplicate +``_id`` values, copy the data from each into a new document, and then +delete the duplicate documents. diff --git a/source/core/sharding-set-missing-shard-key-fields.txt b/source/core/sharding-set-missing-shard-key-fields.txt new file mode 100644 index 00000000000..5f9d2eff385 --- /dev/null +++ b/source/core/sharding-set-missing-shard-key-fields.txt @@ -0,0 +1,141 @@ +.. _shard-key-missing-set: + +============================ +Set Missing Shard Key Fields +============================ + +.. default-domain:: mongodb + +.. contents:: On this page + :local: + :backlinks: none + :depth: 2 + :class: singlecol + +If you have missing shard key fields, you can set the shard key field to +``null``. If you want to set the missing shard key field to a +non-``null`` value, see :ref:`update-shard-key`. + +To perform the update, you can use the following operations on a +:binary:`~bin.mongos`: + +.. list-table:: + :header-rows: 1 + :widths: 20 20 60 + + * - Command + - Method + - Description + + * - | :ref:`update ` with + | ``multi: true`` + + - | :ref:`db.collection.updateMany() ` + + - - Can be used to set the missing key value to ``null`` only. + + - Can be performed inside or outside a transaction. + + - Can be performed as a retryable write or not. + + - For additional requirements, refer to the specific + command/method. + + * - | :ref:`update ` with + | ``multi: false`` + + - | :ref:`db.collection.replaceOne() ` + | :ref:`db.collection.updateOne() ` + + - - Can be used to set the missing key value to ``null`` or any + other value. + + - The update to set missing shard key fields **must** meet one of + the following requirements: + + - the filter of the query contains an equality condition on the full + shard key in the query + - the filter of the query contains an exact match on _id + - the update targets a single shard + + - To set to a non-``null`` value, refer to + :ref:`update-shard-key`. + + - For additional requirements, refer to the specific + command/method. + + * - :ref:`findAndModify ` + - | :ref:`db.collection.findOneAndReplace() ` + | :ref:`db.collection.findOneAndUpdate() ` + | :ref:`db.collection.findAndModify() ` + + - - Can be used to set the missing key value to ``null`` or any + other value. + + - When setting missing shard key fields with a method that + explicitly updates only one document, the update **must** meet + one of the following requirements: + + - the filter of the query contains an equality condition on + the full shard key in the query + - the filter of the query contains an exact match on _id + - the update targets a single shard + + - Missing key values are returned when matching on ``null``. To + avoid updating a key value that is ``null``, include additional + query conditions as appropriate. + + - To set to a non-``null`` value, refer to + :ref:`update-shard-key`. + + - For additional requirements, refer to the specific + command/method. + + * - + - | :method:`db.collection.bulkWrite()` + | :method:`Bulk.find.replaceOne()` + | :method:`Bulk.find.updateOne()` + | :method:`Bulk.find.update()` + + - - To set to a ``null`` value, you can specify multiple shard + key modifications in the bulk operation. + + - When setting missing shard key fields with a method that + explicitly updates only one document, the update **must** meet + one of the following requirements: + + - the filter of the query contains an equality condition on + the full shard key in the query + - the filter of the query contains an exact match on _id + - the update targets a single shard + + - To set to a non-``null`` value, refer to + :ref:`update-shard-key`. + + - For additional requirements, refer to the underlying + command/method. + +Example +------- + +Consider a ``sales`` collection which is sharded on the ``location`` +field. Some documents in the collection have no ``location`` field. A +missing field is considered the same as a null value for the field. To +explicitly set these fields to ``null``, run the following command: + +.. code-block:: javascript + + db.sales.updateOne( + { _id: 12345, location: null }, + { $set: { location: null } } + ) + +When setting missing shard key fields with +:method:`db.collection.updateOne()` or another method that explicitly +updates only one document, the update **must** meet one of the following +requirements: + +- the filter of the query contains an equality condition on the full + shard key in the query +- the filter of the query contains an exact match on _id +- the update targets a single Shard diff --git a/source/core/sharding-shard-a-collection.txt b/source/core/sharding-shard-a-collection.txt new file mode 100644 index 00000000000..e80b51eaaad --- /dev/null +++ b/source/core/sharding-shard-a-collection.txt @@ -0,0 +1,82 @@ +.. _sharding-shard-key-creation: + +================== +Shard a Collection +================== + +.. default-domain:: mongodb + +.. contents:: On this page + :local: + :backlinks: none + :depth: 2 + :class: singlecol + +.. note:: + + To shard a collection, you need to :ref:`enable sharding + `. + +To shard a collection, you must specify the full namespace of the +collection that you want to shard and the shard key. You can use the +:mongosh:`MongoDB Shell ` method :method:`sh.shardCollection()` to +shard a collection: + +.. code-block:: javascript + + sh.shardCollection(, ) // Optional parameters omitted + +.. list-table:: + :widths: 20 80 + + * - ``namespace`` + + - Specify the full namespace of the collection that you want to + shard (``"."``). + + * - ``key`` + + - Specify a document ``{ : <1|"hashed">, ... }`` + where + + - ``1`` indicates :doc:`range-based sharding ` + + - ``"hashed"`` indicates :doc:`hashed sharding `. + +For more information on the sharding method, see +:method:`sh.shardCollection()`. + +Shard Key Fields and Values +--------------------------- + +Missing Shard Key Fields +~~~~~~~~~~~~~~~~~~~~~~~~ + +Starting in version 4.4, documents in sharded collections can be +missing the shard key fields. A missing shard key falls into the +same range as a ``null``-valued shard key. See :ref:`shard-key-missing`. + +In version 4.2 and earlier, shard key fields must exist in every +document to be able to shard a sharded collection. To set missing shard +key fields, see :ref:`shard-key-missing-set`. + +Change a Document's Shard Key Value +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +.. include:: /includes/limits-sharding-shardkey-document-immutable.rst + +For details on updating the shard key value, see +:ref:`update-shard-key`. + +Change a Collection's Shard Key +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Starting in MongoDB 5.0, you can :ref:`reshard a collection +` by changing a document's shard key. + +Starting in MongoDB 4.4, you can :ref:`refine a shard key +` by adding a suffix field or fields to the existing +shard key. + +In MongoDB 4.2 and earlier, the choice of shard key cannot be changed +after sharding. diff --git a/source/core/sharding-shard-key.txt b/source/core/sharding-shard-key.txt index a39979ea01c..114fb29f4da 100644 --- a/source/core/sharding-shard-key.txt +++ b/source/core/sharding-shard-key.txt @@ -14,12 +14,12 @@ Shard Keys :depth: 1 :class: singlecol -The shard key is either an indexed :term:`field` or indexed -:term:`compound ` fields that determines the +The shard key is either a single indexed :term:`field` or multiple +fields covered by a :term:`compound index` that determines the distribution of the collection's :term:`documents ` among the cluster's :term:`shards `. -Specifically, MongoDB divides the span of shard key values (or hashed +MongoDB divides the span of shard key values (or hashed shard key values) into non-overlapping ranges of shard key values (or hashed shard key values). Each range is associated with a :term:`chunk`, and MongoDB attempts to distribute chunks evenly among @@ -30,78 +30,6 @@ the shards in the cluster. The shard key has a direct relationship to the effectiveness of chunk distribution. See :ref:`sharding-shard-key-selection`. -.. _sharding-shard-key-creation: - -Shard Key Specification ------------------------ - -You must specify the shard key when you shard the collection. You can -use the :binary:`~bin.mongo` shell method -:method:`sh.shardCollection()` to shard a collection: - -.. code-block:: javascript - - sh.shardCollection(, ) // Optional parameters omitted - -.. list-table:: - :widths: 20 80 - - * - ``namespace`` - - - Specify the full namespace of the collection (i.e. - ".") to shard. - - * - ``key`` - - - Specify a document ``{ : <1|"hashed">, ... }`` where - - - ``1`` indicates :doc:`range-based sharding ` - - - ``"hashed"`` indicates :doc:`hashed sharding `. - -.. note:: Shard Key Fields and Values - - Existence - Starting in version 4.4, documents in sharded collections can be - missing the shard key fields. A missing shard key falls into the - same range as a ``null``-valued shard key. See - :ref:`shard-key-missing`. - - In version 4.2 and earlier, shard key fields must exist in every - document for a sharded collection. - - Update Field's Value - .. include:: /includes/limits-sharding-shardkey-document-immutable.rst - - For details on updating the shard key, see :ref:`update-shard-key`. - -For more information on the sharding method, see -:method:`sh.shardCollection()`. - -.. _shard-key-refine: - -Refine a Shard Key ------------------- - -Starting in MongoDB 4.4, you can use -:dbcommand:`refineCollectionShardKey` to refine a collection's shard -key. The :dbcommand:`refineCollectionShardKey` adds a suffix field -or fields to the existing key to create the new shard key. - -For example, you may have an existing ``orders`` collection with the -shard key ``{ customer_id: 1 }``. You can change the shard key by -adding a suffix ``order_id`` field to the shard key so that ``{ -customer_id: 1, order_id: 1 }`` becomes the new shard key. For more -information, see the :dbcommand:`refineCollectionShardKey` command. - -Refining a collection's shard key allows for a more fine-grained -data distribution and can address situations where the existing key -has led to :ref:`jumbo (i.e. indivisible) chunks ` due -to insufficient cardinality. - -In MongoDB 4.2 and earlier, the choice of shard key cannot be changed -after sharding. - .. _sharding-internals-shard-key-indexes: .. _sharding-shard-key-indexes: @@ -109,7 +37,7 @@ Shard Key Indexes ----------------- All sharded collections **must** have an index that supports the -:term:`shard key`; i.e. the index can be an index on the shard key or a +:term:`shard key`. The index can be an index on the shard key or a :term:`compound index` where the shard key is a :ref:`prefix ` of the index. @@ -181,238 +109,25 @@ parameter as ``true`` to the :method:`sh.shardCollection()` method: You cannot specify a unique constraint on a :ref:`hashed index `. - - -.. _shard-key-selection-divisible: -.. _sharding-internals-operations-and-reliability: -.. _sharding-shard-key-selection: -.. _sharding-internals-choose-shard-key: -.. _sharding-shard-key-requirements: - -Choosing a Shard Key --------------------- - -The choice of shard key affects the creation and :ref:`distribution of -the chunks ` across the available :term:`shards -`. This affects the overall efficiency and performance of -operations within the sharded cluster. - -The ideal shard key allows MongoDB to distribute documents evenly throughout -the cluster. See also :ref:`sharding strategy `. - -At minimum, consider the consequences of the -:ref:`cardinality`, :ref:`frequency`, and -rate of :ref:`change` of a potential shard key. - -.. note:: - - - Starting in MongoDB 4.4, you can use - :dbcommand:`refineCollectionShardKey` to refine a collection's - shard key. The :dbcommand:`refineCollectionShardKey` adds a suffix - field or fields to the existing key to create the new shard key. - - - In MongoDB 4.2 and earlier, once you shard a collection, the - selection of the shard key is immutable. - -Restrictions -~~~~~~~~~~~~ - -For restrictions on shard key, see :ref:`limits-shard-keys`. - -Collection Size -~~~~~~~~~~~~~~~ - -When sharding a collection that is not empty, the shard key can -constrain the maximum supported collection size for the initial -sharding operation only. See -:limit:`Sharding Existing Collection Data Size`. - -.. important:: - - A sharded collection can grow to any size after successful sharding. - -.. _shard-key-range: -.. _sharding-shard-key-cardinality: -.. _shard-key-cardinality: - -Shard Key Cardinality -~~~~~~~~~~~~~~~~~~~~~ - -The :term:`cardinality` of a shard key determines the maximum number of chunks -the balancer can create. This can reduce or remove the effectiveness of -horizontal scaling in the cluster. - -A unique shard key value can exist on no more than a single chunk at any -given time. If a shard key has a cardinality of ``4``, then there can -be no more than ``4`` chunks within the sharded cluster, each storing -one unique shard key value. This constrains the number of effective -shards in the cluster to ``4`` as well - adding additional shards would -not provide any benefit. - -The following image illustrates a sharded cluster using the field -``X`` as the shard key. If ``X`` has low cardinality, the distribution of -inserts may look similar to the following: - -.. include:: /images/sharded-cluster-ranged-distribution-low-cardinal.rst - -The cluster in this example would *not* scale horizontally, as incoming writes -would only route to a subset of shards. - -A shard key with high cardinality does not guarantee even distribution of data -across the sharded cluster, though it does better facilitate horizontal -scaling. The :ref:`frequency ` and :ref:`rate of -change ` of the shard key also contributes to data -distribution. Consider each factor when choosing a shard key. - -If your data model requires sharding on a key that has low cardinality, -consider using a :term:`compound index` using a field that -has higher relative cardinality. - -.. _shard-key-frequency: - -Shard Key Frequency -~~~~~~~~~~~~~~~~~~~ - -Consider a set representing the range of shard key values - the ``frequency`` -of the shard key represents how often a given value occurs in the data. If the -majority of documents contain only a subset of those values, then the chunks -storing those documents become a bottleneck within the cluster. Furthermore, -as those chunks grow, they may become :ref:`indivisible chunks ` -as they cannot be split any further. This reduces or removes the effectiveness -of horizontal scaling within the cluster. - -The following image illustrates a sharded cluster using the field ``X`` as the -shard key. If a subset of values for ``X`` occur with high frequency, the -distribution of inserts may look similar to the following: - -.. include:: /images/sharded-cluster-ranged-distribution-frequency.rst - -A shard key with low frequency does not guarantee even distribution of data -across the sharded cluster. The :ref:`cardinality ` and -:ref:`rate of change ` of the shard key also contributes -to data distribution. Consider each factor when choosing a shard key. - -If your data model requires sharding on a key that has high frequency -values, consider using a :term:`compound index` using a unique or -low frequency value. - -.. _shard-key-monotonic: - -Monotonically Changing Shard Keys -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -A shard key on a value that increases or decreases monotonically is more -likely to distribute inserts to a single shard within the cluster. - -This occurs because every cluster has a chunk that captures a range with an -upper bound of :doc:`maxKey`. ``maxKey`` always -compares as higher than all other values. Similarly, there is a chunk that -captures a range with a lower bound of :doc:`minKey`. -``minKey`` always compares as lower than all other values. - -If the shard key value is always increasing, all new inserts are routed to the -chunk with ``maxKey`` as the upper bound. If the shard key value is always -decreasing, all new inserts are routed to the chunk with ``minKey`` as the -lower bound. The shard containing that chunk becomes the bottleneck for write -operations. - -The following image illustrates a sharded cluster using the field ``X`` -as the shard key. If the values for ``X`` are monotonically increasing, the -distribution of inserts may look similar to the following: - -.. include:: /images/sharded-cluster-monotonic-distribution.rst - -If the shard key value was monotonically decreasing, then all inserts would -route to ``Chunk A`` instead. - -A shard key that does not change monotonically does not guarantee even -distribution of data across the sharded cluster. The -:ref:`cardinality` and -:ref:`frequency` of the shard key also contributes to -data distribution. Consider each factor when choosing a shard key. - -If your data model requires sharding on a key that changes -monotonically, consider using :doc:`/core/hashed-sharding`. - -.. _update-shard-key: - -Change a Document's Shard Key Value ------------------------------------ - -.. note:: When updating the shard key value - - - You :red:`must` run on a :binary:`~bin.mongos`. Do :red:`not` - issue the operation directly on the shard. - - - You :red:`must` run either in a :doc:`transaction - ` or as a :doc:`retryable write - `. - - - You :red:`must` include an equality condition on the full shard - key in the query filter. For example, if a collection messages - uses ``{ activityid: 1, userid : 1 }`` as the shard key, to update - the shard key for a document, you must include ``activityid: , - userid: `` in the query filter. You can include additional - fields in the query as appropriate. - - See also the specific write command/methods for additional - operation-specific requirements when run against a sharded - collection. - -Starting in MongoDB 4.2, you can update a document's shard key value -unless the shard key field is the immutable ``_id`` field. To update, -use the following operations to update a document's shard key value: - -.. list-table:: - :header-rows: 1 - :widths: 40 60 - - * - Command - - Method - - * - :ref:`update ` with ``multi: false`` - - | :ref:`db.collection.replaceOne() ` - | :ref:`db.collection.updateOne() ` - | :ref:`db.collection.update() ` with ``multi: false`` - - * - :ref:`findAndModify ` - - | :ref:`db.collection.findOneAndReplace() ` - | :ref:`db.collection.findOneAndUpdate() ` - | :ref:`db.collection.findAndModify() ` - - * - - - | :method:`db.collection.bulkWrite()` - | :method:`Bulk.find.updateOne()` - - If the shard key modification results in moving the document to - another shard, you cannot specify more than one shard key - modification in the bulk operation; i.e. batch size of - 1. - - If the shard key modification does not result in moving the - document to another shard, you can specify multiple shard - key modification in the bulk operation. - -.. include:: /includes/shard-key-modification-warning.rst - .. _shard-key-missing: -Missing Shard Key ------------------ +Missing Shard Key Fields +------------------------ Starting in version 4.4, documents in sharded collections can be -missing the shard key fields. +missing the shard key fields. To set missing shard key fields, see +:ref:`shard-key-missing-set`. Chunk Range and Missing Shard Key Fields ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Missing shard keys fall within the same chunk range as shard keys with -null values. For example, if the shard key is on the fields ``{ x: 1, y: 1 -}``, then: +Missing shard key fields fall within the same chunk range as shard keys +with null values. For example, if the shard key is on the fields ``{ x: +1, y: 1 }``, then: .. list-table:: :header-rows: 1 - :widths: 60 60 + :widths: 60 60 * - Document Missing Shard Key @@ -435,13 +150,14 @@ Read/Write Operations and Missing Shard Key Fields To target documents with missing shard key fields, you can use the :query:`{ $exists: false } <$exists>` filter condition on the shard key fields. For example, if the shard key is on the fields ``{ x: 1, y: 1 -}``, to find the documents with missing shard key fields: +}``, you can find the documents with missing shard key fields by running +this query: .. code-block:: javascript db.shardedcollection.find( { $or: [ { x: { $exists: false } }, { y: { $exists: false } } ] } ) -If you specify an :doc:`null equality match +If you specify a :doc:`null equality match ` filter condition (e.g. ``{ x: null }``), the filter matches :red:`both` those documents with missing shard key fields :red:`and` those with shard key fields set to ``null``. @@ -456,92 +172,13 @@ For example: { _id: , : null } // _id of the document missing shard key -.. _shard-key-missing-set: - -Set the Missing Shard Key Fields -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -To set missing shard key fields (which is different from :ref:`changing -the value of an existing shard key field `), you can -use the following operations on a :binary:`~bin.mongos`: - -.. list-table:: - :header-rows: 1 - :widths: 20 20 60 - - * - Command - - Method - - Description - - * - | :ref:`update ` with - | ``multi: true`` - - - | :ref:`db.collection.update() ` with - | ``multi: true`` - - - - Can be used to set the missing key value to ``null`` only. - - - Can be performed inside or outside a transaction. - - - Can be performed as a retryable write or not. - - - For additional requirements, refer to the specific - command/method. - - * - | :ref:`update ` with - | ``multi: false`` - - - | :ref:`db.collection.replaceOne() ` - | :ref:`db.collection.updateOne() ` - | :ref:`db.collection.update() ` with - | ``multi: false`` - - - - Can be used to set the missing key value to ``null`` or any - other value. - - - To set to a non-``null`` value, :red:`must` be performed either - inside a transaction or as a retryable write. - - - For additional requirements, refer to the specific - command/method. - - * - :ref:`findAndModify ` - - | :ref:`db.collection.findOneAndReplace() ` - | :ref:`db.collection.findOneAndUpdate() ` - | :ref:`db.collection.findAndModify() ` - - - - Can be used to set the missing key value to ``null`` or any - other value. - - - To set to a non-``null`` value, :red:`must` be performed either - inside a transaction or as a retryable write. - - - :red:`Must` include in the query filter an equality condition - on the shard key. Since a missing key value is returned as part - of a ``null`` equality match, to avoid updating a null-valued - key, include additional query conditions as - appropriate. - - - For additional requirements, refer to the specific - command/method. - - * - - - | :method:`db.collection.bulkWrite()` - | :method:`Bulk.find.replaceOne()` - | :method:`Bulk.find.updateOne()` - | :method:`Bulk.find.update()` - - - - To set to a ``null`` value, you can specify multiple shard - key modifications in the bulk operation. - - - To set to a non-``null`` value, you can specify a single shard - key modification in the bulk operation; i.e. batch size of 1. - - - To set to a non-``null`` value, :red:`must` be performed either - inside a transaction or as a retryable write. - - - For additional requirements, refer to the underlying - command/method. +.. toctree:: + :titlesonly: -Once you set the shard key field, to modify the field's value, -see :ref:`update-shard-key`. + /core/sharding-shard-a-collection.txt + /core/sharding-choose-a-shard-key.txt + /core/sharding-change-a-shard-key.txt + /core/sharding-change-shard-key-value.txt + /core/sharding-set-missing-shard-key-fields.txt + /core/sharding-find-shard-key.txt + /core/sharding-troubleshooting-shard-keys.txt diff --git a/source/core/sharding-troubleshooting-shard-keys.txt b/source/core/sharding-troubleshooting-shard-keys.txt new file mode 100644 index 00000000000..33220a342ea --- /dev/null +++ b/source/core/sharding-troubleshooting-shard-keys.txt @@ -0,0 +1,110 @@ +.. _shardkey-troubleshoot-shard-keys: + +======================= +Troubleshoot Shard Keys +======================= + +.. default-domain:: mongodb + +.. contents:: On this page + :local: + :backlinks: none + :depth: 2 + :class: singlecol + + +The ideal shard key allows MongoDB to distribute documents evenly +throughout the cluster while facilitating common query patterns. A +suboptimal shard key can lead to the following problems: + +- :ref:`Jumbo chunks ` +- :ref:`Uneven load distribution + ` +- :ref:`Decreased query performance over time + ` + +In the following you can find out more about common problems with shard +keys and how to resolve them. + +.. _sharding-troubleshooting-jumbo-chunks: + +Jumbo Chunks +------------ + +If you are seeing :ref:`jumbo chunks `, either the +:ref:`cardinality ` of your shard key is +insufficient or the :ref:`frequency` of the shard +key values is unevenly distributed. + +To increase the cardinality of your shard key or change the distribution +of your shard key values, you can: + +- :ref:`refine your shard key ` by adding a suffix + field or fields to the existing key to increase cardinality +- :ref:`reshard your collection ` using a + different shard key with higher cardinality + +To only change the distribution of your shard key values, you can also +consider using :doc:`/core/hashed-sharding` to distribute your data more +evenly. + +For advice on choosing a shard key see +:ref:`sharding-shard-key-selection`. + +.. _sharding-troubleshooting-monotonicity: + +Uneven Load Distribution +------------------------ + +If your cluster is experiencing uneven load distribution, check if your +shard key increases :ref:`monotonically `. A shard +key that is a monotonically increasing field, leads to an uneven read +and write distribution. + +Consider an ``orders`` collection that is sharded on an ``order_id`` +field. The ``order_id`` is an integer which increases by one with each +order. + +- New documents are generally written to the same shard and chunk. The + shard and chunk that receive the writes are called *hot* shard and + *hot* chunk. The *hot* shard changes over time. When chunks are split, + the hot chunk moves to a different shard to optimize data + distribution. + +- If users are more likely to interact with recent orders, which are all + on the same shard, the shard that contains recent orders will receive + most of the traffic. + +If you have a monotonically increasing shard key, consider +:ref:`resharding your collection `. For advice on +choosing a shard key see :ref:`sharding-shard-key-selection`. + +If your data model requires sharding on a key that changes +monotonically, consider using :doc:`/core/hashed-sharding`. + + +.. _sharding-troubleshooting-scatter-gather: + +Decreased Query Performance Over Time +------------------------------------- + +If you are noticing decreased query performance over time, it is +possible that your cluster is performing :ref:`scatter-gather queries +`. + +To evaluate if your cluster is performing scatter-gather queries, check +if your most common queries include the shard key. + +If you include the shard key in your queries, check if your shard key is +hashed. With :doc:`/core/hashed-sharding`, documents are not stored in +ascending or descending order of the shard key field value. Performing +range based queries on the shard key value on data that is not stored in +ascending or descending order results in less performant scatter-gather +queries. If range based queries on your shard key are a common access +pattern, consider :ref:`resharding your collection +`. + +If you do not include the shard key in your most common queries, it is +possible that you could increase performance by :ref:`resharding your +collection `. For advice on choosing a shard key +see :ref:`sharding-shard-key-selection`. diff --git a/source/faq/diagnostics.txt b/source/faq/diagnostics.txt index 040b2ffb8b4..47ac6912a87 100644 --- a/source/faq/diagnostics.txt +++ b/source/faq/diagnostics.txt @@ -149,11 +149,10 @@ The two most important factors in maintaining a successful sharded cluster are: - :ref:`sufficient capacity to support current and future operations `. -You can prevent most issues encountered with sharding by ensuring that -you choose the best possible :term:`shard key` for your deployment and -ensure that you are always adding additional capacity to your cluster -well before the current resources become saturated. Continue reading -for specific issues you may encounter in a production environment. +While you can :ref:`change your shard key ` later, +it is important to carefully consider your shard key choice to avoid +scalability and perfomance issues. Continue reading for specific issues +you may encounter in a production environment. .. _sharding-troubleshooting-not-splitting: @@ -198,9 +197,9 @@ It's also possible that you have "hot chunks." In this case, you may be able to solve the problem by splitting and then migrating parts of these chunks. -In the worst case, you may have to consider re-sharding your data -and :ref:`choosing a different shard key ` -to correct this pattern. +You may have to consider :ref:`resharding your collection +` with a :ref:`different shard key +` to correct this pattern. What can prevent a sharded cluster from balancing? ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -260,5 +259,7 @@ consider the following options, depending on the nature of the impact: It's also possible that your shard key causes your application to direct all writes to a single shard. This kind of activity pattern can require the balancer to migrate most data soon after writing -it. Consider redeploying your cluster with a shard key that provides -better :ref:`write scaling `. +it. You may have to consider :ref:`resharding your collection +` with a :ref:`different shard key +` that provides better :ref:`write +scaling `. diff --git a/source/faq/sharding.txt b/source/faq/sharding.txt index cfef5a78a02..7696e3ec94b 100644 --- a/source/faq/sharding.txt +++ b/source/faq/sharding.txt @@ -23,7 +23,7 @@ the :doc:`/sharding` section in the manual, which provides an - :doc:`/core/sharding-data-partitioning` and :doc:`Chunk Migration Process ` - + - :doc:`/tutorial/troubleshoot-sharded-clusters` Is sharding appropriate for a new deployment? @@ -38,12 +38,19 @@ set is small provides *little advantage* . Can I select a different shard key after sharding a collection? --------------------------------------------------------------- -No. +Your options for changing a shard key depend on the version of MongoDB +that you are running: + +- Starting in MongoDB 5.0, you can :ref:`reshard a collection + ` by changing a document's shard key. +- Starting in MongoDB 4.4, you can :ref:`refine a shard key + ` by adding a suffix field or fields to the + existing shard key. +- In MongoDB 4.2 and earlier, the choice of shard key cannot be changed + after sharding. -There is no automatic support in MongoDB for choosing a different shard key -after sharding a collection. This reality underscores -the importance of choosing a good :ref:`shard key `. If you -*must* change a shard key after sharding a collection, the best option is to: +In MongoDB 4.2 and earlier, if you *must* change a shard key after +sharding a collection and cannot upgrade, the best option is to: - dump all data from MongoDB into an external format. @@ -56,14 +63,6 @@ the importance of choosing a good :ref:`shard key `. If you - restore the dumped data into MongoDB. -Although you cannot select a different shard key for a sharded -collection, starting in MongoDB 4.2, you can update a document's shard -key value unless the shard key field is the immutable ``_id`` field. -For details on updating the shard key values, see -:ref:`update-shard-key`. - -Before MongoDB 4.2, a document's shard key field value is immutable. - .. seealso:: :doc:`/core/sharding-shard-key` diff --git a/source/includes/limits-sharding-shardkey-document-immutable.rst b/source/includes/limits-sharding-shardkey-document-immutable.rst index 4a1e4de157c..58c7c204825 100644 --- a/source/includes/limits-sharding-shardkey-document-immutable.rst +++ b/source/includes/limits-sharding-shardkey-document-immutable.rst @@ -1,3 +1,4 @@ Starting in MongoDB 4.2, you can update a document's shard key value -unless the shard key field is the immutable ``_id`` field. Before -MongoDB 4.2, a document's shard key field value is immutable. \ No newline at end of file +unless the shard key field is the immutable ``_id`` field. In +MongoDB 4.2 and earlier, a document's shard key field value is +immutable. \ No newline at end of file diff --git a/source/includes/limits-sharding-shardkey-immutable.rst b/source/includes/limits-sharding-shardkey-immutable.rst index 3e0b3b0779f..c163b09e5d7 100644 --- a/source/includes/limits-sharding-shardkey-immutable.rst +++ b/source/includes/limits-sharding-shardkey-immutable.rst @@ -1,4 +1,4 @@ -If you must change a shard key: +In MongoDB 4.2 and earlier, to change a shard key: - Dump all data from MongoDB into an external format. diff --git a/source/includes/limits-sharding-shardkey-monotonic-throughput.rst b/source/includes/limits-sharding-shardkey-monotonic-throughput.rst index 4fc78781bb8..b6232099131 100644 --- a/source/includes/limits-sharding-shardkey-monotonic-throughput.rst +++ b/source/includes/limits-sharding-shardkey-monotonic-throughput.rst @@ -1,4 +1,4 @@ -For clusters with high insert volumes, a shard keys with +For clusters with high insert volumes, a shard key with monotonically increasing and decreasing keys can affect insert throughput. If your shard key is the ``_id`` field, be aware that the default values of the ``_id`` fields are :term:`ObjectIds diff --git a/source/includes/steps-reshard-a-collection.yaml b/source/includes/steps-reshard-a-collection.yaml new file mode 100644 index 00000000000..3be2c10a69b --- /dev/null +++ b/source/includes/steps-reshard-a-collection.yaml @@ -0,0 +1,192 @@ +title: Start the resharding operation. +level: 4 +ref: resharding-start +content: | + + While connected to the :binary:`~bin.mongos`, issue a + :dbcommand:`reshardCollection` command that specifies the collection + to be resharded and the new shard key: + + .. code-block:: javascript + + db.adminCommand({ + reshardCollection: ".", + key: + }) + + MongoDB sets the max number of seconds to block writes to two seconds + and begins the resharding operation. +--- +title: Monitor the resharding operation. +level: 4 +ref: resharding-monitor +content: | + + To monitor the resharding operation, you can use the + :pipeline:`$currentOp` pipeline stage: + + .. code-block:: javascript + + db.getSiblingDB("admin").aggregate([ + { $currentOp: { localOps: false } }, + { + $match: { + type: "op", + "originatingCommand.reshardCollection": "." + } + } + ]) + + .. note:: + + To see updated values, you need to continuously run the + preceeding pipeline. + + The :pipeline:`$currentOp` pipeline outputs: + + - ``totalOperationTimeElapsedSecs``: elapsed operation time in + seconds + - ``remainingOperationTimeEstimatedSecs``: estimate of the remaining + time to complete the resharding operation + + .. code-block:: javascript + + [ + { + shard: '', + type: 'op', + desc: 'ReshardingRecipientService | ReshardingDonorService | ReshardingCoordinatorService ', + op: 'command', + ns: '.', + originatingCommand: { + reshardCollection: '.', + key: , + unique: , + collation: { locale: 'simple' } + }, + totalOperationTimeElapsedSecs: , + remainingOperationTimeEstimatedSecs: , + ... + }, + ... + ] + +--- +title: Finish the resharding operation. +level: 4 +ref: resharding-finish +content: | + + Throughout the resharding process, the estimated time to complete the + resharding operation (``remainingOperationTimeEstimatedSecs``) + decreases. When the estimated time is below **two seconds**, MongoDB + blocks writes and completes the resharding operation. Until the + estimated time to complete the resharing operation is below two + seconds, the resharding operation does not block writes by default. + During the time period where writes are blocked your application + experiences an increase in latency. + + Once the resharding process has completed, the resharding command + returns ``ok: 1``. + + .. code-block:: javascript + + { + ok: 1, + '$clusterTime': { + clusterTime: , + signature: { + hash: Binary(Buffer.from("0000000000000000000000000000000000000000", "hex"), 0), + keyId: + } + }, + operationTime: + } + + To see whether the resharding operation completed successfully, check + the output of the :method:`sh.status()` method: + + .. code-block:: javascript + + sh.status() + + The :method:`sh.status()` method output contains a subsection for the + ``databases``. If resharding has completed successfully, the output + lists the new shard key for the collection: + + .. code-block:: javascript + + databases + [ + { + database: { + _id: '', + primary: '', + partitioned: true, + version: { + uuid: , + timestamp: , + lastMod: + } + }, + collections: { + '.': { + shardKey: , + unique: , + balancing: , + chunks: [], + tags: [] + } + } + } + ... + ] + + .. note:: + + If the resharded collection uses :atlas:`Atlas Search + `, the search index will become unavailable when + the resharding operation completes. You need to manually rebuild + the search index once the resharding operation completes. + + Block writes early to force resharding to complete + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + + You can manually force the resharding operation to complete by + issuing the :dbcommand:`commitReshardCollection` command. This is + useful if the current time estimate to complete the resharding + operation is an acceptable duration for your collection to block + writes. The :dbcommand:`commitReshardCollection` command blocks + writes early and forces the resharding operation to complete. The + command has the following syntax: + + .. code-block:: javascript + + db.adminCommand({ + commitReshardCollection: "." + }) + + Abort resharding operation + ~~~~~~~~~~~~~~~~~~~~~~~~~~ + + You can abort the resharding operation during any stage of the + resharding operation, even after running the + :dbcommand:`commitReshardCollection`, until shards have fully caught + up. + + For example, if the write unavailability duration estimate does not + decrease, you can abort the resharding operation with the + :dbcommand:`abortReshardCollection` command: + + .. code-block:: javascript + + db.adminCommand({ + abortReshardCollection: "." + }) + + After canceling the operation, you can retry the resharding + operation during a time window with lower write volume. If this is + not possible, :ref:`add more shards ` + before retrying. + +... diff --git a/source/includes/steps-shard-a-collection-ranged.yaml b/source/includes/steps-shard-a-collection-ranged.yaml index 1626bbf346b..6c2dc839902 100644 --- a/source/includes/steps-shard-a-collection-ranged.yaml +++ b/source/includes/steps-shard-a-collection-ranged.yaml @@ -57,23 +57,29 @@ title: "Determine the Shard Key" level: 4 ref: select-shard-key pre: | - Determine your :term:`shard key`. Your selection of shard key affects the + Determine your :term:`shard key`. Your selection of shard key affects the efficiency of sharding. See the selection considerations listed under :ref:`sharding-shard-key-selection` - - .. warning:: - You cannot change a shard key once you have enabled sharding - for a collection. + + .. note:: + + - Starting in MongoDB 5.0, you can :ref:`reshard a collection + ` by changing a document's shard key. + - Starting in MongoDB 4.4, you can :ref:`refine a shard key + ` by adding a suffix field or fields to the + existing shard key. + - In MongoDB 4.2 and earlier, the choice of shard key cannot be + changed after sharding. --- title: "Create the Shard Key Index" level: 4 ref: create-shard-key pre: | - If the collection already contains data, you must create an index on the + If the collection already contains data, you must create an index on the :term:`shard key` using the :method:`db.collection.createIndex()` method. - + If the collection is empty, then MongoDB creates the index as part of - the :method:`sh.shardCollection()` method. + the :method:`sh.shardCollection()` method. --- title: "Shard the Collection" level: 4 @@ -82,9 +88,9 @@ pre: | Shard a collection by issuing the :method:`sh.shardCollection()` method. action: - pre: | - The :method:`sh.shardCollection()` method takes the full namespace of + The :method:`sh.shardCollection()` method takes the full namespace of the collection in ``.`` format, and a document - containing the shard key pattern. + containing the shard key pattern. language: javascript code: | sh.shardCollection( ".", { : } ) diff --git a/source/reference/command.txt b/source/reference/command.txt index ef1cc20c885..085baf5cd5f 100644 --- a/source/reference/command.txt +++ b/source/reference/command.txt @@ -403,6 +403,12 @@ Sharding Commands - Description + * - :dbcommand:`abortReshardCollection` + + - Aborts a :ref:`resharding operation `. + + .. versionadded:: 5.0 + * - :dbcommand:`addShard` - Adds a :term:`shard` to a :term:`sharded cluster`. @@ -415,7 +421,7 @@ Sharding Commands - Returns information on whether the chunks of a sharded collection are balanced. - + .. versionadded:: 4.4 * - :dbcommand:`balancerStart` @@ -442,6 +448,19 @@ Sharding Commands - Removes orphaned data with shard key values outside of the ranges of the chunks owned by a shard. + * - :dbcommand:`cleanupReshardCollection` + + - Cleans up a failed :ref:`resharding operation `. + + .. versionadded:: 5.0 + + * - :dbcommand:`commitReshardCollection` + + - Forces a :ref:`resharding operation ` to + block writes and complete. + + .. versionadded:: 5.0 + * - :dbcommand:`enableSharding` - Enables sharding on a specific database. @@ -497,6 +516,13 @@ Sharding Commands - Removes the association between a shard and a :term:`zone`. Supports configuring :ref:`zones ` in sharded clusters. + * - :dbcommand:`reshardCollection` + + - Initiates a :ref:`resharding operation ` to change the + shard key for a collection, changing the distribution of your data. + + .. versionadded:: 5.0 + * - :dbcommand:`setShardVersion` - Internal command to sets the :term:`config server ` version. diff --git a/source/reference/command/abortReshardCollection.txt b/source/reference/command/abortReshardCollection.txt new file mode 100644 index 00000000000..f28333200d1 --- /dev/null +++ b/source/reference/command/abortReshardCollection.txt @@ -0,0 +1,60 @@ +====================== +abortReshardCollection +====================== + +.. default-domain:: mongodb + +.. contents:: On this page + :local: + :backlinks: none + :depth: 1 + :class: singlecol + +Definition +---------- + +.. dbcommand:: abortReshardCollection + + .. versionadded:: 5.0 + + During a :ref:`resharding operation `, you can + abort the operation with the :dbcommand:`abortReshardCollection` + command. + + The command has the following syntax: + + .. code-block:: javascript + + { + abortReshardCollection: "." + } + + You can abort a :ref:`resharding operation ` at + any point until the :ref:`commit phase + `. If the :ref:`resharding operation + ` has reached the :ref:`commit phase + ` before you run the + :dbcommand:`abortReshardCollection` command, the command returns an + error. + + The :mongosh:`MongoDB Shell ` provides a wrapper method + :method:`sh.abortReshardCollection()`. + +Example +------- + +Abort a Resharding Operation +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The following example aborts a running :ref:`resharding operation +` on the ``sales.orders`` collection: + +.. code-block:: javascript + + db.adminCommand({ + abortReshardCollection: "sales.orders" + }) + +.. seealso:: + + :ref:`sharding-resharding` diff --git a/source/reference/command/cleanupReshardCollection.txt b/source/reference/command/cleanupReshardCollection.txt new file mode 100644 index 00000000000..b4609758b15 --- /dev/null +++ b/source/reference/command/cleanupReshardCollection.txt @@ -0,0 +1,51 @@ +======================== +cleanupReshardCollection +======================== + +.. default-domain:: mongodb + +.. contents:: On this page + :local: + :backlinks: none + :depth: 1 + :class: singlecol + +Definition +---------- + +.. dbcommand:: cleanupReshardCollection + + .. versionadded:: 5.0 + + The :dbcommand:`cleanupReshardCollection` command cleans up metadata + of a failed :ref:`resharding operation `. You + only need to run this command if a primary failover occurred while you + ran a resharding operation. + + The command has the following syntax: + + .. code-block:: javascript + + { + cleanupReshardCollection: "." + } + + +Example +------- + +Clean up a Failed Resharding Operation +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The following example cleans up metadata of a failed :ref:`resharding +operation ` on the ``sales.orders`` collection: + +.. code-block:: javascript + + db.adminCommand({ + cleanupReshardCollection: "sales.orders" + }) + +.. seealso:: + + :ref:`sharding-resharding` diff --git a/source/reference/command/commitReshardCollection.txt b/source/reference/command/commitReshardCollection.txt new file mode 100644 index 00000000000..79335fc8408 --- /dev/null +++ b/source/reference/command/commitReshardCollection.txt @@ -0,0 +1,58 @@ +======================= +commitReshardCollection +======================= + +.. default-domain:: mongodb + +.. contents:: On this page + :local: + :backlinks: none + :depth: 1 + :class: singlecol + +Definition +---------- + +.. dbcommand:: commitReshardCollection + + .. versionadded:: 5.0 + + During a resharding operation, MongoDB does not block writes until + the estimated duration to complete the resharding operation is + below **two seconds**. + + If the current estimate is above two seconds but the time frame is + acceptable to you, you can finish resharding faster. The + :dbcommand:`commitReshardCollection` command blocks writes early and + forces the resharding operation to complete. + + The command has the following syntax: + + .. code-block:: javascript + + { + commitReshardCollection: "." + } + + The :mongosh:`MongoDB Shell ` provides a wrapper method + :method:`sh.commitReshardCollection()`. + +Example +------- + +Commit a Resharding Operation +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The following command forces the :ref:`resharding +operation ` on the ``sales.orders`` to block writes +and complete: + +.. code-block:: javascript + + db.adminCommand({ + commitReshardCollection: "sales.orders" + }) + +.. seealso:: + + :ref:`sharding-resharding` diff --git a/source/reference/command/nav-sharding.txt b/source/reference/command/nav-sharding.txt index 7cd64eac808..90b650dbdd0 100644 --- a/source/reference/command/nav-sharding.txt +++ b/source/reference/command/nav-sharding.txt @@ -20,6 +20,12 @@ Sharding Commands - Description + * - :dbcommand:`abortReshardCollection` + + - Aborts a :ref:`resharding operation `. + + .. versionadded:: 5.0 + * - :dbcommand:`addShard` - Adds a :term:`shard` to a :term:`sharded cluster`. @@ -32,7 +38,7 @@ Sharding Commands - Returns information on whether the chunks of a sharded collection are balanced. - + .. versionadded:: 4.4 * - :dbcommand:`balancerStart` @@ -59,6 +65,19 @@ Sharding Commands - Removes orphaned data with shard key values outside of the ranges of the chunks owned by a shard. + * - :dbcommand:`cleanupReshardCollection` + + - Cleans up a failed :ref:`resharding operation `. + + .. versionadded:: 5.0 + + * - :dbcommand:`commitReshardCollection` + + - Forces a :ref:`resharding operation ` to + block writes and complete. + + .. versionadded:: 5.0 + * - :dbcommand:`enableSharding` - Enables sharding on a specific database. @@ -114,6 +133,13 @@ Sharding Commands - Removes the association between a shard and a :term:`zone`. Supports configuring :ref:`zones ` in sharded clusters. + * - :dbcommand:`reshardCollection` + + - Initiates a :ref:`resharding operation ` to change the + shard key for a collection, changing the distribution of your data. + + .. versionadded:: 5.0 + * - :dbcommand:`setShardVersion` - Internal command to sets the :term:`config server ` version. @@ -149,9 +175,10 @@ Sharding Commands .. toctree:: - :titlesonly: - :hidden: + :titlesonly: + :hidden: + /reference/command/abortReshardCollection /reference/command/addShard /reference/command/addShardToZone /reference/command/balancerCollectionStatus @@ -161,6 +188,8 @@ Sharding Commands /reference/command/checkShardingIndex /reference/command/clearJumboFlag /reference/command/cleanupOrphaned + /reference/command/cleanupReshardCollection + /reference/command/commitReshardCollection /reference/command/enableSharding /reference/command/flushRouterConfig /reference/command/getShardMap @@ -174,6 +203,7 @@ Sharding Commands /reference/command/refineCollectionShardKey /reference/command/removeShard /reference/command/removeShardFromZone + /reference/command/reshardCollection /reference/command/setShardVersion /reference/command/shardCollection /reference/command/shardingState diff --git a/source/reference/command/reshardCollection.txt b/source/reference/command/reshardCollection.txt new file mode 100644 index 00000000000..9dcb9107387 --- /dev/null +++ b/source/reference/command/reshardCollection.txt @@ -0,0 +1,236 @@ +================= +reshardCollection +================= + +.. default-domain:: mongodb + +.. contents:: On this page + :local: + :backlinks: none + :depth: 1 + :class: singlecol + +Definition +---------- + +.. dbcommand:: reshardCollection + + .. versionadded:: 5.0 + + The :dbcommand:`reshardCollection` command changes the shard key for + a collection and changes the distribution of your data. + + The command has the following syntax: + + .. code-block:: javascript + + { + reshardCollection: ".", + key: , + unique: , + numInitialChunks: , + collation: { locale: "simple" }, + zones: [ + { + min: , + max: , + zone: | null + }, + ... + ] + } + + The command takes the following fields: + + .. list-table:: + :header-rows: 1 + :widths: 25 20 75 + + * - Field + + - Type + + - Description + + * - ``reshardCollection`` + + - string + + - The :term:`namespace` of the collection to be resharded. Takes + the form ``.``. + + * - ``key`` + + - document + + - The document that specifies the new field or fields to use as the + :doc:`shard key `. + + ``{ : <1|"hashed">, ... }`` + + Set the field values to either: + + - ``1`` for :doc:`ranged based sharding ` + + - ``"hashed"`` to specify a + :ref:`hashed shard key `. + + See also :ref:`sharding-shard-key-indexes` + + * - ``unique`` + + - boolean + + - Optional. Specify whether there is a :doc:`uniqueness + ` constraint on the shard key. Only + ``false`` is supported. Defaults to ``false``. + + * - ``numInitialChunks`` + + - integer + + - Optional. Specifies the initial number of chunks to create + across all shards in the cluster when resharding a collection. + The default is the number of chunks that exist for the + collection under the current shard key pattern. MongoDB will + then create and balance chunks across the cluster. The + ``numInitialChunks`` must result in less than ``8192`` per shard. + + * - ``collation`` + + - document + + - Optional. If the collection specified to ``reshardCollection`` + has a default :doc:`collation `, + you *must* include a collation document with + ``{ locale : "simple" }``, or the ``reshardCollection`` + command fails. + + * - ``zones`` + + - array + + - Optional. To maintain or add :ref:`zones `, + specify the zones for your collection in an array. + + The :mongosh:`MongoDB Shell ` provides a wrapper method + :method:`sh.reshardCollection()`. + +Resharding Process +------------------ + +During the resharding process, there are two roles a shard may play: + +- **Donors** are shards which currently own chunks of the sharded + collection. +- **Recipients** are shards which would own chunks of the sharded + collection according to the new shard key and zones. + +A shard may play both the role of a donor and a recipient concurrently. +Unless zones are being used, the set of donor shards is the same as the +set of recipient shards. + +The config server primary is always chosen as the resharding +coordinator, responsible for initiating each phase of the process. + +Initialization Phase +~~~~~~~~~~~~~~~~~~~~ + +During the initialization phase: + +- The balancer determines the new data distribution for the sharded + collection. + +Index Phase +~~~~~~~~~~~ + +During the index phase: + +- Each shard recipient creates a new, empty sharded collection with the + same collection options as the existing sharded collection. This new + sharded collection is the target for where recipient shards write the + new data. +- Each shard recipient builds the necessary new indexes. These include + all existing indexes on the sharded collection and an index compatible + with the new shard key pattern if such an index doesn’t already exist on + the sharded collection. + +Clone, Apply, and Catch-up Phase +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +During the clone, apply, and catch-up phase: + +- Each shard recipient clones an initial copy of the documents it would + own under the new shard key. +- Each shard recipient begins applying oplog entries from operations + that happened after the recipient cloned the data. +- When the estimate for the time remaining to complete the resharding + operation is under **two seconds**, the resharding coordinator blocks + writes for the collection. + + .. note:: + + If desired, you can manually force the resharding operation to + complete by issuing the :dbcommand:`commitReshardCollection` + command. This is useful if the current time estimate to complete + the resharding operation is an acceptable duration for your + collection to block writes. The + :dbcommand:`commitReshardCollection` command blocks writes early + and forces the resharding operation to complete. During the time + period where writes are blocked your application experiences an + increase in latency. + +.. _resharding-commit-phase-command: + +Commit Phase +~~~~~~~~~~~~ + +- Once the resharding process reaches the commit phase, it may no longer + be aborted with :dbcommand:`abortReshardCollection`. +- When all shards have reached strict consistency, the resharding + coordinator commits the resharding operation and installs the new + routing table. +- The resharding coordinator instructs each donor and recipient shard + primary, independently, to rename the temporary sharded collection. + The temporary collection becomes the new resharded collection. +- Each donor shard drops the old sharded collection. + + .. seealso:: + + :ref:`sharding-resharding` + +Example +------- + +Reshard a Collection +~~~~~~~~~~~~~~~~~~~~ + +The following example reshards the ``sales.orders`` collection with the +new shard key ``{ order_id: 1 }``: + +.. code-block:: javascript + + db.adminCommand({ + reshardCollection: "sales.orders", + key: { order_id: 1 } + }) + +MongoDB returns the following: + +.. code-block:: javascript + + { + ok: 1, + '$clusterTime': { + clusterTime: Timestamp(1, 1624887954), + signature: { + hash: Binary(Buffer.from("0000000000000000000000000000000000000000", "hex"), 0), + keyId: 0 + } + }, + operationTime: Timestamp(1, 1624887947) + } + +.. seealso:: + + :ref:`sharding-resharding` diff --git a/source/reference/command/shardCollection.txt b/source/reference/command/shardCollection.txt index 51938e1baaa..b4c7489cd16 100644 --- a/source/reference/command/shardCollection.txt +++ b/source/reference/command/shardCollection.txt @@ -177,8 +177,6 @@ Considerations Use ~~~ -You can only shard a collection once. - Do **not** run more than one :dbcommand:`shardCollection` command on the same collection at the same time. @@ -187,16 +185,14 @@ the same collection at the same time. Shard Keys ~~~~~~~~~~ -Choosing the best shard key to effectively distribute load among your -shards requires some planning. +While you can :ref:`change your shard key ` +later, it is important to carefully consider your shard key choice to +avoid scalability and perfomance issues. -- Starting in MongoDB 4.4, you can refine a collection's shard key by - adding a suffix field or fields to the existing key. - -- Starting in MongoDB 4.2, you can update a document's shard key value - (unless the shard key field is the immutable ``_id`` field). +.. seealso:: -For more information, see :ref:`sharding-shard-key`. + - :ref:`sharding-shard-key-selection` + - :ref:`sharding-shard-key` .. _hashed-shard-keys: diff --git a/source/reference/limits.txt b/source/reference/limits.txt index d5aa20d7a10..ed0b2a2371d 100644 --- a/source/reference/limits.txt +++ b/source/reference/limits.txt @@ -319,16 +319,16 @@ Shard Key Limitations .. limit:: Shard Key Selection is Immutable in MongoDB 4.2 and Earlier - .. note:: Changed in Version 4.4 - - - Starting in MongoDB 4.4, you can refine a collection's shard key by - adding a suffix field or fields to the existing key. See - :dbcommand:`refineCollectionShardKey`. - - In MongoDB 4.2 and earlier, once you shard a collection, the - selection of the shard key is immutable; i.e. you cannot select a - different shard key for that collection. + Your options for changing a shard key depend on the version of + MongoDB that you are running: + + - Starting in MongoDB 5.0, you can :ref:`reshard a collection + ` by changing a document's shard key. + - Starting in MongoDB 4.4, you can :ref:`refine a shard key + ` by adding a suffix field or fields to the + existing shard key. + - In MongoDB 4.2 and earlier, the choice of shard key cannot be + changed after sharding. .. include:: /includes/limits-sharding-shardkey-immutable.rst diff --git a/source/reference/method.txt b/source/reference/method.txt index 1aac8f1ee2d..b6ac16b5f29 100644 --- a/source/reference/method.txt +++ b/source/reference/method.txt @@ -931,6 +931,12 @@ Sharding - Description + * - :method:`sh.abortReshardCollection()` + + - Aborts a :ref:`resharding operation `. + + .. versionadded:: 5.0 + * - :method:`sh.addShard()` - Adds a :term:`shard` to a sharded cluster. @@ -948,12 +954,19 @@ Sharding - In MongoDB 3.4, this method aliases to :method:`sh.updateZoneKeyRange()`. * - :method:`sh.balancerCollectionStatus()` - + - Returns information on whether the chunks of a sharded collection are balanced. - + .. versionadded:: 4.4 + * - :method:`sh.commitReshardCollection()` + + - Forces a :ref:`resharding operation ` to + block writes and complete. + + .. versionadded:: 5.0 + * - :method:`sh.disableBalancing()` - Disable balancing on a single collection in a sharded database. Does not affect balancing of other collections in a sharded cluster. @@ -1010,6 +1023,13 @@ Sharding - Removes the association between a shard and a zone. Use to manage :ref:`zone sharding `. + * - :method:`sh.reshardCollection()` + + - Initiates a :ref:`resharding operation ` to change the + shard key for a collection, changing the distribution of your data. + + .. versionadded:: 5.0 + * - :method:`sh.setBalancerState()` - Enables or disables the :term:`balancer` which migrates :term:`chunks ` between :term:`shards `. diff --git a/source/reference/method/js-sharding.txt b/source/reference/method/js-sharding.txt index df2026fc6b9..b0a29f76cd5 100644 --- a/source/reference/method/js-sharding.txt +++ b/source/reference/method/js-sharding.txt @@ -20,6 +20,12 @@ Sharding Methods - Description + * - :method:`sh.abortReshardCollection()` + + - Aborts a :ref:`resharding operation `. + + .. versionadded:: 5.0 + * - :method:`sh.addShard()` - Adds a :term:`shard` to a sharded cluster. @@ -43,6 +49,13 @@ Sharding Methods .. versionadded:: 4.4 + * - :method:`sh.commitReshardCollection()` + + - Forces a :ref:`resharding operation ` to + block writes and complete. + + .. versionadded:: 5.0 + * - :method:`sh.disableBalancing()` - Disable balancing on a single collection in a sharded database. Does not affect balancing of other collections in a sharded cluster. @@ -99,6 +112,13 @@ Sharding Methods - Removes the association between a shard and a zone. Use to manage :ref:`zone sharding `. + * - :method:`sh.reshardCollection()` + + - Initiates a :ref:`resharding operation ` to change the + shard key for a collection, changing the distribution of your data. + + .. versionadded:: 5.0 + * - :method:`sh.setBalancerState()` - Enables or disables the :term:`balancer` which migrates :term:`chunks ` between :term:`shards `. @@ -149,14 +169,16 @@ Sharding Methods .. toctree:: - :titlesonly: - :hidden: + :titlesonly: + :hidden: + /reference/method/sh.abortReshardCollection /reference/method/sh.addShard /reference/method/sh.addShardTag /reference/method/sh.addShardToZone /reference/method/sh.addTagRange /reference/method/sh.balancerCollectionStatus + /reference/method/sh.commitReshardCollection /reference/method/sh.disableBalancing /reference/method/sh.enableBalancing /reference/method/sh.disableAutoSplit @@ -171,6 +193,7 @@ Sharding Methods /reference/method/sh.moveChunk /reference/method/sh.removeShardTag /reference/method/sh.removeShardFromZone + /reference/method/sh.reshardCollection /reference/method/sh.setBalancerState /reference/method/sh.shardCollection /reference/method/sh.splitAt diff --git a/source/reference/method/sh.abortReshardCollection.txt b/source/reference/method/sh.abortReshardCollection.txt new file mode 100644 index 00000000000..d53a27763ec --- /dev/null +++ b/source/reference/method/sh.abortReshardCollection.txt @@ -0,0 +1,83 @@ +=========================== +sh.abortReshardCollection() +=========================== + +.. default-domain:: mongodb + +.. contents:: On this page + :local: + :backlinks: none + :depth: 1 + :class: singlecol + +Definition +---------- + +.. method:: sh.abortReshardCollection(namespace) + + .. versionadded:: 5.0 + + During a :ref:`resharding operation `, you can + abort the operation with the :method:`sh.abortReshardCollection()` + method. + + You can abort a :ref:`resharding operation ` at + any point until the :ref:`commit phase + `. If the + :ref:`resharding operation ` has reached the + :ref:`commit phase ` before you run + the :method:`sh.abortReshardCollection()` method, the method returns + an error. + + The :mongosh:`MongoDB Shell ` method + :method:`sh.abortReshardCollection()` wraps the + :dbcommand:`abortReshardCollection` command. + +Syntax +------ + +The :method:`sh.abortReshardCollection()` method has the following syntax: + +.. code-block:: javascript + + sh.abortReshardCollection( ) + +Parameter +~~~~~~~~~ + +The :method:`sh.abortReshardCollection()` method takes the following +parameter: + +.. list-table:: + :header-rows: 1 + :widths: 20 20 60 + + * - Parameter + - Type + - Description + + * - :ref:`namespace ` + + - String + + - .. _method-abortReshardCollection-namespace: + + The name of the collection to shard in the form + ``"."``. + +Example +------- + +Abort a Resharding Operation +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The following example aborts a running :ref:`resharding operation +` on the ``sales.orders`` collection: + +.. code-block:: javascript + + sh.abortReshardCollection("sales.orders") + +.. seealso:: + + :ref:`sharding-resharding` diff --git a/source/reference/method/sh.commitReshardCollection.txt b/source/reference/method/sh.commitReshardCollection.txt new file mode 100644 index 00000000000..6525dd31ac7 --- /dev/null +++ b/source/reference/method/sh.commitReshardCollection.txt @@ -0,0 +1,82 @@ +============================ +sh.commitReshardCollection() +============================ + +.. default-domain:: mongodb + +.. contents:: On this page + :local: + :backlinks: none + :depth: 1 + :class: singlecol + +Definition +---------- + +.. method:: sh.commitReshardCollection(namespace) + + .. versionadded:: 5.0 + + During a resharding operation, MongoDB does not block writes until + the estimated duration to complete the resharding operation is + below **two seconds**. + + If the current estimate is above two seconds but the time frame is + acceptable to you, you can finish resharding faster. The + :method:`sh.commitReshardCollection()` method blocks writes early and + forces the resharding operation to complete. + + The :mongosh:`MongoDB Shell ` method + :method:`sh.commitReshardCollection()` wraps the + :dbcommand:`commitReshardCollection` command. + +Syntax +------ + +The :method:`sh.commitReshardCollection()` method has the following +syntax: + +.. code-block:: javascript + + sh.commitReshardCollection( ) + +Parameter +~~~~~~~~~ + +The :method:`sh.commitReshardCollection()` method takes the following +parameter: + +.. list-table:: + :header-rows: 1 + :widths: 20 20 60 + + * - Parameter + - Type + - Description + + * - :ref:`namespace ` + + - String + + - .. _method-commitReshardCollection-namespace: + + The name of the collection to shard in the form + ``"."``. + +Example +------- + +Commit a Resharding Operation +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The following command forces the :ref:`resharding operation +` on the ``sales.orders`` to block writes and +complete: + +.. code-block:: javascript + + sh.commitReshardCollection("sales.orders") + +.. seealso:: + + :ref:`sharding-resharding` diff --git a/source/reference/method/sh.reshardCollection.txt b/source/reference/method/sh.reshardCollection.txt new file mode 100644 index 00000000000..b5c2174cc86 --- /dev/null +++ b/source/reference/method/sh.reshardCollection.txt @@ -0,0 +1,241 @@ +====================== +sh.reshardCollection() +====================== + +.. default-domain:: mongodb + +.. contents:: On this page + :local: + :backlinks: none + :depth: 1 + :class: singlecol + +Definition +---------- + +.. method:: sh.reshardCollection(namespace, key, unique, options) + + .. versionadded:: 5.0 + + The :method:`sh.reshardCollection()` method changes the shard key for + a collection and changes the distribution of your data. + + :method:`sh.reshardCollection()` takes the following arguments: + + .. list-table:: + :header-rows: 1 + :widths: 20 20 80 + + * - Parameter + + - Type + + - Description + + * - ``namespace`` + + - string + + - The :term:`namespace` of the collection to shard in the form + ``"."``. + + * - ``key`` + + - document + + - The document that specifies the new field or fields to use as the + :doc:`shard key `. + + ``{ : <1|"hashed">, ... }`` + + Set the field values to either: + + - ``1`` for :doc:`ranged based sharding ` + + - ``"hashed"`` to specify a + :ref:`hashed shard key `. + + See also :ref:`sharding-shard-key-indexes` + + * - ``unique`` + + - boolean + + - Optional. Specify whether there is a :doc:`uniqueness + ` constraint on the shard key. Only + ``false`` is supported. Defaults to ``false``. + + * - ``options`` + + - document + + - Optional. A document containing optional fields, including + ``numInitialChunks``, ``collation`` and ``zones``. + + +The ``options`` argument supports the following options: + +.. list-table:: + :header-rows: 1 + :widths: 20 20 80 + + * - Parameter + + - Type + + - Description + + * - ``numInitialChunks`` + + - integer + + - Optional. Specifies the initial number of chunks to create + across all shards in the cluster when resharding a collection. + The default is the number of chunks that exist for the + collection under the current shard key pattern. MongoDB will + then create and balance chunks across the cluster. The + ``numInitialChunks`` must result in less than ``8192`` per shard. + + * - ``collation`` + + - document + + - Optional. If the collection specified to ``reshardCollection`` + has a default :doc:`collation `, + you *must* include a collation document with + ``{ locale : "simple" }``, or the ``reshardCollection`` + command fails. + + * - ``zones`` + + - array + + - Optional. To maintain or add :ref:`zones `, + specify the zones for your collection in an array: + + .. code-block:: javascript + + [ + { + min: , + max: , + zone: | null + }, + ... + ] + + +Resharding Process +------------------ + +During the resharding process, there are two roles a shard may play: + +- **Donors** are shards which currently own chunks of the sharded + collection. +- **Recipients** are shards which would own chunks of the sharded + collection according to the new shard key and zones. + +A shard may play both the role of a donor and a recipient concurrently. +Unless zones are being used, the set of donor shards is the same as the +set of recipient shards. + +The config server primary is always chosen as the resharding +coordinator, responsible for initiating each phase of the process. + +Initialization Phase +~~~~~~~~~~~~~~~~~~~~ + +During the initialization phase: + +- The balancer determines the new data distribution for the sharded + collection. + +Index Phase +~~~~~~~~~~~ + +During the index phase: + +- Each shard recipient creates a new, empty sharded collection with the + same collection options as the existing sharded collection. This new + sharded collection is the target for where recipient shards write the + new data. +- Each shard recipient builds the necessary new indexes. These include + all existing indexes on the sharded collection and an index compatible + with the new shard key pattern if such an index doesn’t already exist on + the sharded collection. + +Clone, Apply, and Catch-up Phase +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +During the clone, apply, and catch-up phase: + +- Each shard recipient clones an initial copy of the documents it would + own under the new shard key. +- Each shard recipient begins applying oplog entries from operations + that happened after the recipient cloned the data. +- When the estimate for the time remaining to complete the resharding + operation is under **two seconds**, the resharding coordinator blocks + writes for the collection. + + .. note:: + + If desired, you can manually force the resharding operation to + complete by issuing the :method:`sh.commitReshardCollection()` + method. This is useful if the current time estimate to complete the + resharding operation is an acceptable duration for your collection + to block writes. The :method:`sh.commitReshardCollection()` method + blocks writes early and forces the resharding operation to + complete. During the time period where writes are blocked your + application experiences an increase in latency. + +.. _resharding-commit-phase-method: + +Commit Phase +~~~~~~~~~~~~ + +- Once the resharding process reaches the commit phase, it may no longer + be aborted with :method:`sh.abortReshardCollection()`. +- When all shards have reached strict consistency, the resharding + coordinator commits the resharding operation and installs the new + routing table. +- The resharding coordinator instructs each donor and recipient shard + primary, independently, to rename the temporary sharded collection. + The temporary collection becomes the new resharded collection. +- Each donor shard drops the old sharded collection. + + .. seealso:: + + :ref:`sharding-resharding` + +Example +------- + +Reshard a Collection +~~~~~~~~~~~~~~~~~~~~ + +The following example reshards the ``sales.orders`` collection with the +new shard key ``{ order_id: 1 }``: + +.. code-block:: javascript + + sh.reshardCollection("sales.orders", { order_id: 1 }) + +MongoDB returns the following: + +.. code-block:: javascript + + { + ok: 1, + '$clusterTime': { + clusterTime: Timestamp(1, 1624887954), + signature: { + hash: Binary(Buffer.from("0000000000000000000000000000000000000000", "hex"), 0), + keyId: 0 + } + }, + operationTime: Timestamp(1, 1624887947) + } + +.. seealso:: + + :ref:`sharding-resharding` diff --git a/source/reference/method/sh.shardCollection.txt b/source/reference/method/sh.shardCollection.txt index ae69cae7fba..e7d1c2108b3 100644 --- a/source/reference/method/sh.shardCollection.txt +++ b/source/reference/method/sh.shardCollection.txt @@ -189,16 +189,14 @@ Considerations Shard Keys ~~~~~~~~~~ -Choosing the best shard key to effectively distribute load among your -shards requires some planning. +While you can :ref:`change your shard key ` +later, it is important to carefully consider your shard key choice to +avoid scalability and perfomance issues. -- Starting in MongoDB 4.4, you can refine a collection's shard key by - adding a suffix field or fields to the existing key. - -- Starting in MongoDB 4.2, you can update a document's shard key value - (unless the shard key field is the immutable ``_id`` field). +.. seealso:: -For more information, see :ref:`sharding-shard-key`. + - :ref:`sharding-shard-key` + - :ref:`sharding-shard-key-selection` Hashed Shard Keys ~~~~~~~~~~~~~~~~~ diff --git a/source/reference/sharding.txt b/source/reference/sharding.txt index c47550d6615..eaaebe9d845 100644 --- a/source/reference/sharding.txt +++ b/source/reference/sharding.txt @@ -21,6 +21,12 @@ Sharding Methods in the ``mongo`` Shell - Description + * - :method:`sh.abortReshardCollection()` + + - Aborts a :ref:`resharding operation `. + + .. versionadded:: 5.0 + * - :method:`sh.addShard()` - Adds a :term:`shard` to a sharded cluster. @@ -44,6 +50,13 @@ Sharding Methods in the ``mongo`` Shell .. versionadded:: 4.4 + * - :method:`sh.commitReshardCollection()` + + - Forces a :ref:`resharding operation ` to + block writes and complete. + + .. versionadded:: 5.0 + * - :method:`sh.disableBalancing()` - Disable balancing on a single collection in a sharded database. Does not affect balancing of other collections in a sharded cluster. @@ -100,6 +113,13 @@ Sharding Methods in the ``mongo`` Shell - Removes the association between a shard and a zone. Use to manage :ref:`zone sharding `. + * - :method:`sh.reshardCollection()` + + - Initiates a :ref:`resharding operation ` to change the + shard key for a collection, changing the distribution of your data. + + .. versionadded:: 5.0 + * - :method:`sh.setBalancerState()` - Enables or disables the :term:`balancer` which migrates :term:`chunks ` between :term:`shards `. @@ -163,6 +183,12 @@ The following database commands support :term:`sharded clusters - Description + * - :dbcommand:`abortReshardCollection` + + - Aborts a :ref:`resharding operation `. + + .. versionadded:: 5.0 + * - :dbcommand:`addShard` - Adds a :term:`shard` to a :term:`sharded cluster`. @@ -202,6 +228,19 @@ The following database commands support :term:`sharded clusters - Removes orphaned data with shard key values outside of the ranges of the chunks owned by a shard. + * - :dbcommand:`cleanupReshardCollection` + + - Cleans up a failed :ref:`resharding operation `. + + .. versionadded:: 5.0 + + * - :dbcommand:`commitReshardCollection` + + - Forces a :ref:`resharding operation ` to + block writes and complete. + + .. versionadded:: 5.0 + * - :dbcommand:`enableSharding` - Enables sharding on a specific database. @@ -250,6 +289,13 @@ The following database commands support :term:`sharded clusters - Removes the association between a shard and a :term:`zone`. Supports configuring :ref:`zones ` in sharded clusters. + * - :dbcommand:`reshardCollection` + + - Initiates a :ref:`resharding operation ` to change the + shard key for a collection, changing the distribution of your data. + + .. versionadded:: 5.0 + * - :dbcommand:`setShardVersion` - Internal command to sets the :term:`config server ` version. diff --git a/source/release-notes/5.0-downgrade-sharded-cluster.txt b/source/release-notes/5.0-downgrade-sharded-cluster.txt index 9254c75b610..5336f505913 100644 --- a/source/release-notes/5.0-downgrade-sharded-cluster.txt +++ b/source/release-notes/5.0-downgrade-sharded-cluster.txt @@ -34,10 +34,23 @@ Prerequisites To downgrade from |newversion| to |oldversion|, you must remove incompatible features that are persisted and/or update incompatible -configuration settings. +configuration settings. + +.. note:: + + If you ran a :ref:`resharding operation ` and a + primary failover occurred during the resharding operation, you must + run :dbcommand:`cleanupReshardCollection` before downgrading the + ``featureCompatibilityVersion`` of your sharded cluster. + + If you started a :ref:`resharding operation `, + and it is still running while you downgrade the + ``featureCompatibilityVersion`` of your sharded cluster, the + resharding operation will abort. + .. |target| replace:: :binary:`~bin.mongos` instance - + .. _5.0-downgrade-feature-compatibility-sharded-cluster: 1. Downgrade Feature Compatibility Version (fCV) diff --git a/source/release-notes/5.0.txt b/source/release-notes/5.0.txt index fd7ab0c0826..06ca1aee6ed 100644 --- a/source/release-notes/5.0.txt +++ b/source/release-notes/5.0.txt @@ -246,6 +246,17 @@ should permit when using TLS 1.3 encryption. Sharded Clusters ---------------- +Resharding +~~~~~~~~~~ + +The ideal shard key allows MongoDB to distribute documents evenly +throughout the cluster while facilitating common query patterns. A +suboptimal shard key can lead to performance or scaling issues due to +uneven data distribution. Starting in MongoDB 5.0, you can use the +:dbcommand:`reshardCollection` command to :ref:`change the shard key +` for a collection to change the distribution of +your data across your cluster. + ``currentOp`` Reports Ongoing Resharding Operations ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ diff --git a/source/sharding.txt b/source/sharding.txt index d843d3ea273..50597500495 100644 --- a/source/sharding.txt +++ b/source/sharding.txt @@ -110,10 +110,11 @@ consists of a field or multiple fields in the documents. You select the shard key when :ref:`sharding a collection `. -- Starting in MongoDB 4.4, you can refine a collection's shard key by - adding a suffix field or fields to the existing key. See - :dbcommand:`refineCollectionShardKey` for details. - +- Starting in MongoDB 5.0, you can :ref:`reshard a collection + ` by changing a collection's shard key. +- Starting in MongoDB 4.4, you can :ref:`refine a shard key + ` by adding a suffix field or fields to the existing + shard key. - In MongoDB 4.2 and earlier, the choice of shard key cannot be changed after sharding. @@ -145,8 +146,9 @@ infrastructure can be bottlenecked by the choice of shard key. The choice of shard key and its backing index can also affect the :ref:`sharding strategy ` that your cluster can use. -See :ref:`sharding-shard-key-selection` documentation for more -information. +.. seealso:: + + :ref:`sharding-shard-key-selection` Chunks ------ @@ -163,7 +165,9 @@ shards in the cluster, a :doc:`balancer ` runs in the background to migrate :term:`chunks` across the :term:`shards` . -See :doc:`/core/sharding-data-partitioning` for more information. +.. seealso:: + + :doc:`/core/sharding-data-partitioning` Advantages of Sharding ---------------------- @@ -186,7 +190,7 @@ operations` are generally more efficient than Starting in MongoDB 4.4, :binary:`~bin.mongos` can support :ref:`hedged reads ` to minimize latencies. - + Storage Capacity ~~~~~~~~~~~~~~~~ @@ -216,13 +220,16 @@ careful planning, execution, and maintenance. .. include:: /includes/fact-cannot-unshard-collection.rst -Careful consideration in choosing the shard key is necessary for -ensuring cluster performance and efficiency. See -:ref:`sharding-shard-key-selection`. +While you can :ref:`reshard your collection ` +later, it is important to carefully consider your shard key choice to +avoid scalability and perfomance issues. + +.. seealso:: + + :ref:`sharding-shard-key-selection` -Sharding has certain :ref:`operational requirements and -restrictions `. See -:doc:`/core/sharded-cluster-requirements` for more information. +To understand the operational requirements and restrictions for sharding +your collection, see :doc:`/core/sharded-cluster-requirements`. If queries do *not* include the shard key or the prefix of a :term:`compound ` shard key, :binary:`~bin.mongos` performs @@ -259,8 +266,8 @@ perform read or write operations. .. include:: /images/sharded-cluster-mixed.rst You can connect to a :binary:`~bin.mongos` the same way you connect to a -:binary:`~bin.mongod`, such as via the :binary:`~bin.mongo` shell or a MongoDB -:driver:`driver `. +:binary:`~bin.mongod` using the :mongosh:`MongoDB Shell ` or a +MongoDB :driver:`driver `. .. _sharding-strategy: diff --git a/source/tutorial/clear-jumbo-flag.txt b/source/tutorial/clear-jumbo-flag.txt index 6ca77a43a84..6df403ffd28 100644 --- a/source/tutorial/clear-jumbo-flag.txt +++ b/source/tutorial/clear-jumbo-flag.txt @@ -44,8 +44,9 @@ In some instances, MongoDB cannot split the no-longer ``jumbo`` chunk, such as a chunk with a range of single shard key value. As such, you cannot split the chunk to clear the flag. -In such cases, you can either refine the shard key so that the chunk -can become divisible or manually clear the flag. +In such cases, you can either :ref:`change the shard key +` so that the chunk can become divisible or manually +clear the flag. Refine the Shard Key ```````````````````` diff --git a/source/tutorial/troubleshoot-sharded-clusters.txt b/source/tutorial/troubleshoot-sharded-clusters.txt index 0fcd127cb9a..4294a34fafc 100644 --- a/source/tutorial/troubleshoot-sharded-clusters.txt +++ b/source/tutorial/troubleshoot-sharded-clusters.txt @@ -105,21 +105,16 @@ warning will repeat until all the :binary:`~bin.mongos` instances refresh their caches. To force an instance to refresh its cache, run the :dbcommand:`flushRouterConfig` command. -Shard Keys and Cluster Availability ------------------------------------ +Shard Keys +---------- -The most important consideration when choosing a :term:`shard key` -are: +To troubleshoot a shard key, see +:ref:`shardkey-troubleshoot-shard-keys`. -- to ensure that MongoDB will be able to distribute data evenly among - shards, and +Cluster Availability +-------------------- -- to scale writes across the cluster, and - -- to ensure that :binary:`~bin.mongos` can isolate most queries to a specific - :binary:`~bin.mongod`. - -Furthermore: +To ensure cluster availability: - Each shard should be a :term:`replica set`, if a specific :binary:`~bin.mongod` instance fails, the replica set members will elect @@ -127,17 +122,12 @@ Furthermore: entire shard is unreachable or fails for some reason, that data will be unavailable. -- If the shard key allows the :binary:`~bin.mongos` to isolate most - operations to a single shard, then the failure of a single shard - will only render *some* data unavailable. - -- If your shard key distributes data required for every operation - throughout the cluster, then the failure of the entire shard will - render the entire cluster unavailable. - -In essence, this concern for reliability simply underscores the -importance of choosing a shard key that isolates query operations to a -single shard. +- The shard key should allow the :binary:`~bin.mongos` to isolate most + operations to a single shard. If operations can be processed by a + single shard, the failure of a single shard will only render *some* + data unavailable. If operations need to access all shards for queries, + the failure of a single shard will render the entire cluster + unavailable. .. _config-database-string-error: