
Conversation

nixpanic

What type of PR is this?

Documentation change, setting expectations of the capacity_range sizes in CreateVolume and other RPCs.

What this PR does / why we need it:

Users have been confused about provisioned volumes that were slightly smaller than what they requested.

Which issue(s) this PR fixes:

Fixes #338

Special notes for your reviewer:

A similar note was added through kubernetes/website#50146.

Does this PR introduce an API-breaking change?:

none

spec.md Outdated
If a CO requests a volume to be created from an existing snapshot or volume, and the requested size of the volume is larger than the original snapshot (or cloned volume), the Plugin MUST either refuse such a call with an `OUT_OF_RANGE` error or provide a volume that, when presented to a workload by the `NodePublish` call, has the requested (larger) size and contains the data from the snapshot (or original volume).
Explicitly, it is the responsibility of the Plugin to resize the filesystem of the newly created volume at (or before) the `NodePublish` call, if the volume has a `VolumeCapability` access type of `MountVolume` and a filesystem resize is required in order to provision the requested capacity.

NOTE: The requested and reported sizes for a volume refer to the raw volume size that the SP allocated for the volume. Volumes with an `access_type` of `MountVolume` MAY have a slightly reduced available capacity due to the metadata that the filesystem stores on the volume. The underlying volume when used in `access_mode` with a value of `BlockVolume` MUST match the specified `capacity_range` request.
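As a rough illustration of the `capacity_range` contract described in the spec text above, here is a minimal sketch of how an SP might pick a provisioned size. The function name and the allocation-granularity parameter are hypothetical, not part of the CSI API; only the `OUT_OF_RANGE` behavior comes from the spec.

```python
# Hypothetical sketch of size selection against a CSI capacity_range.
# choose_capacity() and `granularity` are illustrative, not CSI names.
def choose_capacity(required_bytes, limit_bytes, granularity):
    """Round required_bytes up to the SP's allocation granularity,
    staying within limit_bytes (0 means "no limit" in CSI)."""
    size = ((required_bytes + granularity - 1) // granularity) * granularity
    if limit_bytes and size > limit_bytes:
        # No allocatable size fits the range; CSI says return OUT_OF_RANGE.
        raise ValueError("OUT_OF_RANGE: no allocatable size fits capacity_range")
    return size

GiB = 1024 ** 3
# A 2 GiB request on an SP that allocates in 4 MiB extents is already aligned:
print(choose_capacity(2 * GiB, 0, 4 * 1024 * 1024))
```

Note that this only chooses the raw volume size; per the NOTE above, the usable capacity of a `MountVolume` may still end up smaller once a filesystem is created on it.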
Contributor

This is a useful note, but I don't agree with the "slightly" characterization. In reality it may be significantly less, and the sad truth is we don't enforce any limit. We should use more precise language here. Perhaps:

NOTE: The requested and reported sizes for a volume refer to the raw volume size that the SP allocated for the volume. Volumes with an access_type of MountVolume will have less available capacity due to the overhead of filesystem metadata (exact amount depends on filesystem). The underlying volume when used in access_mode with a value of BlockVolume MUST match the specified capacity_range request.

Author

Thanks for the suggestion, @bswartz! I've adapted your phrasing a little to include the MAY as a counterpart to the MUST for block-mode volumes. To me, stating "will have less" is too strong: a CSI driver may take the filesystem metadata into account, or even provision a volume with a larger capacity (following the `capacity_range` parameter).

Please have a look again, thanks!

Author

Due to a GitHub outage the PR does not seem to have been updated (yet?). The latest commit with the adjusted text is at nixpanic@f319d43

Contributor

Well, I left out a more subtle detail on which I based my "will have less" language: many volumes support both filesystem and raw-block modes, and in those cases the size is expected to be the same across both. For example, if you create a filesystem volume with size 2 GiB and the SP returns a volume of nominally 2 GiB, there is an assumption that if you were to later convert that volume to a raw-block volume, it would be EXACTLY 2 GiB, just as if it had been a block volume from the beginning. Under these conditions, adding a filesystem is guaranteed to take away some space and leave you with less capacity. And if an SP wanted to estimate the filesystem overhead and create a larger block volume (for example, estimate <5% overhead and create a 2.1 GiB volume in response to a 2.0 GiB request, so the filesystem would end up with >= 2.0 GiB usable), then the SP would need to report the size of the block volume, i.e. 2.1 GiB, and the usable space would still be less than that.

I suppose there's the case of an NFS volume, which can't be converted to raw-block; there the available space could be exactly the amount reported, or even higher. These details aren't discussed in the spec anywhere, but the filesystem/raw-block duality that many drivers support has caused some applications to begin to depend on this behavior, and we should probably document it.
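The over-provisioning scenario described above can be worked through numerically. This is only an illustration: the 5% metadata-overhead estimate and the integer rounding scheme are assumptions for the example, not rules from the spec.

```python
# Worked example of the reviewer's scenario: provision extra raw space so
# the usable capacity after formatting stays >= the requested capacity.
# The 5% overhead estimate is a hypothetical figure, not a spec value.
GiB = 1024 ** 3
requested = 2 * GiB        # CO asks for 2.0 GiB of usable capacity
overhead_pct = 5           # SP's (assumed) filesystem metadata overhead

# Integer ceiling division: provisioned >= requested / 0.95 (~2.105 GiB raw).
provisioned = -(-requested * 100 // (100 - overhead_pct))
# Usable capacity once the filesystem takes its estimated share:
usable = provisioned * (100 - overhead_pct) // 100

# The SP must report the raw (block) size, so the reported size exceeds
# the usable capacity, while the usable capacity still meets the request.
print(provisioned, usable)
```

This matches the comment's conclusion: the reported size (here the ~2.1 GiB raw size) is what a later raw-block conversion would see, and the usable filesystem capacity is always somewhat below it.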

When a CreateVolume request is handled by an SP, the requested capacity
does not include any potential overhead of filesystem metadata. This can
cause confusion, as users may not be able to store data up to the
requested size of the volume.

The SP is expected to provision a volume following the capacity_range
request, but the SP may not know how much storage will be required for
filesystem metadata (this depends on the filesystem type and enabled
features). In these cases, the usable capacity of the volume will be
slightly smaller than the requested capacity.


Development

Successfully merging this pull request may close these issues.

Clarify definition of Capacity in the spec
