Skip to content

Conversation

@petr-zededa
Copy link
Contributor

We should add an ability to calculate remaining space based on CurrentSize of volume of particular app we try to purge/update inside volumemgr.
On step of volume creation inside volumemgr we do not now for which application it was created (or it is not linked to any application). So, it seems reasonable to add ApplicationID field into Volume Status/Config to have opportunity to determinate for which app we use it.

There are open questions for me:

  • may the same volume be linked to the several AppInstances at the same time
  • what to do in the situation when we add new volume (not the first one) to the existing application. Now it is quite difficult to determinate where we are on the step of Volume creation inside volumemgr. We do not know anything about expected state of app inside this step.

cc @zed-rishabh

Signed-off-by: Petr Fedchenkov [email protected]

@giggsoff giggsoff force-pushed the volume-space-calculation branch from 95c48e2 to 2713e1a Compare March 4, 2021 12:27
@rvs
Copy link
Contributor

rvs commented Mar 4, 2021

Hey @giggsoff -- it would be really helpful if you can first describe one (or a few) scenarios that you're trying to address here. The reason I'm asking is b/c if you're looking for a bullet-proof accounting -- we still can't do that reliably. However, it doesn't mean we shouldn't address various usecases that can be easy to adress.

So... please describe your scenarios first.

Copy link
Contributor

@eriknordmark eriknordmark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It makes sense to add some description, but I think I understand the interesting case.

However, in general a read-only volume can be shared by multiple app instances, and an app instance can have multiple volumes, and finally when purging some subset of the volumes will be purged. That is controlled by the controller.
Thus to do the accounting well I think the controller needs to indicate in the volume API some "intended to replace volume UUID X". That way you can tell how much more space will be needed during the transient (while downloading and creating the replacement) and once the replacement has taken place.

zedmanager could try to infer this replacement and add the UUID to the VolumeRefConfig instead of getting it from the controller. But that is only useful if volumemgr seens the VolumeRefConfig before it sees the VolumeConfig and that typically doesn't happen.

@petr-zededa
Copy link
Contributor Author

Sorry for delayed response.
Actually, here I try to leverage the simply process that I can reproduce with Eden: I increase generation count of volume and contentTree without modifications of data (volume content). I cannot reproduce it with Zedcontrol now. Can I modify Edge app inside marketplace and will it be updated during purge&&restart?
In case of multiple attachments my assumption is wrong. VolumeConfig now comes into volumemgr before VolumeRefConfig, but we can try to modify the sequence in zedagent and add "app_uuid" into VolumeRefConfig.
"Volume_UUID_1 is replacement for Volume_UUID_2" seems quite tricky. In general we have set of volumes (which may be not linked to any apps) and set of refs (which can be modified in any proportions, so it may be several volumes as replacement for the old one, or one additional volume and one "replacement", or just additional volume).

@eriknordmark
Copy link
Contributor

Actually, here I try to leverage the simply process that I can reproduce with Eden: I increase generation count of volume and contentTree without modifications of data (volume content). I cannot reproduce it with Zedcontrol now. Can I modify Edge app inside marketplace and will it be updated during purge&&restart?

Yes, but when you do that I think the controller sends a new volume UUID.

In case of multiple attachments my assumption is wrong. VolumeConfig now comes into volumemgr before VolumeRefConfig, but we can try to modify the sequence in zedagent and add "app_uuid" into VolumeRefConfig.

No, changing when zedagent publishes things doesn't guarantee anything about when the subscribers sees it; the system is asynchronous. The order is undefined. For instance, today if volumemgr is busy it might see the VolumeRefConfig before the VolumeConfig. If interlock is needed it must be explicit.

"Volume_UUID_1 is replacement for Volume_UUID_2" seems quite tricky. In general we have set of volumes (which may be not linked to any apps) and set of refs (which can be modified in any proportions, so it may be several volumes as replacement for the old one, or one additional volume and one "replacement", or just additional volume).

Here the semantics of "replacement" is solely for the purpose of calculating disk usage. for the purge operation If an app with 2 volumes is replaced by an app with 4 volumes then the controller can decide what it wants to label as replacing what, and the calculation will still be correct.
What is important is that during the transient of the operation the app will use sum_current_size(current_volumes) + sum_download_size(new_volumes) and once done it will use sum_current_size(volumes) with a max of sum_qcow2_max(volumes)

@giggsoff giggsoff force-pushed the volume-space-calculation branch from 2713e1a to f93782c Compare March 10, 2021 16:01
@petr-zededa
Copy link
Contributor Author

In the last version I add Stale field into VolumeConfig struct to indicate if the particular volume has no apps pointing on it in the config from controller. I use this field in the check of getRemainingDiskSpace to take into account only current size of volume.
If I correctly understand current logic inside controller, it will send appImage with refs to the new volumes and then delete old ones. So, here we will mark Stale volumes and not account their maximum sizes.

Copy link
Contributor

@eriknordmark eriknordmark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some nits about naming and comments.

But this new flow should work in terms of the overall flow and risk of race conditions between zedagent, zedmanager, and volumemgr.
However, there is one detail in the processing of the updates in volumemgr which can make the approach fail:
When zedagent publish a modification to VolumeConfig1 to set Stale and publishes a new VolumeConfig2 without setting stale, there is no guarantee on the order those are processed in pubsub hence in volumemgr. If volumemgr processes the new VolumeConfig2 before the change to VolumeConfig1 it will not know of the Stale setting. In that case will it not fail to take into account the Stale?
AFAICT if volumemgr is busy I think this will fail 50% of the time - pubsub calculates the set of keys which have been added/modified/deleted and that is based on the order in a golang map.

Signed-off-by: Petr Fedchenkov <[email protected]>
Signed-off-by: Petr Fedchenkov <[email protected]>
@giggsoff giggsoff force-pushed the volume-space-calculation branch from f93782c to 6ffa397 Compare March 24, 2021 11:11
@petr-zededa petr-zededa changed the title [WIP] taking into account purgeable space on volume creation taking into account purgeable space on volume creation Mar 24, 2021
Copy link
Contributor

@eriknordmark eriknordmark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - let it test.
We need to verify that the purging tests in ring1 are indeed run with this.
Also, we probably do not have a test case when 2x max size would exceed the disk size.

Signed-off-by: Petr Fedchenkov <[email protected]>
@giggsoff giggsoff force-pushed the volume-space-calculation branch from c891b37 to 97e7b80 Compare March 25, 2021 10:35
@eriknordmark
Copy link
Contributor

@petr-zededa how are we testing this to make sure purge doesn't assume the 2x max as prior to the fix? Do we need new tests?

@petr-zededa
Copy link
Contributor Author

@petr-zededa how are we testing this to make sure purge doesn't assume the 2x max as prior to the fix? Do we need new tests?

I am working on basic support for modify of apps inside Eden lf-edge/eden#545

@eriknordmark eriknordmark merged commit 7474a7d into lf-edge:master Mar 26, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants