Skip to content

Conversation

@jasontedor
Copy link
Member

This commit introduces a background sync for retention leases. The idea here is that we do a heavyweight sync when adding a new retention lease, and then periodically we want to background sync any retention lease renewals to the replicas. As long as the background sync interval is significantly lower than the extended lifetime of a retention lease, it is okay if from time to time a replica misses a sync (it will still have an older version of the lease that is retaining more data as we assume that renewals do not decrease the retaining sequence number). There are two follow-ups that will come after this commit. The first is to address the fact that we have not adapted the should periodically flush logic to possibly flush the retention leases. We want to do something like flush if we have not flushed in the last five minutes and there are renewed retention leases since the last time that we flushed. An additional follow-up will remove the syncing of retention leases when a retention lease expires. Today this sync could be invoked in the background by a merge operation. Rather, we will move the syncing of retention lease expiration to be done under the background sync. The background sync will use the heavyweight sync (write action) if a lease has expired, and will use the lightweight background sync (replication action) otherwise.

Relates #37165

This commit introduces a background sync for retention leases. The idea
here is that we do a heavyweight sync when adding a new retention lease,
and then periodically we want to background sync any retention lease
renewals to the replicas. As long as the background sync interval is
significantly lower than the extended lifetime of a retention lease, it
is okay if from time to time a replica misses a sync (it will still have
an older version of the lease that is retaining more data as we assume
that renewals do not decrease the retaining sequence number). There are
two follow-ups that will come after this commit. The first is to address
the fact that we have not adapted the should periodically flush logic to
possibly flush the retention leases. We want to do something like flush
if we have not flushed in the last five minutes and there are renewed
retention leases since the last time that we flushed. An additional
follow-up will remove the syncing of retention leases when a retention
lease expires. Today this sync could be invoked in the background by a
merge operation. Rather, we will move the syncing of retention lease
expiration to be done under the background sync. The background sync
will use the heavyweight sync (write action) if a lease has expired, and
will use the lightweight background sync (replication action) otherwise.
@jasontedor jasontedor added >enhancement v7.0.0 :Distributed Indexing/Distributed A catch all label for anything in the Distributed Indexing Area. Please avoid if you can. v6.7.0 labels Feb 2, 2019
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed

Copy link
Member

@dnhatn dnhatn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great. I left a point to discuss.

Copy link
Member

@dnhatn dnhatn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left two minor comments. LGTM.

protected PrimaryResult<Request, ReplicationResponse> shardOperationOnPrimary(final Request request, final IndexShard primary) {
Objects.requireNonNull(request);
Objects.requireNonNull(primary);
primary.afterWriteOperation();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if this call is useful for it does not change translog nor Lucene.

Objects.requireNonNull(request);
Objects.requireNonNull(replica);
replica.updateRetentionLeasesOnReplica(request.getRetentionLeases());
replica.afterWriteOperation();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here - I am not sure if this call is useful for it does not change translog nor Lucene.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The goal is to trigger periodic flush. Let us discuss the exact mechanics of this in a follow-up as that work develops.

@jasontedor jasontedor merged commit 625d37a into elastic:master Feb 4, 2019
@jasontedor jasontedor deleted the retention-lease-background-sync branch February 4, 2019 15:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Distributed Indexing/Distributed A catch all label for anything in the Distributed Indexing Area. Please avoid if you can. >enhancement v6.7.0 v7.0.0-beta1

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants