Skip to content

Conversation

@dnhatn
Copy link
Member

@dnhatn dnhatn commented Jun 21, 2018

Today we make sure that a 5.x index commit should have all required
commit tags in RecoveryTarget#cleanFiles method. The reason we do this
in RecoveryTarget#cleanFiles method because this is only needed in a
file-based recovery and we assume that #cleanFiles should be called in a
file-based recovery. However, this assumption is not valid if the index
is sealed (.i.e synced-flush). This incorrect assumption would prevent
users from rolling upgrade from 5.x to 6.3 if their index were sealed.

Closes #31482

Today we make sure that a 5.x index commit should have all required
commit tags in RecoveryTarget#cleanFiles method. The reason we do this
in RecoveryTarget#cleanFiles method because this is only needed in a
file-based recovery and we assume that #cleanFiles should be called in a
file-based recovery. However, this assumption is not valid if the index
is sealed (.i.e synced-flush). This incorrect assumption would prevent
users from rolling upgrade from 5.x to 6.3 if their index were sealed.
@dnhatn dnhatn added >bug >upgrade :Distributed Indexing/Recovery Anything around constructing a new shard, either from a local or a remote source. v6.4.0 v6.3.1 labels Jun 21, 2018
@dnhatn dnhatn requested review from bleskes and ywelsch June 21, 2018 14:32
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed

Copy link
Contributor

@ywelsch ywelsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some nits but looks good o.w.

createIndex(index, settings.build());
indexRandomDocuments(count, randomBoolean(), randomBoolean(),
n -> JsonXContent.contentBuilder().startObject().field("key", "value").endObject());
assertBusy(() -> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we replace this assertBusy by an ensureGreen(index)?

*/
public void testRecoverySealedIndex() throws Exception {
int count;
if (runningAgainstOldCluster) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe fold this if-else into the next one.

mappingsAndSettings.endObject();
client().performRequest("PUT", "/" + index, Collections.emptyMap(),
new StringEntity(Strings.toString(mappingsAndSettings), ContentType.APPLICATION_JSON));

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove extra line

* Tests that a synced-flushed index is correctly recovered.
* This might be an edge-case from 5.x to 6.x since a 5.x index commit does not have all required 6.x commit tags.
*/
public void testRecoverySealedIndex() throws Exception {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

testRecoverSyncedFlushIndex

}
}

public void testRecoverySealedIndex() throws Exception {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

testRecoverSyncedFlushIndex

}

public void testRecoverySealedIndex() throws Exception {
final String index = "recovery_a_sealed_index";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

synced_flush_index

.put(SETTING_ALLOCATION_MAX_RETRY.getKey(), "0"); // fail faster
createIndex(index, settings.build());
indexDocs(index, 0, randomInt(5));
assertBusy(() -> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comment as above, replace by ensureGreen()?

@dnhatn
Copy link
Member Author

dnhatn commented Jun 21, 2018

@ywelsch Thanks for looking.

Can we replace this assertBusy by an ensureGreen(index)?

Unfortunately, we have to spin here. We fire a global checkpoint sync after the last op is completed. A synced-flush will fail if the global checkpoint is happening. I added a comment to explain this.

Copy link
Contributor

@bleskes bleskes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I wonder if #31291 is relevant (it is different because it doesn't fail to recover). I think it's worth while double checking that.

* Tests that a synced-flushed index is correctly recovered.
* This might be an edge-case from 5.x to 6.x since a 5.x index commit does not have all required 6.x commit tags.
*/
public void testRecoverSyncedFlushIndex() throws Exception {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can't we fold this into testRecovery and randomly do a synced flush instead of a normal flush?

@dnhatn
Copy link
Member Author

dnhatn commented Jun 21, 2018

I wonder if #31291 is relevant (it is different because it doesn't fail to recover). I think it's worth while double checking that.

Sure, will do.

@dnhatn
Copy link
Member Author

dnhatn commented Jun 22, 2018

Thanks @ywelsch and @bleskes for reviewing!

@dnhatn dnhatn merged commit 9100a70 into elastic:6.x Jun 22, 2018
@dnhatn dnhatn deleted the fix-missing-historyuuid branch June 22, 2018 03:29
dnhatn added a commit that referenced this pull request Jun 22, 2018
…6.3 (#31506)

Today we make sure that a 5.x index commit should have all required
commit tags in RecoveryTarget#cleanFiles method. The reason we do this
in RecoveryTarget#cleanFiles method because this is only needed in a
file-based recovery and we assume that #cleanFiles should be called in a
file-based recovery. However, this assumption is not valid if the index
is sealed (.i.e synced-flush). This incorrect assumption would prevent
users from rolling upgrade from 5.x to 6.3 if their index were sealed.

Closes #31482
dnhatn added a commit that referenced this pull request Jun 23, 2018
Although the master branch does not affect by #31482, it's helpful to
have BWC tests that verify the peer recovery with a synced-flush index.
This commit adds the bwc tests from #31506 to the master branch.

Relates #31482
Relates #31506
@dnhatn
Copy link
Member Author

dnhatn commented Jun 23, 2018

Although the master branch is not affected by this issue, I think it's nice to have these BWC tests in the master branch. I added these tests to the master branch in 5115102.

dnhatn added a commit that referenced this pull request Jun 23, 2018
* 6.x:
  Avoid sending duplicate remote failed shard requests (#31313)
  Add get field mappings to High Level REST API Client Relates to #27205
  [DOCS] Updates Watcher examples for code testing (#31152)
  [DOCS] Move monitoring to docs folder (#31477)
  [DOCS] Fixes SQL docs in nav
  [DOCS] Move sql to docs
  IndexShard should not return null stats - empty stats or AlreadyCloseException if it's closed is better
  Clarify that IP range data can be specified in CIDR notation. (#31374)
  Remove some cases in FieldTypeLookupTests that are no longer relevant. (#31381)
  In NumberFieldType equals and hashCode, make sure that NumberType is taken into account. (#31514)
  fix repository update with the same settings but different type
  Revert "AwaitsFix FullClusterRestartIT#testRecovery"
  Upgrade to Lucene 7.4.0. (#31529)
  Avoid deprecation warning when running the ML datafeed extractor. (#31463)
  Retry synced-flush in FullClusterRestartIT#testRecovery
  Allow multiple unicast host providers (#31509)
  [ML] Add ML filter update API (#31437)
  AwaitsFix FullClusterRestartIT#testRecovery
  Fix missing historyUUID in peer recovery when rolling upgrade 5.x to 6.3 (#31506)
  Remove QueryCachingPolicy#ALWAYS_CACHE (#31451)
  Rename createNewTranslog to fileBasedRecovery (#31508)
  [DOCS] Add code snippet testing in more ML APIs (#31339)
  [DOCS] Remove fixed file from build.gradle
  [DOCS] Creates field and document level security overview (#30937)
  Test: Skip assertion on windows
  [DOCS] Move migration APIs to docs (#31473)
  Add a known issue for upgrading from 5.x to 6.3.0 (#31501)
  Return transport addresses from UnicastHostsProvider (#31426)
  Add Delete Snapshot High Level REST API
  Reload secure settings for plugins (#31481)
  [DOCS] Fix JDBC Maven client group/artifact ID
colings86 pushed a commit that referenced this pull request Jun 25, 2018
Although the master branch does not affect by #31482, it's helpful to
have BWC tests that verify the peer recovery with a synced-flush index.
This commit adds the bwc tests from #31506 to the master branch.

Relates #31482
Relates #31506
@bleskes
Copy link
Contributor

bleskes commented Jun 25, 2018

I added these tests to the master branch in 5115102.

+1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

>bug :Distributed Indexing/Recovery Anything around constructing a new shard, either from a local or a remote source. >upgrade v6.3.1 v6.4.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants