Add link to MAX_RETRY allocation explain message #113657
base: main
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
|
@@ -162,7 +162,7 @@ node. | |||||
====== Maximum number of retries exceeded | ||||||
|
||||||
The following response contains an allocation explanation for an unassigned | ||||||
primary shard that has reached the maximum number of allocation retry attempts. | ||||||
primary shard that has reached the maximum number of allocation retry attempts. | ||||||
|
||||||
[source,js] | ||||||
---- | ||||||
|
@@ -195,17 +195,20 @@ primary shard that has reached the maximum number of allocation retry attempts. | |||||
{ | ||||||
"decider": "max_retry", | ||||||
"decision" : "NO", | ||||||
"explanation": "shard has exceeded the maximum number of retries [5] on failed allocation attempts - manually call [/_cluster/reroute?retry_failed=true] to retry, [unassigned_info[[reason=ALLOCATION_FAILED], at[2024-07-30T21:04:12.166Z], failed_attempts[5], failed_nodes[[mEKjwwzLT1yJVb8UxT6anw]], delayed=false, details[failed shard on node [mEKjwwzLT1yJVb8UxT6anw]: failed recovery, failure RecoveryFailedException], allocation_status[deciders_no]]]" | ||||||
"explanation": "shard has exceeded the maximum number of retries [5] on failed allocation attempts - manually call [POST /_cluster/reroute?retry_failed=true] to retry, and for more information, see https://www.elastic.co/guide/en/elasticsearch/reference/current/cluster-allocation-explain.html#_maximum_number_of_retries_exceeded, [unassigned_info[[reason=ALLOCATION_FAILED], at[2024-07-30T21:04:12.166Z], failed_attempts[5], failed_nodes[[mEKjwwzLT1yJVb8UxT6anw]], delayed=false, details[failed shard on node [mEKjwwzLT1yJVb8UxT6anw]: failed recovery, failure RecoveryFailedException], allocation_status[deciders_no]]]" | ||||||
Until we solidify the URL, I vote for leaving it off. The GitHub suggestion to apply Dave's comment would, I believe, appear as:
Suggested change
|
||||||
} | ||||||
] | ||||||
} | ||||||
] | ||||||
} | ||||||
---- | ||||||
// NOTCONSOLE | ||||||
|
||||||
If decider message indicates a transient allocation issue, use | ||||||
<<cluster-reroute,the cluster reroute API>> to retry allocation. | ||||||
This message indicates that the cluster was previously unable to | ||||||
allocate this shard and chose to put a hold on further attempts. | ||||||
This is done to avoid burdening the cluster with repeated requests that will fail. | ||||||
If no other `no` decisions are present, then the transient allocation issue | ||||||
that caused these failures has most likely been resolved, and you can use the | ||||||
<<cluster-reroute,the cluster reroute API>> to retry allocation. | ||||||
Comment on lines +209 to +211
I think this will be confusing, there are normally always some Also I'd rather we used the imperative voice: "use the reroute API" rather than just suggesting "you can ...". Finally there's a duplicate
Brainstorming, I might say
|
||||||
|
||||||
====== No valid shard copy | ||||||
|
||||||
|
@@ -334,7 +337,7 @@ queued to allocate but currently waiting on other queued shards. | |||||
---- | ||||||
// NOTCONSOLE | ||||||
|
||||||
This is a transient message that might appear when a large amount of shards are allocating. | ||||||
This is a transient message that might appear when a large amount of shards are allocating. | ||||||
|
||||||
===== Assigned shard | ||||||
|
||||||
|
@@ -437,7 +440,7 @@ cluster balance. | |||||
===== No arguments | ||||||
|
||||||
If you call the API with no arguments, {es} retrieves an allocation explanation | ||||||
for an arbitrary unassigned primary or replica shard, returning any unassigned primary shards first. | ||||||
for an arbitrary unassigned primary or replica shard, returning any unassigned primary shards first. | ||||||
|
||||||
[source,console] | ||||||
---- | ||||||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -43,5 +43,6 @@ | |
"MAX_SHARDS_PER_NODE": "size-your-shards.html#troubleshooting-max-shards-open", | ||
"FLOOD_STAGE_WATERMARK": "fix-watermark-errors.html", | ||
"X_OPAQUE_ID": "api-conventions.html#x-opaque-id", | ||
"FORMING_SINGLE_NODE_CLUSTERS": "modules-discovery-bootstrap-cluster.html#modules-discovery-bootstrap-cluster-joining" | ||
"FORMING_SINGLE_NODE_CLUSTERS": "modules-discovery-bootstrap-cluster.html#modules-discovery-bootstrap-cluster-joining", | ||
"ALLOCATION_EXPLAIN_MAX_RETRY": "cluster-allocation-explain.html#_maximum_number_of_retries_exceeded" | ||
Please add a fixed
See #113667 which forbids this.
||
} |
The API call in the message is actually
POST /_cluster/reroute?retry_failed&metric=none
(see org.elasticsearch.cluster.routing.allocation.decider.MaxRetryAllocationDecider#RETRY_FAILED_API).
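For anyone following the wording discussion, here is a rough Python sketch of the check the proposed paragraph describes: look for nodes where `max_retry` is the only decider returning `NO`, then prepare the retry call. It is only an illustration under stated assumptions. The field names `node_allocation_decisions` and `node_id` are assumed from the usual allocation explain response shape (the diff only shows the `decider`, `decision`, and `explanation` fields), the path is the one quoted in the comment above, and all helper names are hypothetical. The request is built but never sent.

```python
import urllib.request

# Path as quoted in the review comment above; an assumption, verify it for your version.
RETRY_FAILED_PATH = "/_cluster/reroute?retry_failed&metric=none"


def no_deciders(explain):
    """Map each node id to the set of decider names that returned NO."""
    result = {}
    for node in explain.get("node_allocation_decisions", []):  # assumed field name
        names = {
            d["decider"]
            for d in node.get("deciders", [])
            if d.get("decision") == "NO"
        }
        result[node.get("node_id", "<unknown>")] = names  # assumed field name
    return result


def only_max_retry_blocks(explain):
    """True if at least one node is blocked by max_retry and by nothing else."""
    return any(names == {"max_retry"} for names in no_deciders(explain).values())


def build_retry_request(base_url):
    """Build (but do not send) the POST request that retries failed allocations."""
    url = base_url.rstrip("/") + RETRY_FAILED_PATH
    return urllib.request.Request(url, method="POST")
```

A caller would pass the parsed JSON body of an allocation explain response to `only_max_retry_blocks` and, if it returns true, send the request from `build_retry_request` with whatever HTTP client and authentication the cluster requires.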