Collect and display execution metadata for ES|QL cross cluster searches #112595
Merged
Changes from 52 commits
70 commits
6ebc396
Collect and display execution metadata for ES|QL cross cluster searches
quux00 512fdec
Slight improvements to EsqlExecutionInfo
quux00 39428fa
Removed changes to EsqlQueryResponse, spending too long getting the E…
quux00 892cd99
Starting threading EsqlExecutionInfo into PlanExecutor and EsqlSesssi…
quux00 fb10109
Have the initial swap-in of cluster info into EsqlExecutionInfo in Es…
quux00 797cc8c
Added EsqlExecutionInfo to IndexResolver. Enrich pathway passes in nu…
quux00 fa7bbb0
ComputeListener updated to the version that has proper remote/local s…
quux00 71a33ed
Added new tests to ComputeListenerTests
quux00 ab347a6
Added ExecutionInfo to Result obj (used in ComputeService/EsqlSession)
quux00 c39111b
update ExecutionInfo with shard counts in ComputeService.lookupDataNodes
quux00 1a3a7f8
Migrated CrossClustersQueryIT to new setup format, but can't add exec…
quux00 544aaeb
Added CountDown to acquireComputeForDataNodes - that allows SUCCESSFU…
quux00 ca2de85
Fixed failing REST and qa tests to account for the new 'took' time in…
quux00 f839132
Fixed bug where CountDown in ComputeService can be initialized with 0…
quux00 3f3139b
More qa and bwc test fixes based on what failed in latest ci build
quux00 e090437
Next round of qa and bwc test fixes based on what failed in latest ci…
quux00 b2b2542
Fix failing test in EsqlSecurityIT
quux00 5e7876e
Added _cluster/details to the EsqlQueryResponse XContent for cross-cl…
quux00 3e16fbb
Fixed test failure in esql/ccq/MultiClustersIT
quux00 ed5b9db
Updated end user docs with info about top level took time and _cluste…
quux00 661a243
Added EsqlExecutionInfo to equals and hashCode method of EsqlQueryRes…
quux00 9535bdd
Removed skip_unavilable=true filter in IndexResolver - all clusters a…
quux00 699b16a
Moved isRemoteUnavailableException to ExceptionsHelper
quux00 228eed2
Added equals and hashCode to EsqlExecutionInfo.Cluster object.
quux00 fa9c7c4
Minor tweak to esql-across-clusters.asciidoc
quux00 c51719e
Improvements to esql-across-clusters.asciidoc
quux00 d365c37
Update docs/changelog/112595.yaml
quux00 8688dbf
Added questions about took time headers to EsqlResponseListener - pos…
quux00 5b93774
Merge remote-tracking branch 'elastic/main' into esql/ccs-execution-i…
quux00 449e1a7
Merge remote-tracking branch 'elastic/main' into esql/ccs-execution-i…
quux00 0083ae7
Merge remote-tracking branch 'elastic/main' into esql/ccs-execution-i…
quux00 5462d6b
PR feedback with focus on end user docs fixes, removing some out-of-d…
quux00 5f27325
Merge remote-tracking branch 'elastic/main' into esql/ccs-execution-i…
quux00 9e77c28
Additional PR feedback changes - test adjustments, remove 'set' and '…
quux00 fd6d3bf
Merge remote-tracking branch 'elastic/main' into esql/ccs-execution-i…
quux00 6e87174
Now tracking took in nanos, not millis (but XContent still displays i…
quux00 b474fc7
Merge remote-tracking branch 'elastic/main' into esql/ccs-execution-i…
quux00 1649962
Merge remote-tracking branch 'elastic/main' into esql/ccs-execution-i…
quux00 20c9356
Changed ComputeResponse to de/serialize with read/writeOptionalTimeValue
quux00 e6aa92a
EsqlResponseListener now preferentially uses the took time in the Esq…
quux00 59d1480
Merge remote-tracking branch 'elastic/main' into esql/ccs-execution-i…
quux00 8bb1b7f
Merge remote-tracking branch 'elastic/main' into esql/ccs-execution-i…
quux00 6323cdf
Modified esql-across-clusters to run the new queries I added; but JSO…
quux00 4875a66
Merge remote-tracking branch 'elastic/main' into esql/ccs-execution-i…
quux00 24e0c02
Merge remote-tracking branch 'elastic/main' into esql/ccs-execution-i…
quux00 940ef22
Merge remote-tracking branch 'elastic/main' into esql/ccs-execution-i…
quux00 617cbec
Merge remote-tracking branch 'elastic/main' into esql/ccs-execution-i…
quux00 c45d181
Removed code that lists fully resolved indices in the _clusters/detai…
quux00 13a34de
Merge remote-tracking branch 'elastic/main' into esql/ccs-execution-i…
quux00 9ac5746
Code cleanup - remove commented out code in IndexResolverTests
quux00 3afb7a1
PR feedback: Moved logic for unavailable/missing clusters to EsqlSession
quux00 b118406
Merge remote-tracking branch 'elastic/main' into esql/ccs-execution-i…
quux00 fc53eb7
PR feedback: I removed acquireCCSCompute and acquireComputeForDatanod…
quux00 d50658c
Merge remote-tracking branch 'elastic/main' into esql/ccs-execution-i…
quux00 dc467ac
PR feedback: Created new intf IndicesExpressionResolver and have Remo…
quux00 711c1f8
Merge remote-tracking branch 'elastic/main' into esql/ccs-execution-i…
quux00 8e5f170
checkstyle fix
quux00 838e6a9
Merge remote-tracking branch 'elastic/main' into esql/ccs-execution-i…
quux00 77cc107
Moved parseClusterAlias from IndexResolver to RemoteClusterAware and …
quux00 0e71453
Renamed IndicesExpressionResolver intf to IndicesExpressionGrouper.
quux00 a69f3db
Merge remote-tracking branch 'elastic/main' into esql/ccs-execution-i…
quux00 a7efbec
PR feedback: Added javadoc to ComputeListener, removed leftover debug…
quux00 aa8bbaa
Fixed bug where SKIPPED status for unavailable clusters from field-ca…
quux00 e5e45b5
Merge remote-tracking branch 'elastic/main' into esql/ccs-execution-i…
quux00 9a304c2
Merge remote-tracking branch 'elastic/main' into esql/ccs-execution-i…
quux00 d79af98
PR feedback
quux00 826aab7
Merge remote-tracking branch 'elastic/main' into esql/ccs-execution-i…
quux00 ec99687
Changed status to SKIPPED when no matching index found for remote clu…
quux00 02092e9
Merge remote-tracking branch 'elastic/main' into esql/ccs-execution-i…
quux00 8569bfa
Merge remote-tracking branch 'elastic/main' into esql/ccs-execution-i…
quux00
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
pr: 112595 | ||
summary: Collect and display execution metadata for ES|QL cross cluster searches | ||
area: ES|QL | ||
type: enhancement | ||
issues: | ||
- 112402 |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -85,7 +85,7 @@ POST /_security/role/remote1 | |
"privileges": [ "read","read_cross_cluster" ], <4> | ||
"clusters" : ["my_remote_cluster"] <5> | ||
} | ||
], | ||
], | ||
"remote_cluster": [ <6> | ||
{ | ||
"privileges": [ | ||
|
@@ -100,15 +100,23 @@ POST /_security/role/remote1 | |
---- | ||
|
||
<1> The `cross_cluster_search` cluster privilege is required for the _local_ cluster. | ||
<2> Typically, users will have permissions to read both local and remote indices. However, for cases where the role is intended to ONLY search the remote cluster, the `read` permission is still required for the local cluster. To provide read access to the local cluster, but disallow reading any indices in the local cluster, the `names` field may be an empty string. | ||
<3> The indices allowed read access to the remote cluster. The configured <<security-api-create-cross-cluster-api-key,cross-cluster API key>> must also allow this index to be read. | ||
<4> The `read_cross_cluster` privilege is always required when using {esql} across clusters with the API key based security model. | ||
<2> Typically, users will have permissions to read both local and remote indices. However, for cases where the role | ||
is intended to ONLY search the remote cluster, the `read` permission is still required for the local cluster. | ||
To provide read access to the local cluster, but disallow reading any indices in the local cluster, the `names` | ||
field may be an empty string. | ||
<3> The indices allowed read access to the remote cluster. The configured | ||
<<security-api-create-cross-cluster-api-key,cross-cluster API key>> must also allow this index to be read. | ||
<4> The `read_cross_cluster` privilege is always required when using {esql} across clusters with the API key based | ||
security model. | ||
<5> The remote clusters to which these privileges apply. | ||
This remote cluster must be configured with a <<security-api-create-cross-cluster-api-key,cross-cluster API key>> and connected to the remote cluster before the remote index can be queried. | ||
This remote cluster must be configured with a <<security-api-create-cross-cluster-api-key,cross-cluster API key>> | ||
and connected to the remote cluster before the remote index can be queried. | ||
Verify connection using the <<cluster-remote-info, Remote cluster info>> API. | ||
<6> Required to allow remote enrichment. Without this, the user cannot read from the `.enrich` indices on the remote cluster. The `remote_cluster` security privilege was introduced in version *8.15.0*. | ||
<6> Required to allow remote enrichment. Without this, the user cannot read from the `.enrich` indices on the | ||
remote cluster. The `remote_cluster` security privilege was introduced in version *8.15.0*. | ||
|
||
You will then need a user or API key with the permissions you created above. The following example API call creates a user with the `remote1` role. | ||
You will then need a user or API key with the permissions you created above. The following example API call creates | ||
a user with the `remote1` role. | ||
|
||
[source,console] | ||
---- | ||
|
@@ -119,11 +127,13 @@ POST /_security/user/remote_user | |
} | ||
---- | ||
|
||
Remember that all cross-cluster requests from the local cluster are bound by the cross cluster API key’s privileges, which are controlled by the remote cluster's administrator. | ||
Remember that all cross-cluster requests from the local cluster are bound by the cross cluster API key’s privileges, | ||
which are controlled by the remote cluster's administrator. | ||
|
||
[TIP] | ||
==== | ||
Cross cluster API keys created in versions prior to 8.15.0 will need to replaced or updated to add the new permissions required for {esql} with ENRICH. | ||
Cross cluster API keys created in versions prior to 8.15.0 will need to be replaced or updated to add the new permissions | ||
required for {esql} with ENRICH. | ||
==== | ||
|
||
[discrete] | ||
|
@@ -174,6 +184,194 @@ FROM *:my-index-000001 | |
| LIMIT 10 | ||
---- | ||
|
||
[discrete] | ||
[[ccq-cluster-details]] | ||
==== Cross-cluster metadata | ||
|
||
ES|QL {ccs} responses include metadata about the search on each cluster when the response format is JSON. | ||
Here we show an example using the async search endpoint. {ccs-cap} metadata is also present in the synchronous | ||
search endpoint. | ||
|
||
[source,console] | ||
---- | ||
POST /_query/async?format=json | ||
{ | ||
"query": """ | ||
FROM my-index-000001,cluster_one:my-index-000001,cluster_two:my-index* | ||
| STATS COUNT(http.response.status_code) BY user.id | ||
| LIMIT 2 | ||
""" | ||
} | ||
---- | ||
// TEST[setup:my_index] | ||
// TEST[s/cluster_one:my-index-000001,cluster_two:my-index//] | ||
|
||
Which returns: | ||
|
||
[source,console-result] | ||
---- | ||
{ | ||
"is_running": false, | ||
"took": 42, <1> | ||
"columns" : [ | ||
{ | ||
"name" : "COUNT(http.response.status_code)", | ||
"type" : "long" | ||
}, | ||
{ | ||
"name" : "user.id", | ||
"type" : "keyword" | ||
} | ||
], | ||
"values" : [ | ||
[4, "elkbee"], | ||
[1, "kimchy"] | ||
], | ||
"_clusters": { <2> | ||
"total": 3, | ||
"successful": 3, | ||
"running": 0, | ||
"skipped": 0, | ||
"partial": 0, | ||
"failed": 0, | ||
"details": { <3> | ||
"(local)": { <4> | ||
"status": "successful", | ||
"indices": "blogs", | ||
"took": 36, <5> | ||
"_shards": { <6> | ||
"total": 13, | ||
"successful": 13, | ||
"skipped": 0, | ||
"failed": 0 | ||
} | ||
}, | ||
"cluster_one": { | ||
"status": "successful", | ||
"indices": "cluster_one:my-index-000001", | ||
"took": 38, | ||
"_shards": { | ||
"total": 4, | ||
"successful": 4, | ||
"skipped": 0, | ||
"failed": 0 | ||
} | ||
}, | ||
"cluster_two": { | ||
"status": "successful", | ||
"indices": "cluster_two:my-index-000001", <7> | ||
"took": 41, | ||
"_shards": { | ||
"total": 18, | ||
"successful": 18, | ||
"skipped": 1, | ||
"failed": 0 | ||
} | ||
} | ||
} | ||
} | ||
} | ||
---- | ||
// TEST[skip: cross-cluster testing env not set up] | ||
|
||
<1> How long the entire search (across all clusters) took, in milliseconds. | ||
<2> This section of counters shows all possible cluster search states and how many cluster | ||
searches are currently in that state. The clusters can have one of the following statuses: *running*, | ||
*successful* (searches on all shards were successful), *skipped* (the search | ||
failed on a cluster marked with `skip_unavailable`=`true`) or *failed* (the search | ||
failed on a cluster marked with `skip_unavailable`=`false`). | ||
<3> The `_clusters/details` section shows metadata about the search on each cluster. | ||
<4> If you included indices from the local cluster you sent the request to in your {ccs}, | ||
it is identified as "(local)". | ||
<5> How long (in milliseconds) the search took on each cluster. This can be useful to determine | ||
which clusters have slower response times than others. | ||
<6> The shard details for the search on that cluster, including a count of shards that were | ||
skipped due to the can-match phase indicating it had no matching data so it did not need | ||
to be included in the full ES|QL query. | ||
<7> The index expression supplied by the user. If you provide a wildcard such as `my-index*`, | ||
this section will show the resolved index name(s) here, unless no matching indices could | ||
be found on that cluster, in which case the wildcard expression will be retained here. | ||
|
||
|
||
The cross-cluster metadata can be used to determine whether any data came back from a cluster. | ||
For instance, in the query below, you can see that the wildcard expression for `cluster_two` did not | ||
resolve to a concrete index (or indices) and that the total number of shards searched is | ||
zero. This indicates that no matching index was found on that cluster. But since the other | ||
cluster did have a matching index, the search did not return an error, but instead | ||
returned all the matching data it could find. | ||
|
||
|
||
[source,console] | ||
---- | ||
POST /_query/async?format=json | ||
{ | ||
"query": """ | ||
FROM cluster_one:my-index*,cluster_two:logs* | ||
| STATS COUNT(http.response.status_code) BY user.id | ||
| LIMIT 2 | ||
""" | ||
} | ||
---- | ||
// TEST[continued] | ||
// TEST[s/cluster_one:my-index\*,cluster_two:logs\*/my-index-000001/] | ||
|
||
Which returns: | ||
|
||
[source,console-result] | ||
---- | ||
{ | ||
"is_running": false, | ||
"took": 55, | ||
"columns": [ | ||
... // not shown | ||
], | ||
"values": [ | ||
... // not shown | ||
], | ||
"_clusters": { | ||
"total": 2, | ||
"successful": 2, | ||
"running": 0, | ||
"skipped": 0, | ||
"partial": 0, | ||
"failed": 0, | ||
"details": { | ||
"cluster_one": { | ||
"status": "successful", | ||
"indices": "cluster_one:my-index-000001", | ||
"took": 38, | ||
"_shards": { | ||
"total": 4, | ||
"successful": 4, | ||
"skipped": 0, | ||
"failed": 0 | ||
} | ||
}, | ||
"cluster_two": { | ||
"status": "successful", <1> | ||
"indices": "cluster_two:logs*", <2> | ||
"took": 0, | ||
"_shards": { | ||
"total": 0, <3> | ||
"successful": 0, | ||
"skipped": 0, | ||
"failed": 0 | ||
} | ||
} | ||
} | ||
} | ||
} | ||
---- | ||
// TEST[skip: cross-cluster testing env not set up] | ||
|
||
<1> This search is still marked as successful, even though no data was searched. | ||
<2> Since there were no matching indices for the wildcard pattern provided, the original | ||
index expression provided by the user is retained here. | ||
<3> Indicates that no shards were searched (due to not having any matching indices). | ||
|
||
|
||
|
||
|
||
@naj-h and @tylerperk - Please review proposed end user docs changes.
||
[discrete] | ||
[[ccq-enrich]] | ||
==== Enrich across clusters | ||
|
@@ -331,8 +529,7 @@ setting. As a result, if a remote cluster specified in the request is | |
unavailable or failed, {ccs} for {esql} queries will fail regardless of the setting. | ||
|
||
We are actively working to align the behavior of {ccs} for {esql} with other | ||
{ccs} APIs. This includes providing detailed execution information for each cluster | ||
in the response, such as execution time, selected target indices, and shards. | ||
{ccs} APIs. | ||
|
||
[discrete] | ||
[[ccq-during-upgrade]] | ||
|
The local cluster should be available, right? Could we remove the multi-cluster output so we get the assertion that the shape is pretty close?
I spent several hours trying but unless you know of a trick to do clever multi-line matching I don't see how this is possible. Among the things I tried was adding "m" to the end of the matcher to indicate multi-line matching (as in Perl matching), but that doesn't work. Mostly I just get failed runs with no information as to what is wrong.
Plus I'm not really sure it's worth it? The whole point of this section is to show the `_clusters/details` section so testing against a non-CCS set up doesn't seem useful. We probably need another ticket to enable the multi-cluster testing setup that search-across-clusters.asciidoc uses, as that was not set up for this test.
👍
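For readers who want to act on the `_clusters/details` section documented in this PR, here is a minimal client-side sketch in Python. It assumes the response has already been parsed into a dict with the shape shown in the docs examples above. The helper names `clusters_without_data` and `slowest_cluster` are hypothetical and not part of Elasticsearch or any client library, and the sample data is trimmed from the second docs example.

```python
# Hypothetical helpers for inspecting ES|QL cross-cluster metadata.
# They only assume the JSON shape shown in the docs diff above.

def clusters_without_data(response):
    """Return aliases of clusters whose search touched zero shards."""
    details = response.get("_clusters", {}).get("details", {})
    return [
        alias
        for alias, info in details.items()
        if info.get("_shards", {}).get("total", 0) == 0
    ]


def slowest_cluster(response):
    """Return the alias with the largest per-cluster 'took', or None."""
    details = response.get("_clusters", {}).get("details", {})
    if not details:
        return None
    return max(details, key=lambda alias: details[alias].get("took", 0))


# Sample trimmed from the second example in the docs diff.
sample = {
    "took": 55,
    "_clusters": {
        "total": 2,
        "successful": 2,
        "details": {
            "cluster_one": {
                "status": "successful",
                "indices": "cluster_one:my-index-000001",
                "took": 38,
                "_shards": {"total": 4, "successful": 4, "skipped": 0, "failed": 0},
            },
            "cluster_two": {
                "status": "successful",
                "indices": "cluster_two:logs*",
                "took": 0,
                "_shards": {"total": 0, "successful": 0, "skipped": 0, "failed": 0},
            },
        },
    },
}

print(clusters_without_data(sample))  # ['cluster_two']
print(slowest_cluster(sample))        # cluster_one
```

Checking the shard total rather than `status` matters here because, as the docs note, a cluster with no matching index is still reported as successful in this example.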