[8.x] Collect and display execution metadata for ES|QL cross cluster searches (#112595) #113820

Merged: 1 commit, Sep 30, 2024

Commits on Sep 30, 2024

  1. Collect and display execution metadata for ES|QL cross cluster searches (elastic#112595)
    
    Enhance ES|QL responses to include information about `took` time (search latency), shards, and
    clusters against which the query was executed.
    
    The goal of this PR is to begin to provide parity between the metadata displayed for 
    cross-cluster searches in _search and ES|QL.
    
    This PR adds the following features:
    - add an overall `took` time to all ES|QL query responses. "All" here means async
    search, sync search, local-only and cross-cluster searches, so it goes beyond
    just CCS.
    - add `_clusters` metadata to the final response for cross-cluster searches, for both
    async and sync search (see example below)
    - track and report counts of skipped shards from the can_match (SearchShards API)
    phase of ES|QL processing
    - mark clusters as skipped if they cannot be connected to (during the field-caps
    phase of processing)
    
    Out of scope for this PR:
    - honoring the `skip_unavailable` cluster setting
    - showing `_clusters` metadata in the async response **while** the search is still running
    - showing any shard failure messages (since any shard search failures in ES|QL are
    automatically fatal and `_clusters/details` is not shown in 4xx/5xx error responses). Note that
    this also means that the `failed` shard count is always 0 in the ES|QL `_clusters` section.
    
    Things changed with respect to behavior in `_search`:
    - the `timed_out` field in `_clusters/details/mycluster` was removed in the ES|QL
    response, since ES|QL does not support timeouts. It could be added back later
    if/when ES|QL supports timeouts.
    - the `failures` array in `_clusters/details/mycluster/_shards` was removed in the ES|QL
    response, since any shard failure causes the whole query to fail.
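
For client code that parses both `_search` and ES|QL responses, the two differences above mean `timed_out` and `failures` may be missing from a cluster entry. A minimal Python sketch of a tolerant reader follows. The helper name `read_cluster_entry` and the default values are assumptions for illustration, not part of this PR. The sketch looks for `failures` both under `_shards` (the path named above) and on the entry itself, since the exact location in `_search` responses is not confirmed here.

```python
def read_cluster_entry(entry):
    """Hypothetical helper: normalize one `_clusters/details/<alias>` entry.

    Fields that ES|QL omits fall back to neutral defaults, so the same
    code path can handle `_search` and ES|QL responses.
    """
    shards = entry.get("_shards", {})
    # Assumption: accept `failures` at either level; ES|QL has neither.
    failures = shards.get("failures") or entry.get("failures") or []
    return {
        "status": entry.get("status"),
        # ES|QL omits `timed_out`; treat a missing value as "did not time out".
        "timed_out": entry.get("timed_out", False),
        "failures": failures,
        # Per the description, `failed` is always 0 in ES|QL responses.
        "failed_shards": shards.get("failed", 0),
    }


# An ES|QL-style entry: no `timed_out`, no `failures`.
esql_entry = {
    "status": "skipped",
    "indices": "remote2:bl*",
    "took": 0,
    "_shards": {"total": 0, "successful": 0, "skipped": 0, "failed": 0},
}
normalized = read_cluster_entry(esql_entry)
```

The defaults are a design choice: a caller that needs to distinguish "field absent" from "false" would return `None` instead.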
    
    Example output from ES|QL CCS:
    
    ```es
    POST /_query
    {
      "query": "from blogs,remote2:bl*,remote1:blogs|\nkeep authors.first_name,publish_date|\n limit 5"
    }
    ```
    
    ```json
    {
      "took": 49,
      "columns": [
        {
          "name": "authors.first_name",
          "type": "text"
        },
        {
          "name": "publish_date",
          "type": "date"
        }
      ],
      "values": [
        [
          "Tammy",
          "2009-11-04T04:08:07.000Z"
        ],
        [
          "Theresa",
          "2019-05-10T21:22:32.000Z"
        ],
        [
          "Jason",
          "2021-11-23T00:57:30.000Z"
        ],
        [
          "Craig",
          "2019-12-14T21:24:29.000Z"
        ],
        [
          "Alexandra",
          "2013-02-15T18:13:24.000Z"
        ]
      ],
      "_clusters": {
        "total": 3,
        "successful": 2,
        "running": 0,
        "skipped": 1,
        "partial": 0,
        "failed": 0,
        "details": {
          "(local)": {
            "status": "successful",
            "indices": "blogs",
            "took": 43,
            "_shards": {
              "total": 13,
              "successful": 13,
              "skipped": 0,
              "failed": 0
            }
          },
          "remote2": {
            "status": "skipped",  // remote2 was offline when this query was run
            "indices": "remote2:bl*",
            "took": 0,
            "_shards": {
              "total": 0,
              "successful": 0,
              "skipped": 0,
              "failed": 0
            }
          },
          "remote1": {
            "status": "successful",
            "indices": "remote1:blogs",
            "took": 47,
            "_shards": {
              "total": 13,
              "successful": 13,
              "skipped": 0,
              "failed": 0
            }
          }
        }
      }
    }
    ```
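
As an illustration of how a client might consume the `_clusters` section, here is a minimal Python sketch that groups cluster aliases by status and totals the shard counts. It uses a trimmed copy of the example response above. The helper name `summarize_clusters` is hypothetical, and the sketch assumes `_clusters` is absent from local-only responses, as the feature list suggests.

```python
import json

# Trimmed copy of the example response above (columns and values omitted).
RESPONSE = json.loads("""
{
  "took": 49,
  "_clusters": {
    "total": 3, "successful": 2, "running": 0,
    "skipped": 1, "partial": 0, "failed": 0,
    "details": {
      "(local)": {"status": "successful", "indices": "blogs", "took": 43,
                  "_shards": {"total": 13, "successful": 13, "skipped": 0, "failed": 0}},
      "remote2": {"status": "skipped", "indices": "remote2:bl*", "took": 0,
                  "_shards": {"total": 0, "successful": 0, "skipped": 0, "failed": 0}},
      "remote1": {"status": "successful", "indices": "remote1:blogs", "took": 47,
                  "_shards": {"total": 13, "successful": 13, "skipped": 0, "failed": 0}}
    }
  }
}
""")


def summarize_clusters(response):
    """Hypothetical helper: group cluster aliases by status and sum shard counts."""
    clusters = response.get("_clusters")
    if clusters is None:
        # Assumption: local-only queries carry no `_clusters` section.
        return {"by_status": {}, "shards_total": 0, "shards_successful": 0}

    by_status = {}
    shards_total = 0
    shards_successful = 0
    for alias, info in clusters["details"].items():
        by_status.setdefault(info["status"], []).append(alias)
        shards = info.get("_shards", {})
        shards_total += shards.get("total", 0)
        shards_successful += shards.get("successful", 0)

    return {
        "by_status": by_status,
        "shards_total": shards_total,
        "shards_successful": shards_successful,
    }


summary = summarize_clusters(RESPONSE)
print(summary["by_status"])
```

A caller could use `summary["by_status"].get("skipped", [])` to warn users that results exclude an unreachable remote, such as `remote2` in this example.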
    
    Fixes elastic#112402 and elastic#110935
    quux00 committed Sep 30, 2024
    f25289c