ES|QL Physical plan serialization can be reduced/removed #113809

Open
Tracked by #112938
craigtaverner opened this issue Sep 30, 2024 · 1 comment
Labels
:Analytics/ES|QL AKA ESQL Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) >tech debt

Comments

@craigtaverner
Contributor

In ES|QL, the query plan sent to the data nodes is a serialized physical plan containing a FragmentExec, which in turn contains a logical plan. This means that only the higher-level nodes in the physical plan need to be serialized. We currently maintain a lot of serialization code, and unit tests for that code, much of which can be considered dead code.

This issue was first noticed while developing the pushdown of sorting by distance to Lucene. At first we added serialization for the sorts in the EsQueryExec class, and dealt with the daily merge conflicts caused by the TransportVersions change. We then realized that this data was never serialized, so we removed serialization support for all sorts, in a way that did not require transport version changes (i.e. always serialize an empty list, and ignore the result when deserializing).
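
For illustration, the "always serialize an empty list, ignore on read" approach could look roughly like the sketch below. This is not the actual EsQueryExec code: the class `QueryNodeSketch`, its fields, and the use of `java.io` data streams in place of Elasticsearch's stream classes are all assumptions, as is the choice to model legacy sort entries as plain strings.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.List;

// Hypothetical stand-in for a plan node; NOT the real EsQueryExec.
class QueryNodeSketch {
    final String index;
    final List<String> sorts; // planner-local state, never sent over the wire

    QueryNodeSketch(String index, List<String> sorts) {
        this.index = index;
        this.sorts = sorts;
    }

    // Write side: keep the slot the wire format already has, but always emit an empty list.
    void writeTo(DataOutputStream out) throws IOException {
        out.writeUTF(index);
        out.writeInt(0); // sort count is always zero, so the format itself is unchanged
    }

    // Read side: consume whatever an older sender may have written, then discard it.
    static QueryNodeSketch readFrom(DataInputStream in) throws IOException {
        String index = in.readUTF();
        int count = in.readInt();
        for (int i = 0; i < count; i++) {
            in.readUTF(); // skip legacy sort entries (assumed to be strings here)
        }
        return new QueryNodeSketch(index, List.of());
    }

    // Helper that serializes a node to bytes and reads it back.
    static QueryNodeSketch roundTrip(QueryNodeSketch node) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            node.writeTo(out);
        }
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return readFrom(in);
        }
    }

    public static void main(String[] args) throws IOException {
        QueryNodeSketch copy = roundTrip(new QueryNodeSketch("airports", List.of("distance ASC")));
        System.out.println(copy.index + " " + copy.sorts);
    }
}
```

Because the byte layout keeps the same list slot, old and new nodes can still read each other's messages, which is why no transport version bump is needed in this kind of change.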

However, on further thought we realized that the entire EsQueryExec class itself might never be serialized. We could potentially remove a lot of dead serialization code if many classes are never serialized. We do need to verify the scope of this, and also take into account the pragma node_level_reduction, which controls how much of the plan is handled on the data node versus the coordinator node. It is possible that this pragma prevents us from removing any serialization, or that the pragma itself would need to be removed.

@craigtaverner craigtaverner added :Analytics/ES|QL AKA ESQL >tech debt Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) labels Sep 30, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

2 participants