[CI] DatabaseNodeServiceIT testGzippedDatabase failing #113752

Closed
elasticsearchmachine opened this issue Sep 30, 2024 · 4 comments · Fixed by #115463
Labels:

  • :Data Management/Ingest Node (Execution or management of Ingest Pipelines including GeoIP)
  • needs:risk (Requires assignment of a risk label (low, medium, blocker))
  • Team:Data Management (Meta label for data/management team)
  • >test-failure (Triaged test failures from CI)

Comments

@elasticsearchmachine
Collaborator

elasticsearchmachine commented Sep 30, 2024

Build Scans:

Reproduction Line:

./gradlew ":modules:ingest-geoip:internalClusterTest" --tests "org.elasticsearch.ingest.geoip.DatabaseNodeServiceIT.testGzippedDatabase" -Dtests.seed=D3143B5657D8D174 -Dtests.locale=sk -Dtests.timezone=Asia/Dushanbe -Druntime.java=22

Applicable branches:
8.x

Reproduces locally?:
N/A

Failure History:
See dashboard

Failure Message:

java.lang.AssertionError: null

Issue Reasons:

  • [8.x] 5 failures in test testGzippedDatabase (1.1% fail rate in 451 executions)
  • [8.x] 3 failures in pipeline elasticsearch-periodic-platform-support (30.0% fail rate in 10 executions)

Note:
This issue was created using new test triage automation. Please report issues or feedback to es-delivery.

@elasticsearchmachine elasticsearchmachine added :Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP >test-failure Triaged test failures from CI labels Sep 30, 2024
@elasticsearchmachine
Collaborator Author

Pinging @elastic/es-data-management (Team:Data Management)

@elasticsearchmachine elasticsearchmachine added Team:Data Management Meta label for data/management team needs:risk Requires assignment of a risk label (low, medium, blocker) labels Sep 30, 2024
@masseyke masseyke self-assigned this Sep 30, 2024
@masseyke
Member

This and #113821 are the same failure, and I've muted both. If you hit a seed where the cluster state is updated at the wrong time, DatabaseNodeService::checkDatabases gets called. Since the downloader is not actually running, the GeoIpTaskState in the cluster state is null. In that case we treat validMetadata as empty and delete all the databases, so by the time the test grabs the database to assert things about it, it's gone.
We could enable the downloader, but that causes a whole collection of other problems (search for all the test failures involving the geoip downloader). So at the moment I'm not sure what the best way to go is.
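
The failure mode described in this comment can be illustrated with a small, self-contained Java sketch. It is not the Elasticsearch code: the class `CheckDatabasesSketch`, the `TaskState` stand-in, and the `load`/`get` helpers are all hypothetical names chosen for illustration. It only assumes what the comment says, namely that a null task state is treated as "no valid metadata" and that every loaded database not covered by valid metadata is removed.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical model of the race described above; not the real DatabaseNodeService.
class CheckDatabasesSketch {

    // Hypothetical stand-in for the GeoIpTaskState kept in the cluster state.
    static final class TaskState {
        final Set<String> databaseNames;

        TaskState(Set<String> databaseNames) {
            this.databaseNames = databaseNames;
        }
    }

    // Databases currently loaded on the node, keyed by file name.
    private final Map<String, String> loadedDatabases = new HashMap<>();

    void load(String name, String path) {
        loadedDatabases.put(name, path);
    }

    // Returns the path of a loaded database, or null if it is not loaded.
    String get(String name) {
        return loadedDatabases.get(name);
    }

    // Assumed behavior: with no task state, the set of valid metadata is empty,
    // so every loaded database is considered stale and removed.
    void checkDatabases(TaskState taskState) {
        Set<String> validMetadata =
            taskState == null ? Collections.<String>emptySet() : taskState.databaseNames;
        loadedDatabases.keySet().removeIf(name -> !validMetadata.contains(name));
    }

    public static void main(String[] args) {
        CheckDatabasesSketch service = new CheckDatabasesSketch();
        service.load("test.mmdb", "/tmp/test.mmdb");

        // A cluster state update arrives while the downloader is disabled,
        // so the task state is null.
        service.checkDatabases(null);

        // The test now tries to fetch the database it just loaded.
        System.out.println("database still present: " + (service.get("test.mmdb") != null));
    }
}
```

In this model the database survives only if `checkDatabases` never runs between loading and asserting, which matches a failure that shows up on a small fraction of seeds.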

@elasticsearchmachine
Collaborator Author

This has been muted on branch 8.x

Mute Reasons:

  • [8.x] 7 failures in test testGzippedDatabase (0.9% fail rate in 772 executions)
  • [8.x] 2 failures in step part1 (2.5% fail rate in 79 executions)
  • [8.x] 4 failures in pipeline elasticsearch-periodic-platform-support (19.0% fail rate in 21 executions)
  • [8.x] 2 failures in pipeline elasticsearch-intake (2.5% fail rate in 79 executions)

Build Scans:

@elasticsearchmachine
Collaborator Author

This has been muted on branch 8.x

Mute Reasons:

  • [8.x] 5 failures in test testGzippedDatabase (1.1% fail rate in 451 executions)
  • [8.x] 3 failures in pipeline elasticsearch-periodic-platform-support (30.0% fail rate in 10 executions)

Build Scans:
