Upgrade main to Lucene 10 #113617

Draft
wants to merge 492 commits into
base: main
53bd6eb
[Automated] Update Lucene snapshot to 10.0.0-snapshot-9172cc42472
elasticsearchmachine Aug 30, 2024
cb39399
[Automated] Update Lucene snapshot to 9.12.0-snapshot-f23711a3e36
elasticsearchmachine Aug 30, 2024
b49ac93
Make docvalue skipper false in FieldTypeTestCase
iverase Aug 30, 2024
480c045
Reword javadoc text for SearchServiceTests#testSlicingBehaviourForPar…
javanna Aug 30, 2024
dfa82d7
Address SearchServiceTests compile errors caused by removal of IndexS…
javanna Aug 30, 2024
54ed24b
TimeLimitingCollector has been removed.
ChrisHegarty Aug 30, 2024
749020c
Update Version in LiveVersionMapTests and SegmentTests
ChrisHegarty Aug 30, 2024
26456b1
Resolve compile errors in StemmerTokenFilterFactory
javanna Aug 30, 2024
92bf5ad
Merge remote-tracking branch 'origin/main' into lucene_snapshot_10
elasticsearchmachine Aug 30, 2024
dd516e7
Merge remote-tracking branch 'origin/main' into lucene_snapshot
elasticsearchmachine Aug 30, 2024
7017588
Add Lucene 10.0.0 IndevVersion
ChrisHegarty Aug 30, 2024
69c1658
Determinize the system index indexPatternAutomaton
ChrisHegarty Aug 30, 2024
6e2f8ef
Fix compile error in AutomataMatch around ByteRunAutomaton creation
javanna Aug 30, 2024
4222481
Add missing determinize call in RLikePattern
javanna Aug 30, 2024
d8cbc15
Add missing determinize call in ExpressionBuilder#visitQualifiedNameP…
javanna Aug 30, 2024
6cd8dce
Replace NOCOMMIT comments with //TODO Lucene 10 upgrade to be able to…
javanna Aug 30, 2024
94c21b6
spotless
javanna Aug 30, 2024
4e6c354
[Automated] Update Lucene snapshot to 10.0.0-snapshot-ce4f56e74ad
elasticsearchmachine Aug 31, 2024
3be4e65
[Automated] Update Lucene snapshot to 9.12.0-snapshot-f23711a3e36
elasticsearchmachine Aug 31, 2024
8f7c041
Merge remote-tracking branch 'origin/main' into lucene_snapshot
elasticsearchmachine Aug 31, 2024
23b2b3d
Merge remote-tracking branch 'origin/main' into lucene_snapshot_10
elasticsearchmachine Aug 31, 2024
205b95e
Fix NestedHelper
ChrisHegarty Aug 31, 2024
10a85aa
[Automated] Update Lucene snapshot to 10.0.0-snapshot-ce4f56e74ad
elasticsearchmachine Sep 1, 2024
c6bf30f
[Automated] Update Lucene snapshot to 9.12.0-snapshot-f23711a3e36
elasticsearchmachine Sep 1, 2024
feb1298
Unused import
ChrisHegarty Sep 1, 2024
8f10f80
Merge remote-tracking branch 'origin/main' into lucene_snapshot_10
elasticsearchmachine Sep 1, 2024
b29051a
Merge remote-tracking branch 'origin/main' into lucene_snapshot
elasticsearchmachine Sep 1, 2024
a27a97e
Fix term vector access in IT
ChrisHegarty Sep 1, 2024
2fe4b60
Fix SearchQueryIT javadoc failure
ChrisHegarty Sep 1, 2024
faf42c5
spotless
ChrisHegarty Sep 1, 2024
1820f0f
[Automated] Update Lucene snapshot to 10.0.0-snapshot-ce4f56e74ad
elasticsearchmachine Sep 2, 2024
7a4f30b
[Automated] Update Lucene snapshot to 9.12.0-snapshot-f23711a3e36
elasticsearchmachine Sep 2, 2024
ef9af35
Merge remote-tracking branch 'origin/main' into lucene_snapshot_10
elasticsearchmachine Sep 2, 2024
1f8c238
Merge remote-tracking branch 'origin/main' into lucene_snapshot
elasticsearchmachine Sep 2, 2024
9c12380
Add Awaits fix for ESQL LogicalPlanOptimizerTest
ChrisHegarty Sep 2, 2024
35f980a
More ESQL Regex awaits fixes
ChrisHegarty Sep 2, 2024
c5cc740
Fix CompositeValuesCollectorQueueTests mock and spotless
ChrisHegarty Sep 2, 2024
6eb5b29
Add DocValueSkipper to ES87TSDBDocValuesFormat
iverase Sep 2, 2024
40e35ad
Change IndexReader with LeafReader for mocking
iverase Sep 2, 2024
cdac9a5
Fix KeyedFlattenedLeafFieldData iteration
iverase Sep 2, 2024
36e153c
Fix KeyedFlattenedLeafFieldData iteration second try
iverase Sep 2, 2024
34ec3a9
Fix KeyedFlattenedLeafFieldDataTests
iverase Sep 2, 2024
db740b6
Fix QueryStringQueryBuilderTests
cbuescher Sep 2, 2024
eff329f
Fix SystemIndexDescriptor
cbuescher Sep 2, 2024
adc0f33
Fix Regex class automaton determinization
cbuescher Sep 2, 2024
d070e8d
Fix docValueCount in KeyedFlattenedLeafFieldData
iverase Sep 2, 2024
3352358
Fix TimeSeriesRateAggregatorTests
iverase Sep 2, 2024
e638acf
Fix undeterminized automaton in UnmappedFieldFetcher
cbuescher Sep 2, 2024
5a83198
Fix automaton determinization in AutomatonQueries
cbuescher Sep 2, 2024
eb276e6
Fix IOContext in Store
cbuescher Sep 2, 2024
73bbe0a
[Automated] Update Lucene snapshot to 9.12.0-snapshot-09dd985bef9
elasticsearchmachine Sep 3, 2024
4bfdcf9
[Automated] Update Lucene snapshot to 10.0.0-snapshot-68cc8734ca2
elasticsearchmachine Sep 3, 2024
41c0840
Merge remote-tracking branch 'origin/main' into lucene_snapshot_10
elasticsearchmachine Sep 3, 2024
c0f8c92
Merge remote-tracking branch 'origin/main' into lucene_snapshot
elasticsearchmachine Sep 3, 2024
c12da0a
Mute MetadataCreateIndexServiceTests.testValidateDotIndex because of …
cbuescher Sep 3, 2024
7e409ba
Add Lucene 10 todo to Store
ChrisHegarty Sep 3, 2024
432df13
spotless
ChrisHegarty Sep 3, 2024
52429c4
Add romaniannormalization to list of known token filters
cbuescher Sep 3, 2024
14b8525
Fix CategoryContextMappingTests
cbuescher Sep 3, 2024
bf1f92d
[Automated] Update Lucene snapshot to 9.12.0-snapshot-09dd985bef9
elasticsearchmachine Sep 4, 2024
35f1b24
[Automated] Update Lucene snapshot to 10.0.0-snapshot-c21bc5405be
elasticsearchmachine Sep 4, 2024
22609ed
fix MultiValueMode
iverase Sep 4, 2024
7779d35
Fix MatchBoolPrefixQueryBuilderTests
cbuescher Sep 3, 2024
d299743
Don't mark romaniannormalization token filter as exposed
cbuescher Sep 4, 2024
992e3bc
fix MultiOrdinals #docValueCount
iverase Sep 4, 2024
8db0539
Merge remote-tracking branch 'origin/main' into lucene_snapshot_10
elasticsearchmachine Sep 4, 2024
3570257
Merge remote-tracking branch 'origin/main' into lucene_snapshot
elasticsearchmachine Sep 4, 2024
4e9bd3f
Fix AbstractStringFieldDataTestCase#testGlobalOrdinals
cbuescher Sep 4, 2024
d440438
Fix compile issue due to missing Scorable#docID method
cbuescher Sep 4, 2024
5ef57b2
spotless
ChrisHegarty Sep 4, 2024
8e39659
Fix MultiBucketColletorTests
ChrisHegarty Sep 4, 2024
5841c97
Fix or mute remaining failing :server test
cbuescher Sep 4, 2024
481f78e
Fix APMTracer by determinizing automata
cbuescher Sep 4, 2024
693df5b
[Automated] Update Lucene snapshot to 10.0.0-snapshot-b91b4136aff
elasticsearchmachine Sep 5, 2024
97fe576
[Automated] Update Lucene snapshot to 9.12.0-snapshot-56468ea3bb8
elasticsearchmachine Sep 5, 2024
e02cb0f
Add awaits fix bugURL
ChrisHegarty Sep 5, 2024
4bc8e00
Removing IndexVersions.V_7_0_0 and IndexVersions.V_7_1_0
cbuescher Sep 5, 2024
5146b78
Delete IndexVersions.V_7_2_0 - V_7_5_2
cbuescher Sep 5, 2024
9e8ac1e
[Automated] Update Lucene snapshot to 10.0.0-snapshot-5f242b3b268
elasticsearchmachine Sep 6, 2024
e382a03
[Automated] Update Lucene snapshot to 9.12.0-snapshot-40c4e582cf9
elasticsearchmachine Sep 6, 2024
a70eec5
Update usage of Automaton Operations.subsetOf in x-pack core
ChrisHegarty Sep 6, 2024
f5e7147
Mute test that uses removed complement syntax
ChrisHegarty Sep 6, 2024
c96b78e
A couple more Automaton subsetOf
ChrisHegarty Sep 6, 2024
1ea377d
Update usage of Automaton sameLanguage in test
ChrisHegarty Sep 6, 2024
15dd0a8
[Automated] Update Lucene snapshot to 10.0.0-snapshot-dc47adbbe73
elasticsearchmachine Sep 7, 2024
79010f4
[Automated] Update Lucene snapshot to 9.12.0-snapshot-ef5d0f2729a
elasticsearchmachine Sep 7, 2024
697fc67
[Automated] Update Lucene snapshot to 10.0.0-snapshot-dc47adbbe73
elasticsearchmachine Sep 8, 2024
f4197a8
[Automated] Update Lucene snapshot to 9.12.0-snapshot-371fa57d9c7
elasticsearchmachine Sep 8, 2024
5ff75bb
[Automated] Update Lucene snapshot to 10.0.0-snapshot-dc47adbbe73
elasticsearchmachine Sep 9, 2024
e0e2e1e
[Automated] Update Lucene snapshot to 9.12.0-snapshot-371fa57d9c7
elasticsearchmachine Sep 9, 2024
3267fd0
Remove IndexVersions.V_7_6_0
cbuescher Sep 9, 2024
74040a5
Remove IndexVersions.V_7_7_0 - V_7_11_0
cbuescher Sep 9, 2024
70ee548
Remove remainign V_7x IndexVersions
cbuescher Sep 9, 2024
db7463f
Uncomment some tests for better mergability with main
cbuescher Sep 9, 2024
95bc9d3
Merge branch 'main' into lucene_snapshot_10
cbuescher Sep 9, 2024
d0a1110
Fix errors in imports
cbuescher Sep 9, 2024
afec00b
[Automated] Update Lucene snapshot to 10.0.0-snapshot-64f5697f537
elasticsearchmachine Sep 10, 2024
304a1c6
[Automated] Update Lucene snapshot to 9.12.0-snapshot-ce23e15eb54
elasticsearchmachine Sep 10, 2024
d104bc3
Merge branch 'main' into lucene_snapshot_10
cbuescher Sep 10, 2024
8fe6c3a
Remove usages of random versions before V_8_0_0 in tests
cbuescher Sep 10, 2024
8a01884
More test fixed related to legacy versions
cbuescher Sep 10, 2024
09b68e2
Mute CompositeRolesStoreTests.testXPackUserCanAccessNonRestrictedIndices
cbuescher Sep 10, 2024
ca97d86
Fix failing knn yaml test
cbuescher Sep 10, 2024
84d4e86
Fix docs test related to knn queries
cbuescher Sep 10, 2024
77f24ae
Null out missing codec in BWCLucene70Codec
cbuescher Sep 10, 2024
3cea216
Partially revert previous commit
cbuescher Sep 10, 2024
8bf41b8
[Automated] Update Lucene snapshot to 9.12.0-snapshot-7964682ddf5
elasticsearchmachine Sep 11, 2024
fff3fbb
[Automated] Update Lucene snapshot to 10.0.0-snapshot-7c529ce092d
elasticsearchmachine Sep 11, 2024
0ee5e4f
Fix compilation issues after last Lucene 10 snapshot merge
cbuescher Sep 11, 2024
2d31a70
Merge branch 'main' into lucene_snapshot_10
cbuescher Sep 11, 2024
ced0f7b
Fix UOE in ContextIndexSearcher after last Lucene merge
cbuescher Sep 11, 2024
2430fff
[Automated] Update Lucene snapshot to 10.0.0-snapshot-74e3c44063a
elasticsearchmachine Sep 12, 2024
d45dbc7
[Automated] Update Lucene snapshot to 9.12.0-snapshot-ab262f917d4
elasticsearchmachine Sep 12, 2024
d24200e
Fix compile issues after latest Lucene snapshot update
cbuescher Sep 12, 2024
42c4ed1
Merge branch 'main' into lucene_snapshot_10
cbuescher Sep 12, 2024
8800107
spotless
cbuescher Sep 12, 2024
84bb2b4
Fixing more checkstyle and spotless issues after merging main
cbuescher Sep 12, 2024
e916319
Remove awaitsFix, as the issue was fixed
mayya-sharipova Sep 12, 2024
c54686b
Override the correct search method
javanna Sep 12, 2024
28846be
Follow up changes to inter-segment concurrency changes
cbuescher Sep 12, 2024
07fdd8b
[Automated] Update Lucene snapshot to 10.0.0-snapshot-5045d3c67b1
elasticsearchmachine Sep 13, 2024
7dca607
[Automated] Update Lucene snapshot to 9.12.0-snapshot-6cc4f13ab22
elasticsearchmachine Sep 13, 2024
8cb58a7
Use RegExp.DEPRECATED_COMPLEMENT where needed
cbuescher Sep 13, 2024
32f5907
Unmute two tests that now pass
cbuescher Sep 13, 2024
c79f782
Unmute a couple more tests that now pass
ChrisHegarty Sep 13, 2024
014d338
[Automated] Update Lucene snapshot to 10.0.0-snapshot-7c056ab88c7
elasticsearchmachine Sep 14, 2024
35764df
[Automated] Update Lucene snapshot to 9.12.0-snapshot-1b38d5dec85
elasticsearchmachine Sep 14, 2024
e76b6f6
[Automated] Update Lucene snapshot to 10.0.0-snapshot-568d1f3fbe7
elasticsearchmachine Sep 15, 2024
00ae8a4
[Automated] Update Lucene snapshot to 9.12.0-snapshot-9cd6a24be43
elasticsearchmachine Sep 15, 2024
40aa2f9
Fix getDiscountOverlaps in LegacyBM25Similarity
ChrisHegarty Sep 15, 2024
b05b5f3
Fix AggregatorTestCase with LeafReaderContextPartition
ChrisHegarty Sep 15, 2024
7d27f53
More LeafReaderContextPartition refactoring fixes
ChrisHegarty Sep 15, 2024
12b98e2
Merge branch 'main' into lucene_snapshot_10
elasticsearchmachine Sep 15, 2024
70a5dba
[Automated] Update Lucene snapshot to 10.0.0-snapshot-3801d859783
elasticsearchmachine Sep 16, 2024
4b2f5f1
lucene_snapshot: Fix constructor chaining in LegacyBM25Similarity
elasticsearchmachine Sep 16, 2024
762ec3e
lucene_snapshot_10: fix license headers
elasticsearchmachine Sep 16, 2024
b3f371d
Move AggregatorTestCase to search(Query, CollectorManager)
javanna Sep 16, 2024
b215944
Merge remote-tracking branch 'origin/main' into lucene_snapshot_10
elasticsearchmachine Sep 16, 2024
6a58074
Don't randomize LuceneTestCase concurrency when using "newSearcher"
cbuescher Sep 16, 2024
6fc5930
Fix docs according to changes in Lovins token filter, Pathhierarchy A…
cbuescher Sep 16, 2024
2d1a614
Fix geo_shape related docs tests
cbuescher Sep 16, 2024
52bf71d
[Automated] Update Lucene snapshot to 10.0.0-snapshot-f4ebed2404e
elasticsearchmachine Sep 17, 2024
b4546db
Merge remote-tracking branch 'origin/main' into lucene_snapshot_10
elasticsearchmachine Sep 17, 2024
592642b
Fix expected output of romanian analyzer
cbuescher Sep 17, 2024
4f7df52
Fix QueryTranslatorSpecTests due to changes in regex syntax flags
cbuescher Sep 17, 2024
4ebfae1
Fix 370_profile yaml test for yamlRestCompatTest
cbuescher Sep 17, 2024
a6f6d21
Fix romanian analyzer restBwc test
cbuescher Sep 17, 2024
309abb0
Remove v7.17.13 bwc tasks in CI
cbuescher Sep 17, 2024
98e5600
Determinize automaton produced by IncludeExclude
cbuescher Sep 17, 2024
455980d
Fix persian language analyzer doc by adding stemmer
cbuescher Sep 17, 2024
01127d1
[Automated] Update Lucene snapshot to 10.0.0-snapshot-b59a357e586
elasticsearchmachine Sep 18, 2024
6958086
Fix compile errors after L10 snapshot merge
cbuescher Sep 18, 2024
9af0d8e
[Automated] Update Lucene snapshot to 9.12.0-snapshot-a774a998be1
elasticsearchmachine Sep 16, 2024
6294ad2
lucene_snapshot: Fix constructor chaining in LegacyBM25Similarity
elasticsearchmachine Sep 16, 2024
f5ce091
[Automated] Update Lucene snapshot to 9.12.0-snapshot-cd7a74cb4d4
elasticsearchmachine Sep 17, 2024
75fcbe0
[Automated] Update Lucene snapshot to 9.12.0-snapshot-71ca6b4bb16
elasticsearchmachine Sep 18, 2024
6e40125
lucene_snapshot: fix another instance of IOContext.READONCE
elasticsearchmachine Sep 18, 2024
8aa9cce
Merge branch 'main' into lucene_snapshot_new
elasticsearchmachine Sep 18, 2024
ff74c90
lucene_snapshot: fix license headers
elasticsearchmachine Sep 15, 2024
26b6513
[Automated] Update Lucene snapshot to 10.0.0-snapshot-6d987e1ce1c
elasticsearchmachine Sep 19, 2024
fb44c63
[Automated] Update Lucene snapshot to 9.12.0-snapshot-b467a2bb66d
elasticsearchmachine Sep 19, 2024
d1fbaab
Merge branch 'main' into lucene_snapshot
ChrisHegarty Sep 19, 2024
1a8c3b1
Merge branch 'main' into lucene_snapshot_10
cbuescher Sep 19, 2024
9eec2c4
Add a capability and transport version for new regex and range interv…
ChrisHegarty Sep 19, 2024
7150729
Multi term intervals: increase max_expansions (#112826)
mayya-sharipova Sep 19, 2024
0085911
[Automated] Update Lucene snapshot to 10.0.0-snapshot-e4ac57746eb
elasticsearchmachine Sep 20, 2024
1e3d353
[Automated] Update Lucene snapshot to 9.12.0-snapshot-a7ce3466d7c
elasticsearchmachine Sep 20, 2024
3bfd004
Adapt QueryAnalyzer to use TermInSetQuery#getBytesRefIterator
javanna Sep 20, 2024
bbac749
restore ngram tokenizer removed due to a bad merge
javanna Sep 20, 2024
952aa9c
Merge branch 'main' into lucene_snapshot
ChrisHegarty Sep 20, 2024
e5f4ef9
Address norwegian stemmer creation issues
javanna Sep 20, 2024
8e51178
Merge branch 'main' into lucene_snapshot_10
javanna Sep 20, 2024
c03d83d
extend ESTestCase#newSearcher methods and add javadocs
javanna Sep 20, 2024
27139fc
Merge branch 'main' into lucene_snapshot
ChrisHegarty Sep 20, 2024
2318da1
Rephrase comment in OldCodecsAvailableTests
javanna Sep 20, 2024
46249a0
clarify comment in BWCLucene70Codec
javanna Sep 20, 2024
352b10a
Merge remote-tracking branch 'origin/main' into lucene_snapshot_10
elasticsearchmachine Sep 20, 2024
1d2737f
Merge remote-tracking branch 'origin/main' into lucene_snapshot
elasticsearchmachine Sep 20, 2024
266979f
Fix READONCE IOContext usage
ChrisHegarty Sep 20, 2024
c746906
fix TransportSimulateBulkActionIT compilation
javanna Sep 20, 2024
4512167
another attempt to fix READONCE IOContext usage
ChrisHegarty Sep 20, 2024
c0b6794
spotless
ChrisHegarty Sep 20, 2024
b7574b5
Update docs/changelog/113018.yaml
ChrisHegarty Sep 20, 2024
aaf1bbc
Use the RC build
ChrisHegarty Sep 20, 2024
89c4af7
Address test failures in old-lucene-versions
javanna Sep 20, 2024
989e48a
Address test failure in BlockPostingsFormat3Tests
javanna Sep 20, 2024
08a78e9
[Automated] Update Lucene snapshot to 10.0.0-snapshot-53d1c2bd2fb
elasticsearchmachine Sep 21, 2024
6a32106
[Automated] Update Lucene snapshot to 9.12.0-snapshot-11c4f071a7a
elasticsearchmachine Sep 21, 2024
b594455
Update docs/changelog/113333.yaml
ChrisHegarty Sep 21, 2024
543d0c3
Merge branch 'main' into lucene_snapshot_9_12
elasticmachine Sep 21, 2024
b0ec6c0
Merge remote-tracking branch 'origin/main' into lucene_snapshot_10
elasticsearchmachine Sep 21, 2024
2a87ecc
Merge remote-tracking branch 'origin/main' into lucene_snapshot
elasticsearchmachine Sep 21, 2024
cf56c9b
remove erroneous changelog
ChrisHegarty Sep 21, 2024
ceaf86e
Address WildcardFieldMapperTests failure
javanna Sep 21, 2024
303f22d
Multi term intervals: increase max_expansions (#112826)
mayya-sharipova Sep 19, 2024
835c114
Merge branch 'main' into lucene_snapshot_9_12
elasticmachine Sep 21, 2024
14451f2
[Automated] Update Lucene snapshot to 9.12.0-snapshot-11c4f071a7a
elasticsearchmachine Sep 22, 2024
e1dcf11
[Automated] Update Lucene snapshot to 10.0.0-snapshot-53d1c2bd2fb
elasticsearchmachine Sep 22, 2024
5d79230
Merge branch 'main' into lucene_snapshot_9_12
ChrisHegarty Sep 22, 2024
435f0c2
Merge remote-tracking branch 'origin/main' into lucene_snapshot_10
elasticsearchmachine Sep 22, 2024
f3b96e8
Merge remote-tracking branch 'origin/main' into lucene_snapshot
elasticsearchmachine Sep 22, 2024
e3c24b2
fix WildcardFieldMapperTests to include
ChrisHegarty Sep 22, 2024
c74d361
Fix docs build
ChrisHegarty Sep 22, 2024
3a0ff7d
Merge branch 'main' into lucene_snapshot_9_12
elasticmachine Sep 22, 2024
0794124
Merge branch 'main' into lucene_snapshot_9_12
elasticmachine Sep 22, 2024
8ef1fcd
Merge branch 'main' into lucene_snapshot_9_12
elasticmachine Sep 22, 2024
2f56034
Merge branch 'main' into lucene_snapshot_9_12
elasticmachine Sep 22, 2024
254d82f
Merge branch 'lucene_snapshot_9_12' into lucene_snapshot_10
ChrisHegarty Sep 22, 2024
30d23b2
revert
ChrisHegarty Sep 22, 2024
c9cb409
[Automated] Update Lucene snapshot to 10.0.0-snapshot-53d1c2bd2fb
elasticsearchmachine Sep 23, 2024
2d25cbc
[Automated] Update Lucene snapshot to 9.12.0-snapshot-11c4f071a7a
elasticsearchmachine Sep 23, 2024
8b32340
Merge remote-tracking branch 'origin/main' into lucene_snapshot
elasticsearchmachine Sep 23, 2024
22d88d2
Merge branch 'main' into lucene_snapshot_10
javanna Sep 23, 2024
381132c
Restore index versions 7 in lucene_snapshot_10 (#113317)
javanna Sep 23, 2024
12dfe38
Add Lucene70DocValuesFormat to old-lucene-versions plugin (#113377)
ChrisHegarty Sep 23, 2024
8ecb407
[Automated] Update Lucene snapshot to 10.0.0-snapshot-53d1c2bd2fb
elasticsearchmachine Sep 24, 2024
147eb47
[Automated] Update Lucene snapshot to 9.12.0-snapshot-11c4f071a7a
elasticsearchmachine Sep 24, 2024
fcd9a0e
Merge branch 'main' into lucene_snapshot_10
ChrisHegarty Sep 24, 2024
cad6c6c
Merge remote-tracking branch 'origin/main' into lucene_snapshot_10
elasticsearchmachine Sep 24, 2024
1e62951
Remove leftover TODO
javanna Sep 24, 2024
c7c24b0
Remove TODO in RegexpFlag
ChrisHegarty Sep 24, 2024
eb06cec
Add UpdateForV10 annotation
javanna Sep 24, 2024
871c430
Merge remote-tracking branch 'upstream/lucene_snapshot_10' into lucen…
brianseeders Sep 24, 2024
f7be20d
Merge branch 'main' into lucene_snapshot
javanna Sep 24, 2024
7a8cab6
Revert needless jdk change in legacy file
javanna Sep 24, 2024
5490a47
[Automated] Update Lucene snapshot to 10.0.0-snapshot-53d1c2bd2fb
elasticsearchmachine Sep 25, 2024
3ea0406
Merge branch 'main' into lucene_snapshot
javanna Sep 25, 2024
b295803
Fix compile issues after last merge with main
cbuescher Sep 25, 2024
518fb08
[Automated] Update Lucene snapshot to 10.0.0-snapshot-ff57fa7b423
elasticsearchmachine Sep 26, 2024
05b4b6e
Merge branch 'main' into lucene_snapshot
cbuescher Sep 26, 2024
394a063
Merge branch 'main' into lucene_snapshot
javanna Sep 26, 2024
52de43f
[Automated] Update Lucene snapshot to 10.0.0-snapshot-7b4b0238d70
elasticsearchmachine Sep 27, 2024
b723551
Merge branch 'main' into lucene_snapshot
javanna Sep 27, 2024
bbc9cf3
Merge remote-tracking branch 'origin/main' into lucene_snapshot
elasticsearchmachine Sep 27, 2024
731219a
update profile tests
javanna Sep 27, 2024
c064636
add lucene 10 upgrade node feature and fix profile yaml test
javanna Sep 27, 2024
c75e0c5
Merge branch 'main' into lucene_snapshot
javanna Sep 27, 2024
2b15163
restore replaceValueInMatch for profile description tests from 8.x
javanna Sep 27, 2024
d912d7b
[Automated] Update Lucene snapshot to 10.0.0-snapshot-7b4b0238d70
elasticsearchmachine Sep 28, 2024
f93db49
[Automated] Update Lucene snapshot to 10.0.0-snapshot-0a8604d908c
elasticsearchmachine Sep 29, 2024
4a79e51
Revert "[Automated] Update Lucene snapshot to 10.0.0-snapshot-0a8604d…
javanna Sep 29, 2024
db54b81
Merge branch 'main' into lucene_snapshot
javanna Sep 29, 2024
1dc8b4c
[Automated] Update Lucene snapshot to 10.0.0-snapshot-22ac47c07ad
elasticsearchmachine Sep 30, 2024
b6532e6
Merge remote-tracking branch 'origin/main' into lucene_snapshot
elasticsearchmachine Sep 30, 2024
63e524d
Address compile errors after vector api changes upstream (#113766)
javanna Sep 30, 2024
2471dc9
Update lucene snapshot buildkite config to build from branch_10_0 (#1…
javanna Sep 30, 2024
af93513
Merge branch 'main' into lucene_snapshot
javanna Sep 30, 2024
a969b1d
Merge branch 'main' into lucene_snapshot
javanna Sep 30, 2024
79d3d6a
Make dutch_kp and lovins no op token filters
javanna Sep 30, 2024
a20bf84
Fix needless use of concurrent collector managers (#113739)
original-brownbear Oct 1, 2024
6 changes: 4 additions & 2 deletions .buildkite/pipelines/lucene-snapshot/build-snapshot.yml
@@ -1,8 +1,10 @@
 steps:
   - trigger: apache-lucene-build-snapshot
-    label: Trigger pipeline to build lucene snapshot
+    label: Trigger pipeline to build lucene 10 snapshot
     key: lucene-build
-    if: build.env("LUCENE_BUILD_ID") == null || build.env("LUCENE_BUILD_ID") == ""
+    if: (build.env("LUCENE_BUILD_ID") == null || build.env("LUCENE_BUILD_ID") == "")
+    build:
+      branch: branch_10_0
   - wait
   - label: Upload and update lucene snapshot
     command: .buildkite/scripts/lucene-snapshot/upload-snapshot.sh
1 change: 0 additions & 1 deletion .buildkite/pipelines/lucene-snapshot/run-tests.yml
@@ -62,7 +62,6 @@ steps:
     matrix:
       setup:
         BWC_VERSION:
-          - 7.17.13
           - 8.9.1
           - 8.10.0
     agents:
@@ -19,7 +19,7 @@
 import org.apache.lucene.store.MMapDirectory;
 import org.apache.lucene.util.hnsw.RandomVectorScorer;
 import org.apache.lucene.util.hnsw.RandomVectorScorerSupplier;
-import org.apache.lucene.util.quantization.RandomAccessQuantizedByteVectorValues;
+import org.apache.lucene.util.quantization.QuantizedByteVectorValues;
 import org.apache.lucene.util.quantization.ScalarQuantizer;
 import org.elasticsearch.common.logging.LogConfigurator;
 import org.elasticsearch.core.IOUtils;
@@ -217,19 +217,17 @@ public float squareDistanceScalar() {
         return 1 / (1f + adjustedDistance);
     }

-    RandomAccessQuantizedByteVectorValues vectorValues(int dims, int size, IndexInput in, VectorSimilarityFunction sim) throws IOException {
+    QuantizedByteVectorValues vectorValues(int dims, int size, IndexInput in, VectorSimilarityFunction sim) throws IOException {
         var sq = new ScalarQuantizer(0.1f, 0.9f, (byte) 7);
         var slice = in.slice("values", 0, in.length());
         return new OffHeapQuantizedByteVectorValues.DenseOffHeapVectorValues(dims, size, sq, false, sim, null, slice);
     }

-    RandomVectorScorerSupplier luceneScoreSupplier(RandomAccessQuantizedByteVectorValues values, VectorSimilarityFunction sim)
-        throws IOException {
+    RandomVectorScorerSupplier luceneScoreSupplier(QuantizedByteVectorValues values, VectorSimilarityFunction sim) throws IOException {
         return new Lucene99ScalarQuantizedVectorScorer(null).getRandomVectorScorerSupplier(sim, values);
     }

-    RandomVectorScorer luceneScorer(RandomAccessQuantizedByteVectorValues values, VectorSimilarityFunction sim, float[] queryVec)
-        throws IOException {
+    RandomVectorScorer luceneScorer(QuantizedByteVectorValues values, VectorSimilarityFunction sim, float[] queryVec) throws IOException {
         return new Lucene99ScalarQuantizedVectorScorer(null).getRandomVectorScorer(sim, values, queryVec);
     }

@@ -59,10 +59,6 @@ org.apache.lucene.util.Version#parseLeniently(java.lang.String)

 org.apache.lucene.index.NoMergePolicy#INSTANCE @ explicit use of NoMergePolicy risks forgetting to configure NoMergeScheduler; use org.elasticsearch.common.lucene.Lucene#indexWriterConfigWithNoMerging() instead.

-@defaultMessage Spawns a new thread which is solely under lucenes control use ThreadPool#relativeTimeInMillis instead
-org.apache.lucene.search.TimeLimitingCollector#getGlobalTimerThread()
-org.apache.lucene.search.TimeLimitingCollector#getGlobalCounter()
-
 @defaultMessage Don't interrupt threads use FutureUtils#cancel(Future<T>) instead
 java.util.concurrent.Future#cancel(boolean)

2 changes: 1 addition & 1 deletion build-tools-internal/version.properties
@@ -1,5 +1,5 @@
 elasticsearch = 9.0.0
-lucene = 9.11.1
+lucene = 10.0.0-snapshot-22ac47c07ad

 bundled_jdk_vendor = openjdk
 bundled_jdk = 22.0.1+8@c7ec1332f7bb44aeba2eb341ae18aca4
4 changes: 2 additions & 2 deletions docs/Versions.asciidoc
@@ -1,8 +1,8 @@

 include::{docs-root}/shared/versions/stack/{source_branch}.asciidoc[]

-:lucene_version: 9.11.1
-:lucene_version_path: 9_11_1
+:lucene_version: 10.0.0
+:lucene_version_path: 10_0_0
 :jdk: 11.0.2
 :jdk_major: 11
 :build_type: tar
5 changes: 5 additions & 0 deletions docs/changelog/111465.yaml
@@ -0,0 +1,5 @@
+pr: 111465
+summary: Add range and regexp Intervals
+area: Search
+type: enhancement
+issues: []
6 changes: 6 additions & 0 deletions docs/changelog/112826.yaml
@@ -0,0 +1,6 @@
+pr: 112826
+summary: "Multi term intervals: increase max_expansions"
+area: Search
+type: enhancement
+issues:
+ - 110491
5 changes: 5 additions & 0 deletions docs/changelog/113333.yaml
@@ -0,0 +1,5 @@
+pr: 113333
+summary: Upgrade to Lucene 9.12
+area: Search
+type: upgrade
+issues: []
12 changes: 6 additions & 6 deletions docs/plugins/analysis-nori.asciidoc
@@ -244,11 +244,11 @@ Which responds with:
   "end_offset": 3,
   "type": "word",
   "position": 1,
-  "leftPOS": "J(Ending Particle)",
+  "leftPOS": "JKS(Subject case marker)",
   "morphemes": null,
   "posType": "MORPHEME",
   "reading": null,
-  "rightPOS": "J(Ending Particle)"
+  "rightPOS": "JKS(Subject case marker)"
 },
 {
   "token": "깊",
@@ -268,11 +268,11 @@ Which responds with:
   "end_offset": 6,
   "type": "word",
   "position": 3,
-  "leftPOS": "E(Verbal endings)",
+  "leftPOS": "ETM(Adnominal form transformative ending)",
   "morphemes": null,
   "posType": "MORPHEME",
   "reading": null,
-  "rightPOS": "E(Verbal endings)"
+  "rightPOS": "ETM(Adnominal form transformative ending)"
 },
 {
   "token": "나무",
@@ -292,11 +292,11 @@ Which responds with:
   "end_offset": 10,
   "type": "word",
   "position": 5,
-  "leftPOS": "J(Ending Particle)",
+  "leftPOS": "JX(Auxiliary postpositional particle)",
   "morphemes": null,
   "posType": "MORPHEME",
   "reading": null,
-  "rightPOS": "J(Ending Particle)"
+  "rightPOS": "JX(Auxiliary postpositional particle)"
 }
 ]
 },
3 changes: 2 additions & 1 deletion docs/reference/analysis/analyzers/lang-analyzer.asciidoc
@@ -1430,7 +1430,8 @@ PUT /persian_example
 "decimal_digit",
 "arabic_normalization",
 "persian_normalization",
-"persian_stop"
+"persian_stop",
+"persian_stem"
 ]
 }
 }
24 changes: 12 additions & 12 deletions docs/reference/analysis/tokenizers/pathhierarchy-tokenizer.asciidoc
@@ -40,14 +40,14 @@ POST _analyze
   "start_offset": 0,
   "end_offset": 8,
   "type": "word",
-  "position": 0
+  "position": 1
 },
 {
   "token": "/one/two/three",
   "start_offset": 0,
   "end_offset": 14,
   "type": "word",
-  "position": 0
+  "position": 2
 }
 ]
 }
@@ -144,14 +144,14 @@ POST my-index-000001/_analyze
   "start_offset": 7,
   "end_offset": 18,
   "type": "word",
-  "position": 0
+  "position": 1
 },
 {
   "token": "/three/four/five",
   "start_offset": 7,
   "end_offset": 23,
   "type": "word",
-  "position": 0
+  "position": 2
 }
 ]
 }
Expand All @@ -178,14 +178,14 @@ If we were to set `reverse` to `true`, it would produce the following:
[[analysis-pathhierarchy-tokenizer-detailed-examples]]
=== Detailed examples

A common use-case for the `path_hierarchy` tokenizer is filtering results by
file paths. If indexing a file path along with the data, the use of the
`path_hierarchy` tokenizer to analyze the path allows filtering the results
A common use-case for the `path_hierarchy` tokenizer is filtering results by
file paths. If indexing a file path along with the data, the use of the
`path_hierarchy` tokenizer to analyze the path allows filtering the results
by different parts of the file path string.


This example configures an index to have two custom analyzers and applies
those analyzers to multifields of the `file_path` text field that will
those analyzers to multifields of the `file_path` text field that will
store filenames. One of the two analyzers uses reverse tokenization.
Some sample documents are then indexed to represent some file paths
for photos inside photo folders of two different users.
Expand Down Expand Up @@ -264,8 +264,8 @@ POST file-path-test/_doc/5
--------------------------------------------------


A search for a particular file path string against the text field matches all
the example documents, with Bob's documents ranking highest due to `bob` also
A search for a particular file path string against the text field matches all
the example documents, with Bob's documents ranking highest due to `bob` also
being one of the terms created by the standard analyzer boosting relevance for
Bob's documents.

Expand Down Expand Up @@ -301,7 +301,7 @@ GET file-path-test/_search
With the reverse parameter for this tokenizer, it's also possible to match
from the other end of the file path, such as individual file names or a deep
level subdirectory. The following example shows a search for all files named
`my_photo1.jpg` within any directory via the `file_path.tree_reversed` field
`my_photo1.jpg` within any directory via the `file_path.tree_reversed` field
configured to use the reverse parameter in the mapping.


Expand Down Expand Up @@ -342,7 +342,7 @@ POST file-path-test/_analyze


It's also useful to be able to filter with file paths when combined with other
types of searches, such as this example looking for any files paths with `16`
types of searches, such as this example looking for any files paths with `16`
that also must be in Alice's photo directory.

[source,console]
8 changes: 1 addition & 7 deletions docs/reference/modules/threadpool.asciidoc
@@ -13,16 +13,10 @@ There are several thread pools, but the important ones include:

[[search-threadpool]]
`search`::
For coordination of count/search operations at the shard level whose computation
is offloaded to the search_worker thread pool. Used also by fetch and other search
For count/search operations at the shard level. Used also by fetch and other search
related operations. Thread pool type is `fixed` with a size of `int((`<<node.processors,
`# of allocated processors`>>`pass:[ * ]3) / 2) + 1`, and queue_size of `1000`.
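
As a quick check of the sizing formula above, the following Python sketch evaluates `int((# of allocated processors * 3) / 2) + 1` for a few processor counts. The function name `search_pool_size` is hypothetical and exists only for this illustration.

```python
def search_pool_size(allocated_processors):
    """Evaluate int((allocated_processors * 3) / 2) + 1 from the formula above."""
    # Integer division mirrors the int(...) truncation for positive values.
    return (allocated_processors * 3) // 2 + 1


for processors in (1, 4, 8):
    print(processors, "->", search_pool_size(processors))
```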

`search_worker`::
For the heavy workload of count/search operations that may be executed concurrently
across segments within the same shard when possible. Thread pool type is `fixed`
with a size of `int((`<<node.processors, `# of allocated processors`>>`pass:[ * ]3) / 2) + 1`, and unbounded queue_size .

[[search-throttled]]`search_throttled`::
For count/search/suggest/get operations on `search_throttled indices`.
Thread pool type is `fixed` with a size of `1`, and queue_size of `100`.
80 changes: 76 additions & 4 deletions docs/reference/query-dsl/intervals-query.asciidoc
@@ -73,7 +73,9 @@ Valid rules include:
* <<intervals-match,`match`>>
* <<intervals-prefix,`prefix`>>
* <<intervals-wildcard,`wildcard`>>
* <<intervals-regexp,`regexp`>>
* <<intervals-fuzzy,`fuzzy`>>
* <<intervals-range,`range`>>
* <<intervals-all_of,`all_of`>>
* <<intervals-any_of,`any_of`>>
--
@@ -122,8 +124,9 @@ unstemmed ones.
==== `prefix` rule parameters

The `prefix` rule matches terms that start with a specified set of characters.
This prefix can expand to match at most 128 terms. If the prefix matches more
than 128 terms, {es} returns an error. You can use the
This prefix can expand to match at most `indices.query.bool.max_clause_count`
<<search-settings,search setting>> terms. If the prefix matches more terms,
{es} returns an error. You can use the
<<index-prefixes,`index-prefixes`>> option in the field mapping to avoid this
limit.

@@ -149,7 +152,8 @@ separate `analyzer` is specified.
==== `wildcard` rule parameters

The `wildcard` rule matches terms using a wildcard pattern. This pattern can
expand to match at most 128 terms. If the pattern matches more than 128 terms,
expand to match at most `indices.query.bool.max_clause_count`
<<search-settings,search setting>> terms. If the pattern matches more terms,
{es} returns an error.

`pattern`::
@@ -178,12 +182,45 @@ The `pattern` is normalized using the search analyzer from this field, unless
`analyzer` is specified separately.
--

[[intervals-regexp]]
==== `regexp` rule parameters

The `regexp` rule matches terms using a regular expression pattern.
This pattern can expand to match at most `indices.query.bool.max_clause_count`
<<search-settings,search setting>> terms.
If the pattern matches more terms, {es} returns an error.

`pattern`::
(Required, string) Regexp pattern used to find matching terms.
For a list of operators supported by the
`regexp` pattern, see <<regexp-syntax, Regular expression syntax>>.

WARNING: Avoid using wildcard patterns, such as `.*` or `.*?+`. This can
increase the iterations needed to find matching terms and slow search
performance.
--
`analyzer`::
(Optional, string) <<analysis, analyzer>> used to normalize the `pattern`.
Defaults to the top-level `<field>`'s analyzer.

--
`use_field`::
+
--
(Optional, string) If specified, match intervals from this field rather than the
top-level `<field>`.

The `pattern` is normalized using the search analyzer from this field, unless
`analyzer` is specified separately.
--
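
The expansion limit described for the `regexp` rule can be pictured with a toy Python model. This is only a sketch: the real expansion happens inside Lucene against the index's term dictionary, Python's `re` syntax is not the same as the regular expression syntax {es} supports, and the names `expand_regexp`, `terms`, and the `max_clause_count` argument (standing in for the `indices.query.bool.max_clause_count` setting) are assumptions for illustration.

```python
import re


def expand_regexp(pattern, terms, max_clause_count):
    """Toy stand-in for term expansion; real expansion happens inside Lucene."""
    compiled = re.compile(pattern)
    # Collect every distinct term that the whole pattern matches.
    matches = [t for t in sorted(set(terms)) if compiled.fullmatch(t)]
    # Mimic the documented behavior: too many matching terms is an error.
    if len(matches) > max_clause_count:
        raise ValueError(
            f"pattern expands to {len(matches)} terms, limit is {max_clause_count}"
        )
    return matches


# Hypothetical term dictionary for one field.
terms = ["photo", "photos", "phone", "video"]
print(expand_regexp("pho.*", terms, max_clause_count=3))
```

A broad pattern such as `.*` would match every term in this toy dictionary, which is the kind of expansion the warning above advises against.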

[[intervals-fuzzy]]
==== `fuzzy` rule parameters

The `fuzzy` rule matches terms that are similar to the provided term, within an
edit distance defined by <<fuzziness>>. If the fuzzy expansion matches more than
128 terms, {es} returns an error.
`indices.query.bool.max_clause_count`
<<search-settings,search setting>> terms, {es} returns an error.

`term`::
(Required, string) The term to match
@@ -214,6 +251,41 @@ The `term` is normalized using the search analyzer from this field, unless
`analyzer` is specified separately.
--

[[intervals-range]]
==== `range` rule parameters

The `range` rule matches terms contained within a provided range.
This range can expand to match at most `indices.query.bool.max_clause_count`
<<search-settings,search setting>> terms.
If the range matches more terms, {es} returns an error.

`gt`::
(Optional, string) Greater than: match terms greater than the provided term.

`gte`::
(Optional, string) Greater than or equal to: match terms greater than or
equal to the provided term.

`lt`::
(Optional, string) Less than: match terms less than the provided term.

`lte`::
(Optional, string) Less than or equal to: match terms less than or
equal to the provided term.

NOTE: You must provide one of `gt` or `gte`, and one of `lt` or `lte`.


`analyzer`::
(Optional, string) <<analysis, analyzer>> used to normalize the `pattern`.
Defaults to the top-level `<field>`'s analyzer.

`use_field`::
(Optional, string) If specified, match intervals from this field rather than the
top-level `<field>`.
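
The following Python sketch shows one way a client might assemble a `range` rule from the parameters listed above and wrap it in an `intervals` query body. The helper name `range_rule` and the field name `my_text` are hypothetical. The check treats the note as requiring exactly one lower bound and exactly one upper bound, which is this sketch's reading of the note rather than confirmed server behavior.

```python
import json


def range_rule(gt=None, gte=None, lt=None, lte=None, analyzer=None, use_field=None):
    """Build a `range` rule dict from the documented parameters."""
    # Assumption: exactly one of gt/gte and exactly one of lt/lte is expected.
    if (gt is None) == (gte is None):
        raise ValueError("provide exactly one of gt or gte")
    if (lt is None) == (lte is None):
        raise ValueError("provide exactly one of lt or lte")
    candidates = {
        "gt": gt, "gte": gte, "lt": lt, "lte": lte,
        "analyzer": analyzer, "use_field": use_field,
    }
    # Keep only the parameters the caller actually supplied.
    return {"range": {k: v for k, v in candidates.items() if v is not None}}


# `my_text` is a hypothetical top-level field name.
body = {"query": {"intervals": {"my_text": range_rule(gte="alpha", lt="beta")}}}
print(json.dumps(body, indent=2))
```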


[[intervals-all_of]]
==== `all_of` rule parameters

2 changes: 1 addition & 1 deletion docs/reference/search/profile.asciidoc
@@ -1298,7 +1298,7 @@ One of the `dfs.knn` sections for a shard looks like the following:
"query" : [
{
"type" : "DocAndScoreQuery",
"description" : "DocAndScore[100]",
"description" : "DocAndScoreQuery[0,...][0.008961825,...],0.008961825",
"time_in_nanos" : 444414,
"breakdown" : {
"set_min_competitive_score_count" : 0,