[DOCS] Documents that embeddings are filtered out from semantic_text results #113701

Draft
wants to merge 15 commits into
base: main
94 changes: 94 additions & 0 deletions docs/reference/mapping/fields/source-field.asciidoc
@@ -122,3 +122,97 @@ GET logs/_search

<1> These fields will be removed from the stored `_source` field.
<2> We can still search on this field, even though it is not in the stored `_source`.


[[filter-vectors]]
==== Filtering vectors from `_source`

The `include_vectors` parameter enables you to include or exclude embeddings for `sparse_vector`, `dense_vector`, and `semantic_text` fields from `_source` when it's displayed in API results.

The default behavior varies by field type:

* Embeddings in `sparse_vector` and `dense_vector` fields are included by default. Set `include_vectors` to `false` to exclude them.
* Embeddings in `semantic_text` fields are excluded by default. Set `include_vectors` to `true` to include them.
* For `update_by_query` and `reindex` operations, embeddings in `semantic_text` fields are included by default. Set `include_vectors` to `false` to exclude them.


The `include_vectors` parameter can be used as follows:

[source,console]
--------------------------------------------------
GET my-index-000001/_search
{
"_source": { "include_vectors": true },
"query": {
"match": {
"content": "Test Data"
}
}
}
--------------------------------------------------
// TEST[skip: TBD]

When the parameter is `true`, the vector fields appear in the results:

[source,console-result]
--------------------------------------------------
"_source": {
"data_value": 15,
"semantic_text_field": {
"text": "Test Data",
"inference": {
"inference_id": "my-elser-model",
"model_settings": {
"task_type": "sparse_embedding"
},
"chunks": [
{
"text": "Test Data",
"embeddings": {
"test": 2.7982168,
"data": 2.5768325,
"testing": 1.6320102,
(...)
}
}
]
}
},
"sparse_vector": {
"Test": 1,
"Data": 2
},
"content": "Test Data",
"dense_vector": [
0.5,
10
]
}
--------------------------------------------------
// NOTCONSOLE
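
To exclude embeddings explicitly, the same search can set `include_vectors` to `false`. The following is a minimal sketch that assumes the parameter accepts `false` in the same position as in the request above; the index name and query are reused from that example, and the `TEST` marker simply mirrors the one used there:

[source,console]
--------------------------------------------------
GET my-index-000001/_search
{
  "_source": { "include_vectors": false },
  "query": {
    "match": {
      "content": "Test Data"
    }
  }
}
--------------------------------------------------
// TEST[skip: TBD]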

When the parameter is `false`, the vector fields are omitted from the results:

[source,console-result]
--------------------------------------------------
(...)
"_source": {
"data_value": 15,
"semantic_text_field": {
"text": "Test Data",
"inference": {
"inference_id": "my-elser-model",
"model_settings": {
"task_type": "sparse_embedding"
},
"chunks": [
{
"text": "Test Data"
}
]
}
},
"content": "Test Data"
}
--------------------------------------------------
// NOTCONSOLE
5 changes: 1 addition & 4 deletions docs/reference/mapping/types/semantic-text.asciidoc
@@ -114,10 +114,7 @@ Once a document is ingested, a `semantic_text` field will have the following str
},
"chunks": [ <4>
{
- "text": "these are not the droids you're looking for",
- "embeddings": {
- (...)
- }
+ "text": "these are not the droids you're looking for"
}
]
}
@@ -201,10 +201,7 @@ query from the `semantic-embedding` index:
},
"chunks": [
{
- "text": "There are a few foods and food groups that will help to fight inflammation and delayed onset muscle soreness (both things that are inevitable after a long, hard workout) when you incorporate them into your postworkout eats, whether immediately after your run or at a meal later in the day. Advertisement. Advertisement.",
- "embeddings": {
- (...)
- }
+ "text": "There are a few foods and food groups that will help to fight inflammation and delayed onset muscle soreness (both things that are inevitable after a long, hard workout) when you incorporate them into your postworkout eats, whether immediately after your run or at a meal later in the day. Advertisement. Advertisement."
}
]
}
@@ -226,10 +223,7 @@ query from the `semantic-embedding` index:
},
"chunks": [
{
- "text": "During Your Workout. There are a few things you can do during your workout to help prevent muscle injury and soreness. According to personal trainer and writer for Iron Magazine, Marc David, doing warm-ups and cool-downs between sets can help keep muscle soreness to a minimum.",
- "embeddings": {
- (...)
- }
+ "text": "During Your Workout. There are a few things you can do during your workout to help prevent muscle injury and soreness. According to personal trainer and writer for Iron Magazine, Marc David, doing warm-ups and cool-downs between sets can help keep muscle soreness to a minimum."
}
]
}
@@ -251,10 +245,7 @@ query from the `semantic-embedding` index:
},
"chunks": [
{
- "text": "This is especially important if the soreness is due to a weightlifting routine. For this time period, do not exert more than around 50% of the level of effort (weight, distance and speed) that caused the muscle groups to be sore.",
- "embeddings": {
- (...)
- }
+ "text": "This is especially important if the soreness is due to a weightlifting routine. For this time period, do not exert more than around 50% of the level of effort (weight, distance and speed) that caused the muscle groups to be sore."
}
]
}