Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding component template substitutions to the simulate ingest API #113276

Conversation

masseyke
Copy link
Member

@masseyke masseyke commented Sep 20, 2024

This adds the ability to pass substitute component template definitions to the simulate ingest API. These definitions can be used to change the pipeline(s) used for ingest, or to change the mappings used for validation. Here is an example that shows both of those usages working together:

## First, add a couple of pipelines that set a field to a boolean:
PUT /_ingest/pipeline/foo-pipeline?pretty
{
  "processors": [
    {
      "set": {
        "field": "foo",
        "value": true
      }
    }
  ]
}

PUT /_ingest/pipeline/bar-pipeline?pretty
{
  "processors": [
    {
      "set": {
        "field": "bar",
        "value": true
      }
    }
  ]
}

## Now, create two component templates. One provides a mapping enforces that the only field is "foo"
## and that field is a keyword. The other provides a setting that makes "foo-pipeline" the default pipeline.
## Remember that the "foo-pipeline" sets the "foo" field to a boolean, so using both of these templates
## together would cause a validation exception. These could be in the same template, but are provided
## separately just so that later we can show how multiple templates can be overridden.
PUT _component_template/mappings_template
{
  "template": {
    "mappings": {
      "dynamic": "strict",
      "properties": {
        "foo": {
          "type": "keyword"
        }
      }
    }
  }
}

PUT _component_template/settings_template
{
  "template": {
    "settings": {
      "index": {
        "default_pipeline": "foo-pipeline"
      }
    }
  }
}

## Here we create an index template  pulling in both of the component templates above
PUT _index_template/template_1
{
  "index_patterns": ["foo*"],
  "composed_of": ["mappings_template", "settings_template"]
}

## We can index a document here to create the index, or not. Either way the simulate call ought to work the same
POST foo-1/_doc
{
  "foo": "FOO"
}

## This will not blow up with validation exceptions because the substitute "mappings_template" sets
## dynamic to true so that the introduction of the "bar" field doesn't cause problems.
## And the bar-pipeline is executed rather than the foo-pipeline because of the substitute 
## "settings_template", so the value of "foo" does not get set to an invalid type.
POST _ingest/_simulate?pretty&index=foo-1
{
  "docs": [
    {
      "_id": "asdf",
      "_source": {
        "foo": "foo",
        "bar": "bar"
      }
    }
  ],
  "component_template_substitutions": {
    "mappings_template": {
      "template": {
        "mappings": {
          "dynamic": "strict",
          "properties": {
            "foo": {
              "type": "keyword"
            },
            "bar": {
              "type": "boolean"
            }
          }
        }
      }
    },
    "settings_template": {
      "template": {
        "settings": {
          "index": {
            "default_pipeline": "bar-pipeline"
          }
        }
      }
    }
  }
}

This builds on the work done in #112957 and #113063.

@masseyke masseyke added >enhancement :Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP v8.16.0 v9.0.0 labels Sep 20, 2024
Copy link
Contributor

Documentation preview:

@elasticsearchmachine
Copy link
Collaborator

Hi @masseyke, I've created a changelog YAML for you.

@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

@elasticsearchmachine elasticsearchmachine added the Team:Data Management Meta label for data/management team label Sep 20, 2024
Copy link
Member

@dakrone dakrone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@masseyke masseyke merged commit cd950bb into elastic:main Sep 25, 2024
15 checks passed
@masseyke masseyke deleted the component-template-substitutions-for-simulate-ingest branch September 25, 2024 20:30
@elasticsearchmachine
Copy link
Collaborator

💚 Backport successful

Status Branch Result
8.x

masseyke added a commit to masseyke/elasticsearch that referenced this pull request Sep 25, 2024
elasticsearchmachine pushed a commit that referenced this pull request Oct 8, 2024
This adds support for a new `index_template_substitutions` field to the
body of an ingest simulate API request. These substitutions can be used
to change the pipeline(s) used for ingest, or to change the mappings
used for validation. It is similar to the
`component_template_substitutions` added in #113276. Here is an example
that shows both of those usages working together:

```
## First, add a couple of pipelines that set a field to a boolean:
PUT /_ingest/pipeline/foo-pipeline?pretty
{
  "processors": [
    {
      "set": {
        "field": "foo",
        "value": true
      }
    }
  ]
}

PUT /_ingest/pipeline/bar-pipeline?pretty
{
  "processors": [
    {
      "set": {
        "field": "bar",
        "value": true
      }
    }
  ]
}

## Now, create three component templates. One provides a mapping enforces that the only field is "foo"
## and that field is a keyword. The next is similar, but adds a `bar` field. The final one provides a setting
## that makes "foo-pipeline" the default pipeline.
## Remember that the "foo-pipeline" sets the "foo" field to a boolean, so using both of these templates
## together would cause a validation exception. These could be in the same template, but are provided
## separately just so that later we can show how multiple templates can be overridden.
PUT _component_template/mappings_template
{
  "template": {
    "mappings": {
      "dynamic": "strict",
      "properties": {
        "foo": {
          "type": "keyword"
        }
      }
    }
  }
}

PUT _component_template/mappings_template_with_bar
{
    "template": {
      "mappings": {
        "dynamic": "strict",
        "properties": {
          "foo": {
            "type": "keyword"
          },
          "bar": {
            "type": "boolean"
          }
        }
      }
    }
}

PUT _component_template/settings_template
{
  "template": {
    "settings": {
      "index": {
        "default_pipeline": "foo-pipeline"
      }
    }
  }
}

## Here we create an index template  pulling in both of the component templates above
PUT _index_template/template_1
{
  "index_patterns": ["foo*"],
  "composed_of": ["mappings_template", "settings_template"]
}

## We can index a document here to create the index, or not. Either way the simulate call ought to work the same
POST foo-1/_doc
{
  "foo": "FOO"
}

## This will not blow up with validation exceptions because the substitute "index_template_substitutions"
## uses `mappings_template_with_bar`, which adds the bar field.
## And the bar-pipeline is executed rather than the foo-pipeline because the substitute
## "index_template_substitutions" uses a substitute `settings_template`, so the value of "foo"
## does not get set to an invalid type.
POST _ingest/_simulate?pretty&index=foo-1
{
  "docs": [
    {
      "_id": "asdf",
      "_source": {
        "foo": "foo",
        "bar": "bar"
      }
    }
  ],
  "component_template_substitutions": {
    "settings_template": {
      "template": {
        "settings": {
          "index": {
            "default_pipeline": "bar-pipeline"
          }
        }
      }
    }
  },
  "index_template_substitutions": {
    "template_1": {
      "index_patterns": ["foo*"],
      "composed_of": ["mappings_template_with_bar", "settings_template"]
    }
  }
}
```
masseyke added a commit to masseyke/elasticsearch that referenced this pull request Oct 8, 2024
…ic#114128)

This adds support for a new `index_template_substitutions` field to the
body of an ingest simulate API request. These substitutions can be used
to change the pipeline(s) used for ingest, or to change the mappings
used for validation. It is similar to the
`component_template_substitutions` added in elastic#113276. Here is an example
that shows both of those usages working together:

```
## First, add a couple of pipelines that set a field to a boolean:
PUT /_ingest/pipeline/foo-pipeline?pretty
{
  "processors": [
    {
      "set": {
        "field": "foo",
        "value": true
      }
    }
  ]
}

PUT /_ingest/pipeline/bar-pipeline?pretty
{
  "processors": [
    {
      "set": {
        "field": "bar",
        "value": true
      }
    }
  ]
}

## Now, create three component templates. One provides a mapping enforces that the only field is "foo"
## and that field is a keyword. The next is similar, but adds a `bar` field. The final one provides a setting
## that makes "foo-pipeline" the default pipeline.
## Remember that the "foo-pipeline" sets the "foo" field to a boolean, so using both of these templates
## together would cause a validation exception. These could be in the same template, but are provided
## separately just so that later we can show how multiple templates can be overridden.
PUT _component_template/mappings_template
{
  "template": {
    "mappings": {
      "dynamic": "strict",
      "properties": {
        "foo": {
          "type": "keyword"
        }
      }
    }
  }
}

PUT _component_template/mappings_template_with_bar
{
    "template": {
      "mappings": {
        "dynamic": "strict",
        "properties": {
          "foo": {
            "type": "keyword"
          },
          "bar": {
            "type": "boolean"
          }
        }
      }
    }
}

PUT _component_template/settings_template
{
  "template": {
    "settings": {
      "index": {
        "default_pipeline": "foo-pipeline"
      }
    }
  }
}

## Here we create an index template  pulling in both of the component templates above
PUT _index_template/template_1
{
  "index_patterns": ["foo*"],
  "composed_of": ["mappings_template", "settings_template"]
}

## We can index a document here to create the index, or not. Either way the simulate call ought to work the same
POST foo-1/_doc
{
  "foo": "FOO"
}

## This will not blow up with validation exceptions because the substitute "index_template_substitutions"
## uses `mappings_template_with_bar`, which adds the bar field.
## And the bar-pipeline is executed rather than the foo-pipeline because the substitute
## "index_template_substitutions" uses a substitute `settings_template`, so the value of "foo"
## does not get set to an invalid type.
POST _ingest/_simulate?pretty&index=foo-1
{
  "docs": [
    {
      "_id": "asdf",
      "_source": {
        "foo": "foo",
        "bar": "bar"
      }
    }
  ],
  "component_template_substitutions": {
    "settings_template": {
      "template": {
        "settings": {
          "index": {
            "default_pipeline": "bar-pipeline"
          }
        }
      }
    }
  },
  "index_template_substitutions": {
    "template_1": {
      "index_patterns": ["foo*"],
      "composed_of": ["mappings_template_with_bar", "settings_template"]
    }
  }
}
```
elasticsearchmachine pushed a commit that referenced this pull request Oct 9, 2024
…) (#114374)

This adds support for a new `index_template_substitutions` field to the
body of an ingest simulate API request. These substitutions can be used
to change the pipeline(s) used for ingest, or to change the mappings
used for validation. It is similar to the
`component_template_substitutions` added in #113276. Here is an example
that shows both of those usages working together:

```
## First, add a couple of pipelines that set a field to a boolean:
PUT /_ingest/pipeline/foo-pipeline?pretty
{
  "processors": [
    {
      "set": {
        "field": "foo",
        "value": true
      }
    }
  ]
}

PUT /_ingest/pipeline/bar-pipeline?pretty
{
  "processors": [
    {
      "set": {
        "field": "bar",
        "value": true
      }
    }
  ]
}

## Now, create three component templates. One provides a mapping enforces that the only field is "foo"
## and that field is a keyword. The next is similar, but adds a `bar` field. The final one provides a setting
## that makes "foo-pipeline" the default pipeline.
## Remember that the "foo-pipeline" sets the "foo" field to a boolean, so using both of these templates
## together would cause a validation exception. These could be in the same template, but are provided
## separately just so that later we can show how multiple templates can be overridden.
PUT _component_template/mappings_template
{
  "template": {
    "mappings": {
      "dynamic": "strict",
      "properties": {
        "foo": {
          "type": "keyword"
        }
      }
    }
  }
}

PUT _component_template/mappings_template_with_bar
{
    "template": {
      "mappings": {
        "dynamic": "strict",
        "properties": {
          "foo": {
            "type": "keyword"
          },
          "bar": {
            "type": "boolean"
          }
        }
      }
    }
}

PUT _component_template/settings_template
{
  "template": {
    "settings": {
      "index": {
        "default_pipeline": "foo-pipeline"
      }
    }
  }
}

## Here we create an index template  pulling in both of the component templates above
PUT _index_template/template_1
{
  "index_patterns": ["foo*"],
  "composed_of": ["mappings_template", "settings_template"]
}

## We can index a document here to create the index, or not. Either way the simulate call ought to work the same
POST foo-1/_doc
{
  "foo": "FOO"
}

## This will not blow up with validation exceptions because the substitute "index_template_substitutions"
## uses `mappings_template_with_bar`, which adds the bar field.
## And the bar-pipeline is executed rather than the foo-pipeline because the substitute
## "index_template_substitutions" uses a substitute `settings_template`, so the value of "foo"
## does not get set to an invalid type.
POST _ingest/_simulate?pretty&index=foo-1
{
  "docs": [
    {
      "_id": "asdf",
      "_source": {
        "foo": "foo",
        "bar": "bar"
      }
    }
  ],
  "component_template_substitutions": {
    "settings_template": {
      "template": {
        "settings": {
          "index": {
            "default_pipeline": "bar-pipeline"
          }
        }
      }
    }
  },
  "index_template_substitutions": {
    "template_1": {
      "index_patterns": ["foo*"],
      "composed_of": ["mappings_template_with_bar", "settings_template"]
    }
  }
}
```
matthewabbott pushed a commit to matthewabbott/elasticsearch that referenced this pull request Oct 10, 2024
…ic#114128)

This adds support for a new `index_template_substitutions` field to the
body of an ingest simulate API request. These substitutions can be used
to change the pipeline(s) used for ingest, or to change the mappings
used for validation. It is similar to the
`component_template_substitutions` added in elastic#113276. Here is an example
that shows both of those usages working together:

```
## First, add a couple of pipelines that set a field to a boolean:
PUT /_ingest/pipeline/foo-pipeline?pretty
{
  "processors": [
    {
      "set": {
        "field": "foo",
        "value": true
      }
    }
  ]
}

PUT /_ingest/pipeline/bar-pipeline?pretty
{
  "processors": [
    {
      "set": {
        "field": "bar",
        "value": true
      }
    }
  ]
}

## Now, create three component templates. One provides a mapping enforces that the only field is "foo"
## and that field is a keyword. The next is similar, but adds a `bar` field. The final one provides a setting
## that makes "foo-pipeline" the default pipeline.
## Remember that the "foo-pipeline" sets the "foo" field to a boolean, so using both of these templates
## together would cause a validation exception. These could be in the same template, but are provided
## separately just so that later we can show how multiple templates can be overridden.
PUT _component_template/mappings_template
{
  "template": {
    "mappings": {
      "dynamic": "strict",
      "properties": {
        "foo": {
          "type": "keyword"
        }
      }
    }
  }
}

PUT _component_template/mappings_template_with_bar
{
    "template": {
      "mappings": {
        "dynamic": "strict",
        "properties": {
          "foo": {
            "type": "keyword"
          },
          "bar": {
            "type": "boolean"
          }
        }
      }
    }
}

PUT _component_template/settings_template
{
  "template": {
    "settings": {
      "index": {
        "default_pipeline": "foo-pipeline"
      }
    }
  }
}

## Here we create an index template  pulling in both of the component templates above
PUT _index_template/template_1
{
  "index_patterns": ["foo*"],
  "composed_of": ["mappings_template", "settings_template"]
}

## We can index a document here to create the index, or not. Either way the simulate call ought to work the same
POST foo-1/_doc
{
  "foo": "FOO"
}

## This will not blow up with validation exceptions because the substitute "index_template_substitutions"
## uses `mappings_template_with_bar`, which adds the bar field.
## And the bar-pipeline is executed rather than the foo-pipeline because the substitute
## "index_template_substitutions" uses a substitute `settings_template`, so the value of "foo"
## does not get set to an invalid type.
POST _ingest/_simulate?pretty&index=foo-1
{
  "docs": [
    {
      "_id": "asdf",
      "_source": {
        "foo": "foo",
        "bar": "bar"
      }
    }
  ],
  "component_template_substitutions": {
    "settings_template": {
      "template": {
        "settings": {
          "index": {
            "default_pipeline": "bar-pipeline"
          }
        }
      }
    }
  },
  "index_template_substitutions": {
    "template_1": {
      "index_patterns": ["foo*"],
      "composed_of": ["mappings_template_with_bar", "settings_template"]
    }
  }
}
```
davidkyle pushed a commit to davidkyle/elasticsearch that referenced this pull request Oct 13, 2024
…ic#114128)

This adds support for a new `index_template_substitutions` field to the
body of an ingest simulate API request. These substitutions can be used
to change the pipeline(s) used for ingest, or to change the mappings
used for validation. It is similar to the
`component_template_substitutions` added in elastic#113276. Here is an example
that shows both of those usages working together:

```
## First, add a couple of pipelines that set a field to a boolean:
PUT /_ingest/pipeline/foo-pipeline?pretty
{
  "processors": [
    {
      "set": {
        "field": "foo",
        "value": true
      }
    }
  ]
}

PUT /_ingest/pipeline/bar-pipeline?pretty
{
  "processors": [
    {
      "set": {
        "field": "bar",
        "value": true
      }
    }
  ]
}

## Now, create three component templates. One provides a mapping enforces that the only field is "foo"
## and that field is a keyword. The next is similar, but adds a `bar` field. The final one provides a setting
## that makes "foo-pipeline" the default pipeline.
## Remember that the "foo-pipeline" sets the "foo" field to a boolean, so using both of these templates
## together would cause a validation exception. These could be in the same template, but are provided
## separately just so that later we can show how multiple templates can be overridden.
PUT _component_template/mappings_template
{
  "template": {
    "mappings": {
      "dynamic": "strict",
      "properties": {
        "foo": {
          "type": "keyword"
        }
      }
    }
  }
}

PUT _component_template/mappings_template_with_bar
{
    "template": {
      "mappings": {
        "dynamic": "strict",
        "properties": {
          "foo": {
            "type": "keyword"
          },
          "bar": {
            "type": "boolean"
          }
        }
      }
    }
}

PUT _component_template/settings_template
{
  "template": {
    "settings": {
      "index": {
        "default_pipeline": "foo-pipeline"
      }
    }
  }
}

## Here we create an index template  pulling in both of the component templates above
PUT _index_template/template_1
{
  "index_patterns": ["foo*"],
  "composed_of": ["mappings_template", "settings_template"]
}

## We can index a document here to create the index, or not. Either way the simulate call ought to work the same
POST foo-1/_doc
{
  "foo": "FOO"
}

## This will not blow up with validation exceptions because the substitute "index_template_substitutions"
## uses `mappings_template_with_bar`, which adds the bar field.
## And the bar-pipeline is executed rather than the foo-pipeline because the substitute
## "index_template_substitutions" uses a substitute `settings_template`, so the value of "foo"
## does not get set to an invalid type.
POST _ingest/_simulate?pretty&index=foo-1
{
  "docs": [
    {
      "_id": "asdf",
      "_source": {
        "foo": "foo",
        "bar": "bar"
      }
    }
  ],
  "component_template_substitutions": {
    "settings_template": {
      "template": {
        "settings": {
          "index": {
            "default_pipeline": "bar-pipeline"
          }
        }
      }
    }
  },
  "index_template_substitutions": {
    "template_1": {
      "index_patterns": ["foo*"],
      "composed_of": ["mappings_template_with_bar", "settings_template"]
    }
  }
}
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP >enhancement Team:Data Management Meta label for data/management team v8.16.0 v9.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants