-
Notifications
You must be signed in to change notification settings - Fork 431
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding support in for custom analyzers for text and overriding the default template settings for index templates #1737
Conversation
@ebeahan @djptek I added support for users to add in their search I went with choice 1, but it is definitely the most complex. Do you both agree with that approach, or is there something else I should do / am not thinking of? |
Would you provide some more details? I'm uncertain what exactly needs handling here. If the user wants to declare a custom analyzer on a field, the index settings specify an analyzer. But if the user doesn't define an analyzer, that settings would be omitted, right? |
I just wasn't sure how much 'control' ECS wanted to exert here. They 'key' part IMO was passing the analyzer along with the field, which we do. It's just a matter of how we handle the settings piece of the analyzer and if we want to add it for them or not or whatever.
Correct. Essentially I was trying to figure out how we handle that definition of the analyzer in the settings, because we have a handful of directions we can go in. I went with adding an empty definition so that the user can fill it in with whatever definition they see fit. But of course if there are no analyzers, nothing will be added. On the other end of the spectrum, we could add nothing into the settings and let the user populate it themselves. Does that help? |
I think I better understand. IMO, our tooling should support passing through whatever settings the user provides. I'll walk through a hypothetical scenario using this feature. Share if you see this coming together differently. Using the {
"order": 1,
"index_patterns": [
"your-ecs-*"
],
"settings": {
"index": {
"mapping": {
"total_fields": {
"limit": "10000"
}
},
"analysis": {
"analyzer": {
"domain_analyzer": {
"type": "pattern",
"pattern": "\\."
}
}
},
"refresh_interval": "5s"
}
}
} The user also declares the analyzer on their field definitions. Since we've moved all text fields to - name: acme
title: ACME
description: >
Acme Inc. custom fields
type: group
fields:
- name: account.domain
type: text
level: custom
analzyer: domain_analyzer
description: >
Customer account description for this activity. If a user wants a built-in analyzer, they can skip the first step and define it on the field: fields:
- name: account.domain
type: text
level: custom
analyzer: whitespace
description: >
Customer account description for this activity. User runs the generator, and the (v1) template is created: {
"index_patterns": [
"your-ecs-*"
],
"mappings": {
"date_detection": false,
"dynamic_templates": [
{
"strings_as_keyword": {
"mapping": {
"ignore_above": 1024,
"type": "keyword"
},
"match_mapping_type": "string"
}
}
],
"properties": {
"@timestamp": {
"type": "date"
},
"acme": {
"properties": {
"account": {
"properties": {
"domain": {
"analyzer": "domain_analyzer",
"type": "text"
},
"id": {
"ignore_above": 1024,
"type": "keyword"
}
}
}
}
}
}
},
"order": 1,
"settings": {
"index": {
"analysis": {
"analyzer": {
"domain_analyzer": {
"pattern": "\\.",
"type": "pattern"
}
}
},
"mapping": {
"total_fields": {
"limit": "10000"
}
},
"refresh_interval": "5s"
}
}
} If the resulting template has a config error, I think it's okay to rely on ES to throw an exception when it's uploaded. If I typoed {
"error": {
"root_cause": [
{
"type": "illegal_argument_exception",
"reason": "analyzer [domain_analyzer] must specify either an analyzer type, or a tokenizer"
}
],
"type": "illegal_argument_exception",
"reason": "analyzer [domain_analyzer] must specify either an analyzer type, or a tokenizer"
},
"status": 400
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generator implementation looks good. Do you see any opportunities to cover the different template generations in the unit tests?
I had some thoughts about the proposed naming.
I love your suggestions for updated naming, I just updated all of that. As far as unit tests, I did not. But I will take a closer look. |
@ebeahan tests added! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Other than the USAGE.md
suggestion, everything else is a nit or something we can handle outside this PR.
@@ -0,0 +1,16 @@ | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll handle it in a separate PR, but the usage-example
generated files should be updated to match the new dir structure.
@@ -186,7 +166,7 @@ def entry_for(field): | |||
if mf_type == 'keyword': | |||
ecs_helpers.dict_copy_existing_keys(mf, mf_entry, ['normalizer', 'ignore_above']) | |||
elif mf_type == 'text': | |||
ecs_helpers.dict_copy_existing_keys(mf, mf_entry, ['norms']) | |||
ecs_helpers.dict_copy_existing_keys(mf, mf_entry, ['norms', 'analyzer']) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out-of-scope here, just leaving for future consideration:
#1264 only proposes adding analyzer
support to multi-fields, but, in the near future, I think we could also support passing analyzer
through for all text
fields.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense to me
Co-authored-by: Eric Beahan <[email protected]>
…fault template settings for index templates (elastic#1737)
…fault template settings for index templates (elastic#1737) # Conflicts: # scripts/generators/es_template.py
…fault template settings for index templates (elastic#1737) # Conflicts: # CHANGELOG.next.md # scripts/generators/es_template.py
Resolves #1264
Resolves #1203
This PR encompasses two smaller tooling improvements:
Change 2 led to a change in the arguments for the generator. Instead of just
--template-settings
, we will now have--template-settings-legacy
and--template-settings
. This will be a breaking change since--template-settings
will now refer to v2 index templates.