Suggestion: Validators per field name #220

Closed
MikePaquette opened this issue Dec 4, 2018 · 3 comments
Labels
question Further information is requested

Comments

@MikePaquette
Contributor

Could we define optional “validators” per field name? Basically, a regex that determines whether a field's values match expectations, with an error recorded when validation fails?
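As a rough illustration of the suggestion, here is a minimal Python sketch of optional per-field regex validators. The `FIELD_VALIDATORS` table, the patterns in it, and the `validate_event` helper are assumptions made up for this example; none of them are part of ECS or its tooling, and events are assumed to be flat dicts keyed by dotted field names.

```python
import re

# Hypothetical validator table: field name -> regex the value must fully match.
# The patterns are illustrative assumptions, not definitions taken from ECS.
FIELD_VALIDATORS = {
    "event.kind": re.compile(r"[a-z_]+"),
    "source.port": re.compile(r"\d{1,5}"),
    "http.request.method": re.compile(r"[A-Za-z]+"),
}


def validate_event(event):
    """Return a list of error strings for fields whose values fail their regex."""
    errors = []
    for field, pattern in FIELD_VALIDATORS.items():
        if field not in event:
            continue  # validators are optional: an absent field is not an error
        value = str(event[field])
        if pattern.fullmatch(value) is None:
            errors.append(f"{field}: value {value!r} does not match {pattern.pattern}")
    return errors


if __name__ == "__main__":
    # "Alert" contains an uppercase letter, so only event.kind is reported.
    print(validate_event({"event.kind": "Alert", "source.port": 443}))
```

Fields without an entry in the table are simply not checked, which keeps the validators optional as the question proposes.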

No. 3 of 16. This question was asked by a new ECS user who is familiar with mapping IT events to data models and use cases in other schemas. These questions are being posted as GitHub issues because a) they may offer valuable insights, and b) we expect that many new users will have similar questions.

@MikePaquette
Contributor Author

This seems like a good idea.
Producing ECS-compliant events may require one or more transformations, which can be performed at the shipper, ETL, or ingest node. With validation regexes, it would be possible to implement validators at one or more of these transformation points.

@webmat
Contributor

webmat commented Dec 10, 2018

I've been thinking about this problem quite a bit. We certainly could implement a validator that's meant to be put inside the pipeline, and either 1) add details about problems observed to the events, then let them through, or 2) reject the events altogether, if the user prefers this.
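The two behaviours described above (annotate and pass through, or reject) could be sketched roughly as follows in Python. Everything here is hypothetical: the `find_problems` check, the `validation.errors` field name, and the `mode` parameter are assumptions for illustration, not an existing pipeline feature or an ECS field.

```python
import re

# Hypothetical single check used to keep the sketch short.
PORT_RE = re.compile(r"\d{1,5}")


def find_problems(event):
    """Return a list of problem descriptions for one flat event dict."""
    problems = []
    port = event.get("source.port")
    if port is not None and PORT_RE.fullmatch(str(port)) is None:
        problems.append(f"source.port: unexpected value {port!r}")
    return problems


def apply_validation(events, mode="annotate"):
    """Validate events, either annotating bad ones or dropping them.

    mode="annotate": bad events pass through with a 'validation.errors' field
                     (a made-up field name, not part of ECS).
    mode="reject":   bad events are dropped.
    """
    if mode not in ("annotate", "reject"):
        raise ValueError(f"unknown mode: {mode!r}")
    output = []
    for event in events:
        problems = find_problems(event)
        if not problems:
            output.append(event)
        elif mode == "annotate":
            output.append({**event, "validation.errors": problems})
        # mode == "reject": the event is skipped
    return output
```

Calling `apply_validation(events, mode="reject")` returns only the events that passed, while the default mode returns every event and leaves the decision to whoever reads the annotations.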

However, with the number of fields in ECS, performing these checks at ingestion time may be very expensive.

The approach I've been thinking about is rather to have an "evaluator" tool -- probably in Kibana -- that can be launched at an index pattern, to evaluate its compliance with ECS. Here are some ideas of what this tool could detect:

  • how many deal breakers there are that would result in a field mapping exception (e.g. a field mapped as an object where ECS expects a string)
  • how many non-ECS fields there are (not a big problem, but useful to review in case they have an ECS equivalent)
  • how many fields conform to ECS in terms of name & type
  • how many fields also contain data in the expected format (e.g. are lowercased fields actually lowercase?)
  • whether any reserved field names are being used
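
To make the first three checks in the list above concrete, here is a minimal Python sketch of such an evaluator working on a flattened mapping (field name to type). The `ECS_TYPES` table is a tiny illustrative subset typed in by hand, and `evaluate_mapping` is a hypothetical helper; a real tool would presumably load the full field definitions for a given ECS version and read the mapping from the cluster.

```python
from collections import Counter

# Tiny illustrative subset of ECS field types, written out by hand for this
# sketch. A real evaluator would load the complete, versioned definitions.
ECS_TYPES = {
    "@timestamp": "date",
    "host.name": "keyword",
    "source.ip": "ip",
    "source.port": "long",
}


def evaluate_mapping(index_types):
    """Classify each field of a flattened mapping ({field: type}).

    Returns counts under three hypothetical labels:
      conforming    - ECS field name with the expected type
      type_conflict - ECS field name with a different type (a deal breaker)
      non_ecs       - field name not in the ECS table (custom field)
    """
    report = Counter()
    for field, actual_type in index_types.items():
        expected_type = ECS_TYPES.get(field)
        if expected_type is None:
            report["non_ecs"] += 1
        elif actual_type == expected_type:
            report["conforming"] += 1
        else:
            report["type_conflict"] += 1
    return dict(report)
```

Checking the format of the data itself (for example, whether lowercased fields really are lowercase) would additionally need sample documents, which this sketch leaves out.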

This would be a great approach for preparing a migration to ECS, one index pattern at a time. In the future, this tool could also help evaluate the migration from ECS 1.0 to the next version. Finally, this tool would be useful at all times, to guide the work on any new data source.

This approach does not help prevent "bad data" from getting into an index that is meant to conform to ECS. However, one of the tenets of ECS is that customers should be able to add fields "around" the official ECS fields, to fully capture their use case. So having "non-ECS fields" is not a problem per se. I'm not sure I would advocate that anyone enforce strict adherence to ECS, at least not in the general case.

@ebeahan ebeahan self-assigned this Nov 16, 2021
@ebeahan
Member

ebeahan commented Nov 16, 2021

Closing this out while revisiting issues that haven't been active for some time. However, I also wanted to leave some current context on the state of validating the "ECS-ness" of events.

> The approach I've been thinking about is rather to have an "evaluator" tool -- probably in Kibana -- that can be launched at an index pattern, to evaluate its compliance with ECS.

Checking an index or document to ensure it's ECS-compliant is still a sound, valuable idea that the team is tracking. Expect an updated issue with a more detailed plan coming soon! 😄

I'll also note there are now other options:

  • A continually growing number of pre-built Elastic integrations that produce ECS-compliant events "out-of-the-box".
  • Features added (since this issue was opened) to the ECS tooling to help users manage their index templates: including custom fields, limiting fields, or leveraging a specific ECS version.
  • The idea of developing custom detection rules with the Elastic security solution to detect ECS non-compliant data: https://www.elastic.co/blog/validating-elastic-common-schema-fields-using-detection-rules.

@ebeahan ebeahan closed this as completed Nov 16, 2021
3 participants