Event categorisation fields #242
fields.yml
Outdated
This gives information about what type of information the event
contains, without being specific to the contents of the event. Examples
are `event`, `state`, `alarm`.
Instead of `event`, perhaps we should use `log` here? For me all 3 kinds are an `event` (based on ECS).
@cwurm also mentioned `action` here as an option.
`log` is not great, btw: in Auditbeat we have events like "process start" that we get from a netlink socket, not from a "log".
Yes, I agree "event" is better than "log" for this.
I think the definition should be a bit more forceful in that this field must contain one of these values. The goal here is super broad categorization of how this document/event came about. So the values have to be predictable:
- event: just reporting something that was observed in real-time (e.g. syslog message, log entry, event on a bus such as ebpf).
- state: generating an update on the state of something on a regular schedule (list of running processes, installed packages, current CPU usage).
  - So for example, metrics from Metricbeat should all use `event.kind:state`.
- alarm: alerting on the fact that something unwanted happened, or a state changed enough to warrant attention from someone. Could be triggered by a static threshold, ML, a rules engine, etc.
What we want to avoid here is people mapping arbitrary values from their event sources to this field, so that we end up with lots of unpredictable values in `event.kind`.
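The "must contain one of these values" idea could be checked mechanically before a document is indexed. A minimal sketch, assuming the allowed list is exactly the three values proposed in this comment (it is not a finalized ECS list); `ALLOWED_EVENT_KINDS` and `validate_event_kind` are hypothetical names, not part of ECS or Beats:

```python
# Hypothetical helper: the allowed set mirrors the three values proposed in
# this thread and is NOT a finalized ECS list.
ALLOWED_EVENT_KINDS = frozenset({"event", "state", "alarm"})


def validate_event_kind(doc: dict) -> str:
    """Return the document's event.kind, or raise ValueError if it is not allowed."""
    kind = doc.get("event", {}).get("kind")
    if kind not in ALLOWED_EVENT_KINDS:
        raise ValueError(
            f"event.kind must be one of {sorted(ALLOWED_EVENT_KINDS)}, got {kind!r}"
        )
    return kind


# Illustrative document: a periodic metric sample tagged as a state update.
metric_doc = {"event": {"kind": "state"}}
print(validate_event_kind(metric_doc))
```

Rejecting unknown values outright, rather than silently passing them through, is one way to keep the field predictable; a source that does not fit would then surface as an error and could be raised as an issue.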
Of course the corollary is that if people determine that their stuff doesn't fall into one of these 3, we want them to open a GH issue to discuss it, and potentially grow that list.
I added a warning like for the others. I'm not sure we want to define `event`, `state`, `alarm`, etc. just yet; there will be more than that and we'd do a poor job if we tried to do it by the freeze date.
I think the defined list of values should later go into a different document. Potentially we'll have one doc per use case, and the values for each use case might be different.
`log` for me does not mean it must come from a file. I'm thinking more of the historical definition: "an official record of events during the voyage of a ship or aircraft".
> @cwurm Also mentioned `action` here as an option.

I think it was `state_restart`. I wanted a way to distinguish between a state that is triggered periodically (e.g. every 1h) and one triggered by a restart of (in this case) Auditbeat. One could imagine other reasons it was sent (e.g. a threshold was breached, or too many changes accumulated). If we had something like an `event.trigger` field that captures why a document was sent, that would also work, but maybe that's one field too many as well.
All in all, I'd like to get this in.
However, I think we need to clearly warn people that a few of these fields will soon have prescribed content, and therefore not to use them (or to use them at their own risk). `event.type` is pretty clear on this already. Here are the other ones:
- `event.kind`
- `event.category`
- `event.outcome`
If we don't do this and people start using them widely, then suddenly coming up with the list of prescribed values is a breaking change.
.gitignore
Outdated
@@ -1,3 +1,4 @@
.DS_Store
*.pyc
env
*.swp
LOL I have this in my global gitignore, so it never came up.
Please make it `*.sw?`, though 🙏🏼
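For context on why `*.sw?` is the broader pattern: when a `.swp` file already exists, Vim falls back to `.swo`, `.swn`, and so on, and `?` matches any single character. A rough check using Python's `fnmatch`, which only approximates gitignore matching; the file names are made up for illustration:

```python
from fnmatch import fnmatchcase

# Made-up file names: three Vim swap files plus one unrelated file.
names = [".fields.yml.swp", ".fields.yml.swo", ".fields.yml.swn", "fields.yml"]

for pattern in ("*.swp", "*.sw?"):
    # Collect the names each glob pattern would cover.
    matched = [n for n in names if fnmatchcase(n, pattern)]
    print(pattern, "->", matched)
```

Under these assumptions, `*.swp` covers only the first swap file, while `*.sw?` covers all three and still leaves `fields.yml` alone.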
fields.yml
This contains high-level information about the contents of the event. It
is more generic than `event.action`, in the sense that typically a
category contains multiple actions.
example: user-management
The plan with this field is to have a predetermined set of categories (a few tens of them). Having the field now in ECS sets us up for a sort of breaking change when we come up with that list.
Therefore I think the description can remain as is, but we should add a warning telling people that they're using this field at their own risk, because it will soon be a field that should be populated with prescribed values.
It won't be the end of the world, though. It's not going to break ingestion or anything. It's just that Elastic solutions will expect this field to contain certain values, so if some sources populate this differently, they won't have the best experience...
warning added.
- name: action
- name: outcome
`outcome` is also meant to be a field with prescribed content. We should add a warning similar to the one I suggest for `event.category`. More or less: "We're about to come up with a list of prescribed values, so use with caution."
warning added.
3161db9 to 5d372fb
@webmat I think I addressed your comments, please have a look again.
LGTM
Please wait for approval by @MikePaquette and @ruflin before merging. This is a big one.
fields.yml
Outdated
@@ -396,35 +396,64 @@
Unique ID to describe the event.
example: 8a4f500d

- name: kind
  level: core
Sorry, only spotted now but these fields should go into extended for now.
I moved `event.kind` and `event.outcome` to extended.
I'll be proposing pushing these to core sometime in the future :-)
@tsg I think you need to run
@ruflin, thanks, I pushed again.
LGTM
This PR is a minimal change to the current ECS fields that lets us follow the planned path for categorization without requiring breaking changes later on. The hope is that we'll get this in for Beta2. It does the following:
- Changes `event.action` and `event.category` to be closer to what we are planning, but keeps their definition fairly abstract. The docs don't go as far as showing a list of examples.
- Changes the `event.type` definition to be only reserved for now.
- Adds `event.kind` and `event.outcome`, which are fairly non-contentious. I can remove them from the PR if needed to get this in before Beta2.

I plan to follow up with a more complete PR that adds more details.