-
Notifications
You must be signed in to change notification settings - Fork 431
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Additions to "file" #441
Additions to "file" #441
Conversation
BOOM! |
schemas/file.yml
Outdated
Only relevant when `file.type` is "file". | ||
example: 16384 | ||
|
||
- name: crtime |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is pretty cryptic for ECS standard I would say? What about created
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It could also be creationtime
to keep in line with the *time
naming schema we have going on.
The stat
output calls it Birth
(though always empty on Linux since it does not expose the creation time) - but I don't think that's a good name.
It's maybe unfortunate we already have ctime
and mtime
which are UNIX names that Windows does not have. ctime
could be confused as the creation time. If we ever want to make a change we could go for more descriptive names, e.g. created|modified|changed|accessed
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I was specifically pondering this, as I was working on this PR.
The file
field set didn't get much attention before 1.0.0, unfortunately. But I'm not a fan of the existing field names ctime
and mtime
, because they go against one of the principles we're aiming for with ECS. We try to avoid contractions and acronyms, unless the convention is overwhelmingly in favour of them (e.g. ip
, pid
).
Seeing that we already had both of these fields, I went with the other 2 standard field names. I went for internal consistency in file
. But this is getting on my nerves even more, as we discuss this 😂
As @cwurm points out, ctime
vs crtime
are confusing. Any article about Linux filesystem timestamps (debugfs, stat) mention this common confusion.
So I'm very open to revisiting these field names. It's too late for now to change ctime
and mtime
, although we could create 4 new fields with less ambiguous names and mark the first two as deprecated for the life of ECS 1.x, and remove them in 2.0.
Proposed names:
file.created
: creation timefile.accessed
: last access timefile.modified
(deprecatesfile.mtime
): last modification time for the file contentfile.metadata_modified
(deprecatesfile.ctime
): last modification time for the file metadata.
I'm not a big fan of this last one. Ideas welcome.
For now we can push this naming decision to a bit later, focus on the two new fields, and make a note to deprecate mtime/ctime.
schemas/file.yml
Outdated
|
||
- name: device | ||
level: extended | ||
type: keyword | ||
description: Device that is the source of the file. | ||
example: sda | ||
|
||
- name: system |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering if this belongs here. file
is information about a specific file. I see how it could fit it in but stumbled over it.
The other part I would not call it system
as I expect us to have a system object in the future related to system metrics. system
already exists today in Metricbeat. Alternative names?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
file.filesystem
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm pretty strongly in favour of adding this field even though it's not information directly about a given file. It's still useful metadata about the place where a file lives.
I wanted to point to host
as very general metadata about an event, but host
is a field set, not a field inside one specific field set.
Perhaps we could do the same with filesystem information, and introduce a volume
field set?
Volume fields:
- name
- filesystem
- size (bytes)
- used (bytes)
- letter (Windows drive letter)
- kind (ssd, spinning rust)
- type (platform specific types, such as AWS' gp2, io1 etc)
- id
- serial_number
...
schemas/file.yml
Outdated
description: > | ||
Last time the file attributes or metadata changed. | ||
|
||
- name: atime |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess these are the "standard" names that is why we use them here?
For now I'll take out I'll open a PR right away for the 2 timestamps with clearer names. I'd like to introduce the other 2 timestamps sooner than later and deprecate the two original ones. But this is less pressing. We just need to introduce the deprecation well in advance of ECS 2.0, which is quite a ways off in the future :-) I'll make a note for the |
Thanks for the feedback! I've taken out the new timestamps as well as Please have a quick look, to see if this PR is ready. New, more readable timestamps proposed in #445, and #444 opens the discussion for |
schemas/file.yml
Outdated
|
||
- name: ctime | ||
level: extended | ||
type: date | ||
description: Last time file metadata changed. | ||
description: Last time the file attributes or metadata changed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On Linux and macOS, ctime
will also be updated when the file contents change (together with mtime
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've pushed a documentation tweak about this. I find it my edit a bit clumsy, and I'm not 100% convinced this belongs in ECS.
In a sense this helps explain the relationship between two ECS fields, so I'm ok with it going in.
On the other hand, I feel like we're getting too close to documenting OS quirks & the like, which can be different from one platform to the next.
But in this case I'm ok with this addition. I'll merge tomorrow, unless there's disagreement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cwurm Any other objections about this PR? Does this documentation update work for you?
It's not the best prose, but we can always tweak improve it later if needed :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cwurm Any other objections about this PR? Does this documentation update work for you?
Yes, LGTM
On the other hand, I feel like we're getting too close to documenting OS quirks & the like, which can be different from one platform to the next.
True, though I'm not sure we can avoid it entirely. Only UNIX has atime/mtime/ctime/crtime, Windows has only lpCreationTime/lpLastAccessTime/lpLastWriteTime (and lpLastAccessTime is not tracked by default).
So for Windows then, is lpLastWriteTime == mtime
(modified
in a future version?), with ctime
empty (later changed
?)? Or is lpLastWriteTime == mtime == ctime
? The Auditbeat FIM module must be implementing this in some way, but I don't know how off the top of my head.
We had a situation where - mtime was the Content change, and - ctime was the Metadata change. This adjustment's goal is to avoid having people remember these fields with an incorrect mnemonic, where mtime=Metadata & ctime=Content. This is why I'm downplaying the usage of the word "metadata".
This reverts commit 117691c.
This reverts commit dc6b3c5.
This reverts commit 1022e82.
This reverts commit f122f6b.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@webmat General recommendation: Don't mix different things into PR which will get them merged much more quickly. The additions to file belong together, but all the changes to the docs I would consider not related. If there would have been 2 PR's the additional fields could have gone in much more quickly as an example.
@ruflin Agreed. I was just trying to streamline my time, before the month of May, where I have 3/4 weeks not on ECS :-) |
This PR addresses/improves a few things in the the "file" field set.
If any of these turns out controversial / complicated, I'll be happy to extract them to a separate PR. The goal here is to get the simple things in.
file.name
(See add file.name #322)file.directory
Closes #322