-
Notifications
You must be signed in to change notification settings - Fork 431
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add 2 missing timestamps to file
#445
Conversation
schemas/file.yml
Outdated
- name: accessed | ||
level: extended | ||
type: date | ||
description: Last time file content accessed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if a file was renamed? Does this count? Does this correlate with touch foo.txt
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm, I guess we could detail which actions typically update which timestamp, on Windows and Linux platforms. But I'm sure there's edge cases.
Perhaps a more straightforward approach (and easier to maintain) would be to state which timestamps this maps to, for each platform family?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it different between platforms? Don't know much about the windows ones.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would say this should align with the last access time that the different platforms report natively? I assume that's what users would expect.
Linux and macOS track access times by default and show them in the output of e.g. stat
(stat -x
on macOS). Windows no longer tracks access time by default for performance reasons, but it can be enabled.
Maybe the description should say Last time file was accessed.
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've submitted a small tweak to the description, my initial sentence wasn't great :-)
But I feel like getting into the detail of what happens on renames, touch
& the like will not be helpful, and will be somewhat out of place.
I'd vote to leave as is, and update if there seems to be confusion / questions about this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the part I'm more concerned is how this competes / conflicts for example with ctime
. It's super easy to add new fields but will be very hard to remove fields again. So if we add a field, we should be very clear what it's used for and how.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I hadn't noticed this last comment, when I responded on the PR itself.
I'm not sure what you meant by "this" in "how this competes".
In #441, we're clarifying a quirk of ctime
vs mtime
, see this.
The cryptic crtime
often confuses people vs ctime
. This is why instead we're introducing created
, as the field for crtime
. So I think this is step 1 in clarifying these timestamps.
Step 2 of clarifying these timestamps will be getting rid of the already existing ctime
and mtime
. But as already discussed in my comment a few hours ago, this should be done separately. I want do have good support for deprecations.
Since we're improving the field descriptions over in #441 and introducing new ones here, let's pull them all together, to have a complete overview of the proposed changes:
- mtime: Last time the file content was modified.
- ctime: Last time the file attributes or metadata changed.
Note that changes to the file content will updatemtime
. This implies
ctime
will be adjusted at the same time, sincemtime
is an attribute
of the file. - created: File creation time.
Note that not all filesystems store the creation time. - accessed: Last time the file was accessed.
Note that not all filesystems keep track of access time.
Is there anything missing or unclear here?
If you're questioning whether they're worth adding at all, I do think so. Both are fundamental pieces of information with no overlap with the 2 timestamps we already have. The fact that not all filesystems / configurations record creation time or last access time for performance reasons is not a reason for not adding them, IMO. When they're available, I think they're useful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 on the introduction of created. For ctime / accessed I see some potential conflicts but I'm ok moving forward if it's clear to everyone else. You mention in your PR description, that some should be deprecated. If there is overlap, we should directly deprecate the old fields when a new replacement one is introduced.
@ruflin About the deprecation: yes, I'm not deprecating anything for now. The affected fields would be the already existing ones, and I'm not touching them here. I agree that we will need to introduce the 2 new fields for them at the same time we introduce the deprecation of the old ones. I'm keeping this for later, because I'd like to investigate the possibility of having specific attributes & handling for deprecations (rather than just editing the field description). This support for deprecation is out of scope for this PR. |
This is take 2 of something started in #441. We're departing from the internal
consistency of the 4 Linux file timestamp names (atime, ctime, crtime, mtime),
and going for clear and readable names.
Right now this is only introducing the 2 missing fields,
created
andaccessed
.I think it would make sense to rename the other two eventually. But since this would be a breaking
change, we'd have to deprecate them for a while & document how to handle deprecations in ECS, and so on. This can be done as a separate PR.
Adding these 2 fields unapologetically should be straightforward, though :-)
Closes #422
Note for posterity
The first 2 timestamp fields for
file
-- introduced with ECS 1.0 -- were named in a very Posix-centric way. This PR departs from this approach. We are considering deprecating the initial two fields eventually, in favour of more readable ones.ctime
ctime
is often confused withcrtime
mtime
modified
?created
crtime
accessed
atime