Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add extended_version to package definition #586

Merged
merged 4 commits into from
Nov 18, 2019

Conversation

simitt
Copy link
Contributor

@simitt simitt commented Oct 16, 2019

Allows to store additional version information for an installed
package.

related to #515

Follow up from #532 to focus on adding an additional version information.


Summary of previous discussions around this field:

#532 (review)

@andrewkroh:
Do you think there should be a release field for release number used in Redhat packaging? Or would that be too specific to one packaging type.

#532 (comment)

@webmat:
@andrewkroh Can you give an example of such a release string?

Is this meant to support their extra long version strings, where they mix the original package version + their backporting patches on top?

#532 (comment)

@andrewkroh:
2.6.32-754.el6

In this example 2.3.6 is the package version. and 754.el6 is the release. The version is generally the upstream software version (like the linux kernel version) and the release is the distribution specific value that gets rev'ed when a patch is backported.

#532 (comment)

@cwurm:
I personally think it would be too specific. Only RPM has it. DPKG has a similar one though, called revision (not parsed out by Auditbeat, but libpdkg can do it).

My preference would be to keep ECS simple and not include it at all, with users always welcome to add extra fields that are useful to them (but are not required or even expected). But if we want to have it, I'd rather have something that covers RPM, DPKG, and anything else (and maybe that's release, revision, or something else generic).

#532 (comment)

@webmat:
On the distro package versions, do rpm and dpkg give easy access to the origin/upstream version number as well? In other words, to get the 2.6.32 in Andrew's example, do we need to parse it out of the release version? Or is there another attribute that contains 2.6.32?

If extra work is required to parse out the origin version number, it will get nasty real quick. E.g. what if some origin packages contain - in their version numbers? I wouldn't be surprised at all 😂

So I'm wondering if we need to specify this at all, at this time. Perhaps just having package.version without specifics is enough for now? Platforms that have full release versions will get that in package.version, and platforms that only have "simple" versions will get that in package.version...

If on the other hand we can get the origin version easily from rpm/dpkg, I'd be open to discuss the addition of a second field, for distro's release version as well.

#532 (comment)

@simitt:
@andrewkroh have you seen the suggested detailed_version? I originally suggested it with the intention of having fine granular information for unreleased versions, but maybe this could serve both purposes.

#532 (comment)

@andrewkroh:
I think detailed_version could work for RHEL packaging assuming we document it should be formatted as %{VERSION}-%{RELEASE}. Like the value of rpm -qp --queryformat '%{VERSION}-%{RELEASE}' example.rpm. Then since we have a separate field for architecture I think this would cover the RPM details.

#532 (comment)

@webmat:
Looping back on the earlier discussion, I don't think this should have a dual purpose from the outset (build version vs distro release version). If we need to track 3 aspects of versions, then perhaps we'll need 3 fields in total.

For now, please remove detailed_version from this PR and re-submit a new PR just with this field. We'll continue the discussion there.

Allows to store additional version information for an installed
package.

related to elastic#515
Comment on lines 82 to 83
different package types. For example it can contain a release string for RHEL packages, or the commit SHA of
a non-released package.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think having one field for two different kinds of versions will be confusing.

I'm not sure how often a package will have both a build version and a release version, but it wouldn't surprise me that this is possible. Especially when we consider repos made available by third party organizations, making pre-release software available.

I think package.release_version would be very clear in its intent. The mere presence of this field may also help avoid having various event sources accidentally putting release versions in package.version.

If there's a need & a desire to also track the build version, I would be in favour of package.build_version as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you think about software dependencies in applications I think it is not that uncommon to pin some third party libraries to a specific commit. My original intend to add this field was to have a possibility to capture this scenario. If adding release_version and build_version this would go into the build_version.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@simitt Could you please clarify what do you mean by the following?

If adding release_version and build_version this would go into the build_version.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to say that if a third party library is pinned to a specific commit rather than to a release version, the specific commit hash should not be stored as release_version, but would be stored in the field we are discussing here (package.extended_version).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see and I agree :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@webmat are you suggesting to have package.version (already introduced with original package PR), package.release_version and package.build_version?

I still suggest we only add one more version information. As the concrete content can vary, I still think extended_version is a good fit.

Any additional thoughts @andrewkroh or @cwurm ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes my suggestion is indeed to add 2 more version fields, for a total of 3. I think it's important to avoid having two different possible meanings for one field.

  • This ensures analysis can be done on actual release versions such as RHEL's without them being polluted by commit IDs
  • This ensures analysis can be done on actual build IDs without them being polluted by weird release versions.
  • This finally ensures that edge cases such as perhaps pre-release or development packages (e.g. Fedora Core, or simply mainline build candidates anywhere) can record both the weird full release version and build version at the same time, if appropriate.

I think in many cases only one or two would be present, however, I agree with that part. But even in those cases, I think build_version and release_version will be much clearer in what they mean, than a more abstract extended_version.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@webmat Could you please clarify the difference between package.version and package.release_version?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will need the build_version for runtime dependencies in APM. I renamed extended_version to build_version.

Regarding release_version - as pointed out above, this seems to be rather specific. To avoid confusion between version and release_version, I suggest to not introduce at all for now, and only add it when there is a concrete use case. (Except in case this would be filled already by anyone? cc @andrewkroh ).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@SergeyKleyman Here's a comment where I gave a concrete example based on MySQL on CentOS #560 (comment)

@simitt Yes that makes sense. We can flesh it out separately, if/when the need arise. Thanks for renaming :-)

Copy link

@SergeyKleyman SergeyKleyman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@webmat webmat merged commit 0137c20 into elastic:master Nov 18, 2019
dcode pushed a commit to dcode/ecs that referenced this pull request Apr 15, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants