Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add type to package definition #587

Merged
merged 6 commits into from
Nov 18, 2019
Merged

Conversation

simitt
Copy link
Contributor

@simitt simitt commented Oct 16, 2019

Stores information about the type of the installed package, e.g. rpm,
dpkg, etc.

related to #515

Follow up from #532 to focus on adding package type information.


Summary of discussions so far:

#532 (review)

@cwurm:
Thanks for starting this! Left a few comments.

Two fields I wish I had added for Auditbeat's system/package that could also be relevant here:

  1. Package type or origin. Values: RPM, DPKG, Homebrew, NPM, Rubygem, etc.
  2. Path (where the package is located): e.g. /usr/local/Cellar/go/1.10.3/

#532 (comment)

@simitt
@cwurm how about adding package.manager for this kind of information?

#532 (comment)

@cwurm:
I'm not sure I have a strong opinion, but I think package.type would be more general. Not all packages are managed by a package manager - e.g. macOS applications under /Applications or Windows programs under C:\ Program Files both of which can be just a download.

#532 (comment)

@simitt
@cwurm I gave the package type | origin | manager discussion another thought. Reading package.type I think of .deb , .rpm, .tgz and not of the way the package was installed. Maybe using package.origin or package.install_type (we already use package.install_scope, so that would fit) would be preferable here. WDYT?

#532 (comment)

@cwurm
I struggle with that - I'm not sure we could always fill it with a sensible value. What would the value for a Homebrew package be? When installed, it's just a folder in /usr/local/Cellar. Or a Windows program? It's just an .exe in C:\Program Files - but that was not how it was installed and it's probably impossible to know what the installation method was (could have been an .msi, or just copying the .exe).

Stores information about the type of the installed package, e.g. rpm,
dpkg, etc.

related to elastic#515
@webmat
Copy link
Contributor

webmat commented Oct 22, 2019

Should we even consider the .exe and .app software in this discussion? I agree we want to be able to track those & report on them. But software that's not installed via a package should probably not dictate how we design the package.* field set :-)

I'd like us to continue the discussion on file type vs package manager a bit. My understanding is that in most cases, package information is collected by querying the local package DB or API. So as much as I'd like to have the ability to see details about tgz packages, I'm not sure we can collect that (short of scanning the disk)?

Also, I'd like it if we established more firmly what information is expected in the field. I think depending on the package type, people will vary between putting the file format vs the package manager name. So the more extensive our list of expected values, the less variance there will be.

Here's the packages I have in mind. Please contribute additional packages for other languages or platforms, and comment on what you think would make most sense in package.type.

File Type Package Manager Value in package.type
rpm rpm rpm
rpm yum rpm
dpkg apt dpkg
deb apt deb
? homebrew homebrew
egg pip egg
gem rubygems gem
dep dep (Golang) dep
npm npm npm
tar N/A

What are the Java, .Net and PHP packages?

cc @cwurm

@simitt
Copy link
Contributor Author

simitt commented Oct 29, 2019

@webmat I agree that we need to be more clear about which information we expect in this field. My main concern I tried to raise is that with the suggested approach we are mixing file type and origin or installation type, depending on what we can collect.

From the very beginning of this discussions:

Package type or origin. Values: RPM, DPKG, Homebrew, NPM, Rubygem, etc.

Seeing something like homebrew or rubygem as package.type would be confusing to me. If we want to collect this information then I think it should be stored in its own field. But as also pointed out not every package is installed by a package manager, maybe some more generic field name like package.installation_type would be more suitable for that.
So I guess my suggestion at this point is to introduce two fields: package.type and package.installation_type with basically the distinction you showed in your table.
I am not aware of how this information can be collected by different beats and apm agents so if one of the suggested fields can never be filled, we should not introduce it now.

@cwurm I have to admit I got a bit lost in this discussion, and would highly appreciate your input on @webmat 's and my suggestions.

@simitt
Copy link
Contributor Author

simitt commented Nov 11, 2019

@cwurm could you please give this a review?

Copy link
Contributor

@webmat webmat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looping back on this, I'd be in favour of progress over perfection. I think we can orient the field description to be around the file type, rather than package manager.

So: gem not rubygems, egg not pip, and rpm not yum.

If "homebrew" or "brew" shows up in there, I'm not terribly worried. It's an in-between in the sense that it's straightforward to gather that programmatically (that's the "packages" Auditbeat currently captures on MacOS), yet there isn't an actual "package" file per se, as the underlying distribution mechanism is Git.

I don't think we need to be explicit in the field descriptions around the subtleties we've discussed here. If confusion arises we can clarify later. Also if we find a need to specifically support the package manager or method of distribution, we can tackle that as another PR, with another field.

Final changes I see:

  • Let's add a changelog for this field addition
  • Adjust the description a bit, see other comment

Type of package.

When installed from a package manager, this would contain the package manager name, e.g. RPM, DPKG, Homebrew, NPM.
example: RPM
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In order to guide in the right direction without getting into all of the subtleties discussed above, I think the following would work well enough:

      description: >
        Type of package.

        This should contain the package file type, rather than the package
        manager name. Examples: rpm, dpkg, brew, npm, gem.
      example: rpm

Note also that I switched all examples to lowercase. This aligns better with the rest of ECS.

@SergeyKleyman
Copy link

@webmat

I think we can orient the field description to be around the file type

Shouldn't we name the field file_type instead of type if this is its intended meaning?

@simitt
Copy link
Contributor Author

simitt commented Nov 18, 2019

I think we can orient the field description to be around the file type

Shouldn't we name the field file_type instead of type if this is its intended meaning?

IMO package type should reflect the file type of the package rather than the install manager, so from my perspective naming is fine now.

I added the latest suggestions, from my perspective this PR is final.

@SergeyKleyman
Copy link

Just to make sure I understand correctly how APM agents should fill type field: for example both .NET NuGet and Java Maven repositories provide packages in zip file format with predefined directory structure inside a package but they don't use .zip as package file name extension - they use .nupkg and .jar respectively. My question is - should Java agent put zip or jar in type field?

@webmat
Copy link
Contributor

webmat commented Nov 18, 2019

@SergeyKleyman It's meant to be the file extension (without leading dot), not the binary file format. So in the examples you give, "nupkg" and "jar" should be the values.

@webmat
Copy link
Contributor

webmat commented Nov 18, 2019

@simitt Could you update the example and lowercase the rpm please? We'll be good to merge after that, I think :-)

@@ -81,5 +81,5 @@
Type of package.

This should contain the package file type, rather than the package manager name.
Examples: rpm, dpkg, brew, npm, gem.
example: RPM
Examples: rpm, dpkg, brew, npm, gem, nupkg, jar.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❤️ the added examples, thx!

@webmat webmat merged commit 1de8293 into elastic:master Nov 18, 2019
dcode pushed a commit to dcode/ecs that referenced this pull request Apr 15, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants