Created the Vulnerability Schema #581
Thanks a lot for opening this, @peasead!
Is it fair to say that ingesting one scan/report should generate multiple events that populate `vulnerability.*`? In other words, each event captures one finding?
If so, should we have a keyword field to associate each event with a report/scan ID?
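To make the question concrete, here is a minimal sketch of the "one event per finding" idea. The input report shape (`id`, `findings`, `cve`) and the helper name `findings_to_events` are hypothetical, and `report_id` is only a candidate name for the keyword field being asked about; the IDs are placeholders.

```python
from typing import Any


def findings_to_events(report: dict[str, Any]) -> list[dict[str, Any]]:
    """Turn one scan report into one event per finding.

    Assumes a hypothetical report shape: {"id": ..., "findings": [{...}, ...]}.
    """
    events = []
    for finding in report["findings"]:
        events.append(
            {
                "vulnerability": {
                    # Same report ID on every event, so findings from one
                    # scan can be grouped again at query time.
                    "report_id": report["id"],
                    "id": finding.get("cve"),
                    "category": finding.get("category"),
                }
            }
        )
    return events


# Placeholder report with two findings.
report = {
    "id": "scan-0001",
    "findings": [
        {"cve": "CVE-2019-0001", "category": "Firewall"},
        {"cve": "CVE-2019-0002", "category": "Database"},
    ],
}
events = findings_to_events(report)
```

Each resulting event carries exactly one finding, and the shared report ID is the only link back to the scan.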
I had over-filtered my GitHub notifications. I'm reviewing these comments now. Sorry for the delay.
It could make sense to have an arbitrary ID field. While many of the findings have CVE IDs, not all do. There are other vulnerability identifiers that show up every now and then. Here's an example I pulled from Snyk
On that note as well, being able to link to one or more reference URLs would be helpful. In the case of CVE IDs you could easily just link to MITRE or NVD. Other identifiers may be less obvious and it's possible some identifiers won't even have a central location.
Should we ditch the CVE/CVSS specifics as well and leave it as:
vulnerability.score.base
vulnerability.classification e.g. CVSS
I see this just closed but the only additional comment I have is that most vulnerability scan vendors include a unique vulnerability id for their scanner. For example Qualys has a |
Good idea: `vulnerability.scanner.id`, or we could use the observer fields to detail the scan vendor, etc.
Yeah I like the idea of supporting an additional arbitrary ID. We need to find a good name for that, though. I don't think |
Awesome ideas, @dainperkins and @stiltz. I took those changes and updated the PR.
Am I reading this correctly that the vulnerability fields end up as items nested under host/source/destination/observer, etc?
@dainperkins The top level is |
generated/csv/fields.csv showed vulnerability as both top level and nested, e.g. client.vulnerability... I wasn't sure if that was on purpose, an option, or a mistake. vulnerability.name: Flash buffer overflow vs. It could just be that I was thinking about it differently, but for reporting vulnerabilities I was thinking the vulnerability info would all be top level, with the addition of a [host] array or similar to define the vulnerability to vulnerable system mappings.
@dainperkins No, just a matter of re-running |
Here's a much more detailed review, thanks for working on this @peasead :-)
- As we already discussed, please run `make` prior to each commit, so generated files are updated.
- Please add a changelog entry to `CHANGELOG.next.md`. You can get inspiration from `CHANGELOG.md`.
- Please break off long definitions at around 80-90 chars. Double return will result in new paragraphs, but single return will keep the paragraph together in the various generated artifacts.
See also a lot of small adjustments provided as comments below.
I really like where this is already, content is top notch! My comments are for minor adjustments mostly.
schemas/vulnerability.yml (Outdated)
example: CVE
- name: url
Please rename this field to `reference`. We're in the process of establishing a pattern where reference url fields are always named `.reference` :-)
So far: `threat.tactic.reference`, `threat.technique.reference`, and soon `package.reference`
Done.
🧐 you renamed the field above 🙂
Here's what I meant:
- `vulnerability.url` should become `vulnerability.reference`
- The field you have renamed used to be named `vulnerability.enumeration`. I'm ok with keeping that original field name, as long as the description is fleshed out a little, like described here.
Sorry about that. I've made the adjustments.
How do we expect people to use the If the idea is indeed to import scan findings, should we explore support for two related kinds of documents?
Or are we good with just documents per findings + the report ID? |
A few more things to address, see review comments below.
I would also like clarifications on a few questions:
- @peasead do you think we need to be able to ingest a document representing the report itself, in addition to each finding? Or are we good with the report details just being referenced via `vulnerability.report_id`?
- Do we need `vulnerability.status`? See longer version of the question here
- There's a few fields that I would like to know whether we should expect a single value or an array of values, please let me know your thinking about each of them:
  - `vulnerability.category`
  - `vulnerability.scanner.id`
Additional note on the array question above. Even if some scanners make a decision to pick the "best match" (or highest severity) when assigning a category or vuln ID to a finding, that's not the standard we have to match. What we have to cater to is if any scanner can associate multiple categories or return multiple CVE IDs for a finding, then we should make these field arrays.
Note that there's no direct support for arrays in ECS right now, but it's coming. So if any of those should be arrays, we would simply mention it in the description, and give an example that's actually an array (`example: '[CVE-2019-0001]'`).
I think that the
I think that this is probably scanner/UI dependent and if we removed it, we'd be okay. To wit, I'm not sure other scanners beyond Qualys track this information, so it may be very scanner dependent. I'll remove this.
I think
While I agree that we don't want to be in the business of assigning a category, we're collecting the data that is being presented by the scanner. So if the scanner says "Firewall", we should take the observer at its word vs. trying to maintain a matrix.
By additional entry here, do you mean a distinct field? In other words, would they have a "main category" and a list of other applicable categories?
I think my comment was misunderstood there. I don't want to override or map anything. What I'm saying is much more simple: if any scanner is going to give us an array of categories or multiple CVEs for a finding, I want our field to be an array that can take it all in, as is. This would also mean that in most cases when we have a single category/CVE id, we'd have arrays of 1 item, but that's fine.
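As an illustration of "take it all in, as is", an ingest step could wrap whatever the scanner emits into a list without changing the values. This is only a sketch; `as_array` is a hypothetical helper, not part of ECS or any scanner API.

```python
def as_array(value):
    """Return the scanner-provided value as a list, without remapping it.

    None becomes an empty list, a single value becomes a 1-item list,
    and an existing list or tuple is passed through as a list.
    """
    if value is None:
        return []
    if isinstance(value, (list, tuple)):
        return list(value)
    return [value]


single = as_array("Firewall")
several = as_array(["Windows", "Firewall"])
missing = as_array(None)
```

With this shape, the common single-category case simply yields an array of 1 item, as described above.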
I'll take care of the CHANGELOG.next.md conflict at merge time, btw. No need to worry about that. If you fix now, it may complain again after the next PR gets merged 🙂
2 more things I noticed. Sorry for the back & forth here ;-)
And we still need to close on whether `.category` and `.id` should be array fields.
short: Fields to describe the vulnerabilty relevant to an event.
description: >
  The vulnerability fields describe information about a vulnerabilty that is
  relevant to an event.
Typo first "vulnerabilty" => "vulnerability", then a more philosophical point:
The description of the field set makes it sound like users are expected to enrich "normal" events with vulnerability information. Is that our expectation?
My expectation would be more along the lines of users directly importing scan findings. In other words, this isn't another kind of event being enriched with this info. The event/document coming in is literally one finding from the report, which is getting imported for correlation in the SIEM.
If you agree with this, we should address this directly. Perhaps:
short: Fields to describe a vulnerability finding.
description: >
The vulnerability fields describe information about a vulnerability that was
reported by a scan.
Fixed typo.
I think adjusting this description makes sense. That said, just as some context: #581 (comment)
To address your question on how the
In the second part of your question regarding documents per findings and report ID, I think that having the report ID is sufficient to guide the analyst to a place to get additional information i.e. the scanner vs. importing the entire scan report. I see a lot of products in the intelligence space that attempt to replicate "all the things" instead of providing a launch point for analysts.
Maybe we could adjust `vulnerability.reference` to be a link to the scanner report instead of an external reference? <-- I like this idea.
Just my $0.02 from some time in the field.
> 2 more things I noticed. Sorry for the back & forth here ;-)
> And we still need to close on whether `.category` and `.id` should be array fields.
I don't think they should be arrays.
`id`, the actual CVE #, is assigned 1 at a time (https://cve.mitre.org/about/faqs.html#what_is_cve). I wasn't able to find a vulnerability with more than 1 CVE, so I think the `id` wouldn't be an array field. I downloaded the entire 2019 CVE database and all of the vulnerabilities only had 1 CVE (id).
`category` isn't a MITRE created field (like CVE is). I looked at the reference that I'd used in the field-set (from Qualys - https://qualysguard.qualys.com/qwebhelp/fo_portal/knowledgebase/vulnerability_categories.htm) and it looks like everything is 1:1, so I don't know that we'd need an array here either. That said, is there any harm in making `category` an array to err on the side of caution?
Open to discuss, obviously, if I'm misunderstanding.
Thanks for the detailed thoughts, @peasead!
> In the second part of your question regarding documents per findings and report ID, I think that having the report ID is sufficient to guide the analyst to a place to get additional information i.e. the scanner vs. importing the entire scan report. I see a lot of products in the intelligence space that attempt to replicate "all the things" instead of providing a launch point for analysts.
- I'm also good on only capturing the report ID and not capturing "all the things". I just wanted to make sure this was also how you saw it. Looks like we're on the same page. 👍
> Maybe we could adjust vulnerability.reference to be a link to the scanner report instead of an external reference? <-- I like this idea.
- I think we have a need for both URLs.
  - For now I would say `.reference` should remain a URL that points to general documentation about a thing, in this case the CVE. This is the semantics we've been establishing, so far for "reference URL" fields.
  - I also really love the idea of capturing the URL an analyst would want to visit, in order to view the actual finding in their vulnerability scanner. It's another type of URL we will need in a few places (e.g. endpoint alerts), but we don't have a good name for that field yet. So IMO we leave this out of this PR, and add it as a follow-up PR.
On category and ID as arrays:
My thinking there isn't whether a published vulnerability could have more than one ID or category necessarily. It's rather "can one scanner finding correspond to more than one CVE", or could it have more than one category (orthogonal ones: "Windows" & "Firewall", or a wide problem: "SUSE" + "Debian" + "RHEL" ).
I do think it would be better to go on the safe side here, and the effort is pretty trivial. Elasticsearch actually doesn't care if docs contain a single value vs an array. The reason in ECS we try to flesh this out more explicitly is that the pipelines handling the events or generated language libs like ecs-dotnet have to know, in order to do the right thing.
We could even promote lightweight semantics, where if a scanner has such a thing as "main category" and "other categories", the main category should be first in the array.
To close on this, I'm 99% sure it's useful for category; for ID I'm not sure yet, I'm starting to think not. Perhaps the way scanners are structured means each CVE has its own detection, and a given finding will always only have one CVE? In other words the same piece of software would simply have multiple findings (one per CVE) if applicable...
See PR comment, for a concrete suggestion on what we can do to make category into an array.
schemas/vulnerability.yml (Outdated)
or Firewall).
For example (https://qualysguard.qualys.com/qwebhelp/fo_portal/knowledgebase/vulnerability_categories.htm)
example: Firewall
To make this an array, here's what I have in mind. Feel free to adjust:
description: >
The type of system or architecture that the vulnerability affects. These may be
platform-specific (for example, Debian or SUSE) or general (for example, Database
or Firewall).
For example (https://qualysguard.qualys.com/qwebhelp/fo_portal/knowledgebase/vulnerability_categories.htm)
This field must be an array.
example: '["Firewall"]'
I don't know if we need to expand on main category being first + other categories coming next. We can add that later, if there's confusion.
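If that ordering convention were ever adopted, a producer could build the array along these lines. This is a sketch only; `order_categories` is a hypothetical helper and nothing in the schema requires it.

```python
def order_categories(main, others=()):
    """Build a category array with the scanner's main category first.

    Duplicates are dropped while keeping first-seen order.
    """
    ordered = [main]
    for category in others:
        if category not in ordered:
            ordered.append(category)
    return ordered


categories = order_categories("Windows", ["Firewall", "Windows"])
only_main = order_categories("Firewall")
```

A scanner that reports a single category still ends up with a 1-item array, matching the `'["Firewall"]'` example in the suggested description.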
> - I'm also good on only capturing the report ID and not capturing "all the things". I just wanted to make sure this was also how you saw it. Looks like we're on the same page. 👍
Roger roger.
> Maybe we could adjust vulnerability.reference to be a link to the scanner report instead of an external reference? <-- I like this idea.
> I think we have a need for both URLs.
> - For now I would say `.reference` should remain a URL that points to general documentation about a thing, in this case the CVE. This is the semantics we've been establishing, so far for "reference URL" fields.
> - I also really love the idea of capturing the URL an analyst would want to visit, in order to view the actual finding in their vulnerability scanner. It's another type of URL we will need in a few places (e.g. endpoint alerts), but we don't have a good name for that field yet. So IMO we leave this out of this PR, and add it as a follow-up PR.
The more I thought about this, the more I was coming to this conclusion as well, and to the thought of `scanner.reference`; but I agree we can follow up with an enhancement to the schema when we've had more time for it to bake.
> On category and ID as arrays:
> My thinking there isn't whether a published vulnerability could have more than one ID or category necessarily. It's rather "can one scanner finding correspond to more than one CVE", or could it have more than one category (orthogonal ones: "Windows" & "Firewall", or a wide problem: "SUSE" + "Debian" + "RHEL" ).
> I do think it would be better to go on the safe side here, and the effort is pretty trivial. Elasticsearch actually doesn't care if docs contain a single value vs an array. The reason in ECS we try to flesh this out more explicitly is that the pipelines handling the events or generated language libs like ecs-dotnet have to know, in order to do the right thing.
> We could even promote lightweight semantics, where if a scanner has such a thing as "main category" and "other categories", the main category should be first in the array.
> To close on this, I'm 99% sure it's useful for category; for ID I'm not sure yet, I'm starting to think not. Perhaps the way scanners are structured means each CVE has its own detection, and a given finding will always only have one CVE? In other words the same piece of software would simply have multiple findings (one per CVE) if applicable...
> See PR comment, for a concrete suggestion on what we can do to make category into an array.
I like this and will make the adjustment declaring that this must be an array and formatting the example properly for `.category`.
LGTM thanks for all of the adjustments!
The CLA check is failing because git was inadvertently set with a local email. I'm merging anyway, as Andrew has signed the CLA and adjusted his git config with a correct email address.
The email that will show in the merge commit has been used to sign the CLA.
The thought is that this would be used for vulnerability scanners or associated projects (Qualys, Nexpose, Nessus, OpenVAS, VulnWhisperer, etc.) so they can send their data to Elasticsearch with ECS.
Some suggested fields were from #113
Fields created for the `vulnerability-*` schema: