Introduce display_name for threat.indicator #1998

Open
maxcold opened this issue Jul 26, 2022 · 23 comments
Labels
enhancement New feature or request

Comments

@maxcold

maxcold commented Jul 26, 2022

Summary

Introduce Display Name for an Indicator of Compromise (IoC) in the Threat Intelligence part of ECS

Motivation:

In the Threat Intelligence capabilities of the Security Solution, our team is working on the data grid for IoCs (Indicators of Compromise), where we have an "Indicator" column that serves as a "Display name" for an indicator. The value of this column currently depends on the indicator type, and every type has its own logic. The best example is the File indicator: it can have different hashes, e.g. sha256, md5, etc. We take sha256 for the display name when available, if not then md5, and so on. For other types there is not much logic involved; we take different values from different threat.indicator.* attributes per type. The problem with this approach is that this Display Name is not available as an attribute on Elasticsearch documents coming from Threat Intelligence integrations, so users can't filter by it or perform the other standard operations available for existing ECS attributes. This can be solved partly with runtime fields, but we think adding Display Name to the schema might make sense and want to kick off the discussion about it.
Features dependent on having the display_name field:

  1. Filter in/out the Indicator of Compromise view by display_name
  2. Add Indicator display_name to a Timeline
  3. Create an Indicator of Compromise event renderer in Timelines
  4. Create a pre-built Timeline template for investigating IoCs

Detailed Design:

Introduce threat.indicator.display_name with the following logic per threat.indicator.type

file: 
threat.indicator.file.hash.* (sha256 | md5 | sha1 | sha224 | sha3-224 | sha3-256 | sha384 | sha3-384 | sha512 | sha3-512 | sha512/224 | sha512/256 | ssdeep | tlsh | impfuzzy | imphash | pehash | vhash)

url: 
threat.indicator.url.original

email-addr: 
threat.indicator.email.address

email (subj?): 
the suitable field is missing, map to _id?
 
email-message:
the suitable field is missing, map to _id?

domain-name:
threat.indicator.url.domain

domain:
threat.indicator.url.domain

ipv4-addr: 
threat.indicator.ip

ipv6-addr: 
threat.indicator.ip

x509-certificate:
threat.indicator.x509.serial_number

x509 Serial:
threat.indicator.x509.serial_number

windows-registry-key:
threat.indicator.registry.key

autonomous-system:
threat.indicator.as.number

mac-addr:
threat.indicator.mac

unknown:
map to _id

Data examples can be found in AbuseCH, Anomali, Cybersixgill, MISP, OTX, Recorded Future and ThreatQ integrations
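
The per-type logic above could be sketched roughly as follows. This is only an illustration of the proposed mapping, not code from any integration: the function name display_name, the helper _dig, and the fallback to _id when the mapped field is absent are assumptions, and the hash keys are copied from the list in this proposal rather than from the ECS schema.

```python
from typing import Any, Mapping, Optional, Sequence

# Hash algorithms in the order given in the proposal (sha256 first, then md5, ...).
HASH_PRIORITY = [
    "sha256", "md5", "sha1", "sha224", "sha3-224", "sha3-256", "sha384",
    "sha3-384", "sha512", "sha3-512", "sha512/224", "sha512/256",
    "ssdeep", "tlsh", "impfuzzy", "imphash", "pehash", "vhash",
]

# Indicator type -> path below threat.indicator that holds the display value.
FIELD_BY_TYPE = {
    "url": ("url", "original"),
    "email-addr": ("email", "address"),
    "domain-name": ("url", "domain"),
    "domain": ("url", "domain"),
    "ipv4-addr": ("ip",),
    "ipv6-addr": ("ip",),
    "x509-certificate": ("x509", "serial_number"),
    "x509 Serial": ("x509", "serial_number"),
    "windows-registry-key": ("registry", "key"),
    "autonomous-system": ("as", "number"),
    "mac-addr": ("mac",),
}


def _dig(obj: Any, path: Sequence[str]) -> Optional[Any]:
    """Walk nested mappings along path; return None if any key is missing."""
    for key in path:
        if not isinstance(obj, Mapping) or key not in obj:
            return None
        obj = obj[key]
    return obj


def display_name(indicator: Mapping[str, Any], doc_id: str) -> str:
    """Derive a display name for the contents of threat.indicator."""
    ind_type = indicator.get("type")
    if ind_type == "file":
        hashes = _dig(indicator, ("file", "hash"))
        if isinstance(hashes, Mapping):
            for algo in HASH_PRIORITY:
                if hashes.get(algo):
                    return str(hashes[algo])
        return doc_id  # assumption: no usable hash, fall back to _id
    path = FIELD_BY_TYPE.get(ind_type)
    value = _dig(indicator, path) if path else None
    # Unknown types (and types without a suitable field) map to _id.
    return str(value) if value is not None else doc_id
```

With this sketch, a file indicator carrying both md5 and sha256 resolves to the sha256 value, and an indicator of an unlisted type resolves to the document _id.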

@maxcold maxcold added the enhancement New feature or request label Jul 26, 2022
@maxcold
Author

maxcold commented Jul 26, 2022

@jamiehynds fyi, following up on our discussion

@peasead
Contributor

peasead commented Jul 26, 2022

I think that threat.indicator.name would align with other ECS fieldsets (host.name, user.name, etc.).

Also, here's the email ECS fieldset https://github.com/elastic/ecs/blob/main/schemas/email.yml, which could be nested under threat.indicator.

@maxcold
Author

maxcold commented Jul 27, 2022

Thanks for the comments, I agree that threat.indicator.name is probably more consistent with the rest of the schema. I also agree that it probably makes sense to nest the email fieldset, as to my understanding only threat.indicator.email.address exists currently, and it doesn't match https://github.com/elastic/ecs/blob/main/schemas/email.yml (threat.indicator.email.address should be threat.indicator.email.from.address, if I understand correctly when following the existing email schema).

@peasead
Contributor

peasead commented Jul 27, 2022

Yep, you're following the schema properly!

Back story
We made some decisions early on regarding the directionality of some types of indicators when writing the threat.* ECS fieldset.

We felt that ECS fieldsets that incorporate directionality (source.ip, destination.domain, email.from.address, etc.) would lead to confusion when trying to get indicators into the right fields. As an example, if the indicator is 1.2.3.4, is it command & control infrastructure or is it the source of a password spraying campaign? Should it be threat.indicator.source.ip : 1.2.3.4 or threat.indicator.destination.ip : 1.2.3.4? In the email example, is this the source of a phishing email or the reply-to address?

Contextually, it could be both - so if threat provider A marked it as a source and threat provider B marked it as a destination; you could have duplicate threat indicator matches or worse, contextually incorrect assumptions based on how someone viewed the directionality.

We opted to avoid the confusion of directionality by not including it and using threat.indicator.ip : 1.2.3.4, threat.indicator.domain, threat.indicator.email.address, etc., and allowing an analyst to determine what happened using other fields populated during the enrichment, like network.direction : ingress|egress
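
As a rough illustration of that approach (a hypothetical helper, not how any Elastic matching is actually implemented), a direction-free indicator IP can be compared against both sides of an event, with network.direction returned alongside so the analyst can interpret the match:

```python
from typing import Any, Dict, Mapping


def match_ip_indicator(event: Mapping[str, Any], indicator_ip: str) -> Dict[str, Any]:
    """Compare a direction-free indicator IP with both sides of an event."""
    matched_on = [
        side
        for side in ("source", "destination")
        if event.get(side, {}).get("ip") == indicator_ip
    ]
    return {
        "matched_on": matched_on,
        # The context comes from the event, not from the indicator itself.
        "network_direction": event.get("network", {}).get("direction"),
    }


event = {
    "source": {"ip": "10.0.0.5"},
    "destination": {"ip": "1.2.3.4"},
    "network": {"direction": "egress"},
}
result = match_ip_indicator(event, "1.2.3.4")
```

The indicator carries no source or destination role; which side matched and the event's own direction provide the context.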

Commentary
I think if we wanted to reapproach directionality, we could do that, but having looked at the feed data over time, I think directionality would be difficult.

@maxcold
Author

maxcold commented Aug 2, 2022

@peasead Thanks a lot for the Back Story, it answers a lot of questions that I had, as I wasn't really thinking about the directionality of the indicators. It makes total sense! So I think it should stay the way it is now, meaning the email-addr IoC type maps to just threat.indicator.email.address.
But now I'm not sure which part of the description you were commenting on with this note:

Also, here's the email ECS fieldset https://github.com/elastic/ecs/blob/main/schemas/email.yml, which could be nested under threat.indicator.

Can you clarify so we are on the same page?

@peasead
Contributor

peasead commented Aug 3, 2022

I think you can disregard that. I didn't fully grasp what you were asking until I started the larger response.

Sorry!

@maxcold
Author

maxcold commented Aug 23, 2022

@jamiehynds can you help me find the right people to ping on this issue so we move forward with it?

@jamiehynds
Contributor

@maxcold based on the discussions above with @peasead, do you think threat.indicator.name is a more suitable fit than the original threat.indicator.display_name proposal?

Could you also provide a proposal for the field description and allowed values, which we'd include in the ECS documentation? As an example, here's the description and allowed values for an upcoming addition to event.category - #2028 (comment)

@ebeahan given that this proposal is a relatively minor change, I'm assuming an RFC isn't warranted, but pinging you just in case you feel otherwise.

@djptek
Contributor

djptek commented Aug 24, 2022

@maxcold is it likely that e.g. threat.indicator.name could have multiple values for distinct indicators within a single event?

If so, we'd perhaps want to consider adding

threat.enrichments.indicator.name

where enrichments is an array containing multiple indicators as well as

threat.indicator.name
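
To make the two shapes concrete, here is a small sketch of an event that carries both fields, plus a hypothetical helper (indicator_names is not part of ECS or any integration) that collects every name from it. The document layout follows the field paths named above; the values are illustrative.

```python
from typing import Any, List, Mapping


def indicator_names(event: Mapping[str, Any]) -> List[str]:
    """Collect threat.indicator.name and every threat.enrichments[].indicator.name."""
    threat = event.get("threat", {})
    names: List[str] = []
    single = threat.get("indicator", {}).get("name")
    if single is not None:
        names.append(single)
    for enrichment in threat.get("enrichments", []):
        name = enrichment.get("indicator", {}).get("name")
        if name is not None:
            names.append(name)
    return names


alert = {
    "threat": {
        "indicator": {"type": "ipv4-addr", "ip": "1.1.1.1", "name": "1.1.1.1"},
        "enrichments": [
            {"indicator": {"type": "domain-name", "name": "example.com"}},
            {"indicator": {"type": "file", "name": "sha256_hash"}},
        ],
    }
}
names = indicator_names(alert)
```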

@maxcold
Author

maxcold commented Aug 24, 2022

@jamiehynds yes, I think it makes sense to have threat.indicator.name for consistency with the rest of ECS.
Description: "The display name of the Indicator of Compromise in a UI-friendly format"
Allowed values: there is a mapping between the type of IoC and which field from the document should be used for the name field. How should I go about allowed values in this case?
You mentioned that the change is minor, but we not only want to introduce this new field, we also want it to be populated for TI integrations based on the mapping logic I added to the description. Is that in the scope of this issue, or will I need to add a new issue to implement the logic in the integrations?

@djptek good point, I think it makes sense to add threat.enrichments.indicator.name in addition to threat.indicator.name. It would follow the same logic, share the same Description/Allowed Values, and serve as the IoC name when added as an enrichment, to Alerts for example.

@djptek
Contributor

djptek commented Aug 24, 2022

You can specify expected values in the schema yml for a field, see e.g. event.category

@maxcold
Author

maxcold commented Aug 25, 2022

@djptek thanks for providing an example. One question, you linked to expected_event_types attribute, did you mean to link to allowed_values? I'm just not sure how expected_event_types is relevant here.
As for allowed_values, the problem is that the proposed threat.indicator.name is not an enum; the value and the type of this value depend on the threat.indicator.type. Here are some examples.
For an 'ipv4-addr' indicator

{
  threat: {
    indicator: {
      _id: '123',
      type: 'ipv4-addr',
      ip: '1.1.1.1'
    }
  }
}

"to be" state with the new field populated

{
  threat: {
    indicator: {
      _id: '123',
      name: '1.1.1.1',
      type: 'ipv4-addr',
      ip: '1.1.1.1'
    }
  }
}

for a file type indicator

{
  threat: {
    indicator: {
      _id: '123',
      type: 'file',
      file: { hash: { md5: 'md5_hash', sha256: 'sha256_hash' } }
    }
  }
}

"to be" state with the new field populated

{
  threat: {
    indicator: {
      _id: '123',
      name: 'sha256_hash',
      type: 'file',
      file: { hash: { md5: 'md5_hash', sha256: 'sha256_hash' } }
    }
  }
}

@djptek
Contributor

djptek commented Aug 25, 2022

Hi @maxcold sorry, I gave completely the wrong example there. I intended to give an example for expected_values

expected_values (optional): An array of expected values for the field. Schema consumers can validate integrations and mapped data against the listed values. These values are the recommended convention, but users may also use other values.

From your examples:

  • name: 'sha256_hash'
  • name: '1.1.1.1'

I think expected_values might be a good fit

@maxcold
Author

maxcold commented Aug 25, 2022

@djptek no worries, I didn't know about expected_values, thanks for bringing it up! We can definitely provide some example values in the expected_values, but after looking at how expected_values is currently used https://github.com/elastic/ecs/search?q=expected_values I'm not sure if it is a good fit. It seems like it describes the exact values that can appear in a field so that the consumers can build validation for it. In our case, we can only provide example values, which might confuse the schema consumers. But I might be completely wrong about expected_values, happy to provide more example fields if needed

@djptek
Contributor

djptek commented Aug 25, 2022

@maxcold expected_values would work if you can provide examples; they don't need to be exhaustive and are not intended for validation.

This is not the same as allowed_values, which should always be validated.
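
A minimal sketch of that distinction, assuming a hypothetical consumer-side helper (check_value is not part of the ECS tooling): a value outside allowed_values is rejected, while a value outside expected_values is only flagged as unlisted.

```python
from typing import Optional, Sequence


def check_value(
    value: str,
    allowed_values: Optional[Sequence[str]] = None,
    expected_values: Optional[Sequence[str]] = None,
) -> str:
    """Return 'ok' or 'unlisted'; raise only for allowed_values violations."""
    if allowed_values is not None and value not in allowed_values:
        # allowed_values is a closed set: anything else is an error.
        raise ValueError(f"{value!r} is not one of the allowed values")
    if expected_values is not None and value not in expected_values:
        # expected_values is a recommended convention: other values still pass.
        return "unlisted"
    return "ok"


# Example values taken from this thread; the lists are illustrative only.
name_status = check_value("5.2.75.227", expected_values=["sha256_hash", "1.1.1.1"])
```

Since threat.indicator.name holds free-form values derived from other fields, only the advisory branch would apply to it.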

@maxcold
Author

maxcold commented Aug 26, 2022

@djptek got it, thanks!
Here is the list of example values for the threat.indicator.name field

5.2.75.227
2a02:cf40:add:4002:91f2:a9b2:e09a:6fc6
https://example.com/some/path
example.com
373d34874d7bc89fd4cefa6272ee80bf
b0e914d1bbe19433cc9df64ea1ca07fe77f7b150b511b786e46e007941a62bd7
[email protected]
HKLM\\SOFTWARE\\Microsoft\\Active
13335
00:00:5e:00:53:af
8008

@jamiehynds is there anything else I can do to help move it forward?

@maxcold
Author

maxcold commented Sep 20, 2022

Hi @jamiehynds, as we are coming closer to the 8.6 release cycle, is there anything we can help with to get this on the roadmap for the release?

@maxcold
Author

maxcold commented Nov 4, 2022

@ebeahan @jamiehynds @djptek hey folks, what can our team do to help move this forward?

@ebeahan
Member

ebeahan commented Nov 10, 2022

@maxcold sorry for missing the ping before.

Is this summary still the direction we're taking: #1998 (comment)?

If so, the next steps would be for someone on your team to open a PR with the requested changes, and we'll review and discuss further as needed.

@maxcold
Author

maxcold commented Nov 14, 2022

@ebeahan yes, that's still the idea. I will add an issue for creating a PR against the ECS schema to our backlog then. What about changes in the ti_* integrations? It would be good if these integrations populated the field automatically; how do we make this happen?

@maxcold
Author

maxcold commented Nov 21, 2022

@ebeahan btw, what is usually the approach for backfilling the data if a new ECS field is introduced (or if there is another change in the schema)? Is there a common way to handle such cases? Specifically for threat.indicator.name, it would be good to provide a way for users to add this field to already existing data. Is that a good idea in general? How should we handle this?

@ebeahan
Member

ebeahan commented Nov 22, 2022

@maxcold there's not a common approach I'm familiar with.

Typically when a field is introduced in ECS, the change is made to the data source or integration to start populating that field. New events and indices will have the field, but existing data will not.

@maxcold
Author

maxcold commented Nov 24, 2022

got it, thanks!

lgestc pushed a commit that referenced this issue Jan 5, 2023
This PR addresses issue #1998

Co-authored-by: Kylie (Geller) Meli <[email protected]>