[RST Cloud - Report Hub] Several issues #2767

Closed
Lhorus6 opened this issue Oct 7, 2024 · 5 comments · Fixed by #2864
Labels
bug use for describing something not working as expected community support use to identify an issue related to feature developed & maintained by community. solved use to identify issue that has been solved (must be linked to the solving PR) to verify use to identified for Verified
@Lhorus6
Contributor

Lhorus6 commented Oct 7, 2024

Description

Many issues have been identified. As is, the connector can be deployed in production but its quality is very low.

Blocking problem

  • Some relationships are poorly modeled
    • “related to” relationships between countries/sector and Intrusion sets/malware, instead of “targets” relationships
    • “related to” relationships between Attack Patterns and Intrusion sets, instead of “uses” relationships
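For reference, the difference between the two modelings can be sketched as STIX 2.1 relationship objects. This is only a minimal illustration built with the Python standard library; the `make_relationship` helper and the placeholder IDs are hypothetical and are not taken from the connector's code.

```python
import uuid
from datetime import datetime, timezone


def make_relationship(relationship_type: str, source_ref: str, target_ref: str) -> dict:
    """Build a minimal STIX 2.1 relationship object (hypothetical helper)."""
    # STIX timestamps are UTC with a trailing "Z"; keep millisecond precision.
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%f")[:-3] + "Z"
    return {
        "type": "relationship",
        "spec_version": "2.1",
        "id": f"relationship--{uuid.uuid4()}",
        "created": now,
        "modified": now,
        "relationship_type": relationship_type,
        "source_ref": source_ref,
        "target_ref": target_ref,
    }


# Placeholder IDs, for illustration only.
intrusion_set_id = f"intrusion-set--{uuid.uuid4()}"
country_id = f"location--{uuid.uuid4()}"
attack_pattern_id = f"attack-pattern--{uuid.uuid4()}"

# Expected modeling: specific relationship types.
targets = make_relationship("targets", intrusion_set_id, country_id)
uses = make_relationship("uses", intrusion_set_id, attack_pattern_id)

# Modeling reported in this issue: a generic type for the same pair.
generic = make_relationship("related-to", intrusion_set_id, country_id)
```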

Screenshot 2024-10-07 165337

Non-blocking problem

  • The connector does not create the Observables associated with the Indicators it creates

Screenshot 2024-10-07 164537

  • An error "'NoneType' object is not subscriptable" is raised at each work:

image

  • External references are attached to all entities, but no reports are created. Ideally, the connector would:
    1. Create a report
    2. Attach the External Reference to the report
    3. Create all entities and relationships (as currently done)
    4. Put all entities in the report.
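The four steps above can be sketched as a STIX 2.1 bundle in which the Report carries the external reference and lists every other object in `object_refs`. This is a minimal sketch using only the standard library; `build_report_bundle`, the `example-source` name, and the example URL are assumptions made for illustration, not the connector's actual implementation.

```python
import json
import uuid


def build_report_bundle(name: str, published: str, source_url: str, objects: list) -> dict:
    """Wrap already-built STIX objects in a Report and return a bundle dict."""
    report = {
        "type": "report",
        "spec_version": "2.1",
        "id": f"report--{uuid.uuid4()}",  # step 1: create the report
        "created": published,
        "modified": published,
        "name": name,
        "published": published,
        # Step 2: the external reference is attached to the report.
        "external_references": [{"source_name": "example-source", "url": source_url}],
        # Step 4: every entity and relationship is contained in the report.
        "object_refs": [obj["id"] for obj in objects],
    }
    return {
        "type": "bundle",
        "id": f"bundle--{uuid.uuid4()}",
        "objects": [report, *objects],
    }


# Step 3: entities and relationships are created as before (one placeholder here).
ts = "2024-10-07T00:00:00.000Z"
malware = {
    "type": "malware",
    "spec_version": "2.1",
    "id": f"malware--{uuid.uuid4()}",
    "created": ts,
    "modified": ts,
    "name": "Example malware",
    "is_family": False,
}
bundle = build_report_bundle("Example report", ts, "https://example.com/report", [malware])
print(json.dumps(bundle, indent=2))
```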

Other improvements

  • It does not put markings on the Notes and Organizations it creates:

Screenshot 2024-10-07 163944

  • It does not fill in the "Author" fields of the Attack Patterns and Notes it creates:

Screenshot 2024-10-07 163218

Environment

OCTI 6.3.5

Reproducible Steps

Steps to create the smallest reproducible scenario:

  1. Deploy the connector
@Lhorus6 Lhorus6 added bug use for describing something not working as expected needs triage use to identify issue needing triage from Filigran Product team labels Oct 7, 2024
@romain-filigran
Member

Hello @k1r10n: Could you take a look at @Lhorus6's requests for improvements?

@romain-filigran romain-filigran added community support use to identify an issue related to feature developed & maintained by community. and removed needs triage use to identify issue needing triage from Filigran Product team labels Oct 14, 2024
@helene-nguyen helene-nguyen added the to verify use to identified for Verified label Oct 14, 2024
@k1r10n
Contributor

k1r10n commented Oct 14, 2024

Hi @Lhorus6! Thanks for your comments!

Based on these comments and previous experience with clients, I believe you may still have the AlienVault connector enabled. When both connectors are enabled, and data from OTX arrives after reports are imported from the Report Hub (some reports we provide are also covered by OTX), this leads to a merge in OpenCTI, which creates the extra 'related-to' entries and makes everything look weird.

Answers to all comments follow. Feel free to reach out at [email protected]

  1. “related to” relationships between countries/sector and Intrusion sets/malware, instead of “targets” relationships

We provide 'targets' relationships between intrusion sets/malware and countries/sectors where this can be extracted from the context by the engine. Sometimes a country is mentioned only because a particular IP is located in 'Country A'. This does not mean that the intrusion set is 'related-to', 'originated-from', or 'targets' this country. All three of these relationships are supported by the engine, but they are not always set.

  2. “related to” relationships between Attack Patterns and Intrusion sets, instead of “uses” relationships

When possible the engine adds 'uses' between the 'intrusion-sets' and 'attack-patterns'. This is a standard feature. Attaching an example.
20241013_ctfiot_com_report_0x9e6be797.json
image

  3. Do not create Observables associated with Indicators

If a TI report contains indicators of compromise, we create indicators based on them. If the report includes 'noisy' indicators (such as an indicator for putty or other tool that can be dropped by an attacker) or, for example, well-known domains/URLs used for geo-location (when, let's say, a stealer checks if it is allowed to operate in a certain country), we create these objects as Observables. So far this logic was ok for the clients and we did not have a request to duplicate each indicator with a corresponding observable.

  4. An error "'NoneType' object is not subscriptable" is raised at each work:

Please provide more information on which particular object is not being created, including the error description. This is not normal. There should be 0 errors.

  5. External references are attached to all entities, but no reports are created. What would be best would be:
    1. Create a report
    2. Attach the External Reference to the report
    3. Create all entities and relationships (as currently done)
    4. Put all entities in the report.

This is exactly how it is done. Each human-readable report is transformed into a STIX bundle, which includes a Report object with notes and all other associated objects. The only difference is that we also keep the external reference on every object, so that you can see which reports contain information about a given object. It’s possible that, in step 4, the actual Report object is not created.

  6. It does not put markings on Notes and Organization it creates:

We have added markings as per your request. Please re-download.

  7. It does not fill in the "Author" fields of the Attack Patterns and Notes it creates:

Thanks. We’ve added 'Author' to our unique patterns and Notes. For the T* patterns, clients usually sync them from MITRE anyway, and our values are only used for mapping with the MITRE definitions.

@Lhorus6
Contributor Author

Lhorus6 commented Oct 19, 2024

Hi @k1r10n,
Thanks a lot for your detailed reply.

This leads to a merge in OpenCTI, which creates the extra 'related-to' entries and makes everything look weird

I filtered on “Creator”, which means that all objects (entities and relationships), whether or not they were merged afterwards, were created by your connector. The merge does not create extra objects; it simply deduplicates.

  1. Sometimes countries are mentioned in the context that a particular IP is located in 'Country A'. This does not mean that the intrusion set is 'related-to', 'originated-from', or 'targets' this country.

100% agree. If you're already creating 'originated-from' or 'targets' relationships when you have the information, that's perfect. All that is needed is to stop creating 'related-to' relationships in this case.

  2. When possible the engine adds 'uses' between the 'intrusion-sets' and 'attack-patterns'.

That's great, in this case I'll give you the same answer as before. Just avoid making any more 'related to' relationships.

  3. So far this logic was ok for the clients and we did not have a request to duplicate each indicator with a corresponding observable.

A good practice is to always create an Observable when you create an Indicator (the reverse is not true). This comes from the fact that functionalities are linked to Observables. For example, many enrichment connectors run on Observables, and not on Indicators. Having only Indicators can limit users.

It's not a duplication. The two entities are closely related, but they represent different things and meet different needs.
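As an illustration of this practice, an Indicator, its Observable, and the "based-on" relationship between them could be built as follows. This is a minimal sketch with the standard library only; `indicator_with_observable` is a hypothetical helper, not part of the connector or of any OpenCTI library.

```python
import uuid


def indicator_with_observable(domain: str, now: str) -> list:
    """Return [observable, indicator, relationship] for a domain (hypothetical helper)."""
    # STIX 2.1 recommends deterministic UUIDv5 IDs for observables;
    # a random UUID is used here only to keep the sketch short.
    observable = {
        "type": "domain-name",
        "spec_version": "2.1",
        "id": f"domain-name--{uuid.uuid4()}",
        "value": domain,
    }
    # Escape backslashes and single quotes for the STIX pattern string literal.
    escaped = domain.replace("\\", "\\\\").replace("'", "\\'")
    indicator = {
        "type": "indicator",
        "spec_version": "2.1",
        "id": f"indicator--{uuid.uuid4()}",
        "created": now,
        "modified": now,
        "name": domain,
        "pattern": f"[domain-name:value = '{escaped}']",
        "pattern_type": "stix",
        "valid_from": now,
    }
    # Indicator -> based-on -> Observable, the link discussed in this thread.
    based_on = {
        "type": "relationship",
        "spec_version": "2.1",
        "id": f"relationship--{uuid.uuid4()}",
        "created": now,
        "modified": now,
        "relationship_type": "based-on",
        "source_ref": indicator["id"],
        "target_ref": observable["id"],
    }
    return [observable, indicator, based_on]


objects = indicator_with_observable("example.com", "2024-10-19T00:00:00.000Z")
```

With both objects present, enrichment connectors that run on Observables have something to work on, while the Indicator keeps its detection pattern.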

  4. Please provide more information on which particular object is not being created, including the error description.

Indeed, sorry, I should have provided it in the first place. On rechecking, it appears that I no longer have any errors. When I look in the logs of my container at the time I created this issue, I find this log (perhaps unrelated) for 4 days in a row:

Failed to download and save entry 20241009_rt-solar_ru_report_0xde561eb0 as PDF. 404 Client Error: Not Found for url: https://api.rstcloud.net/v1/reports?id=20241009_rt-solar_ru_report_0xde561eb0&format=pdf

I think we can consider this point resolved ✅

  5. This is exactly how it is done.

You're right, I now have reports with all the information. Indeed, it's possible that it's linked to point 4.

Each object is linked to a report, so the External Reference may only be applied to the report. Both ways are acceptable.

Screenshot 2024-10-19 152506

Solved too ✅

  6. We have added markings as per your request.

Thank you!

Solved ✅

  7. We’ve added 'Author' to our unique patterns and Notes

Thank you!

Solved ✅

TLDR

All that remains are points 1, 2 and 3. Points 1 and 2 are blocking because they create relationships that create "noise" in the database and can cause misunderstanding for users and potentially distort dashboard values.

@k1r10n
Contributor

k1r10n commented Oct 20, 2024

@Lhorus6 Regarding points 1 and 2, I believe I need to elaborate more. The 'related-to' relationship is not just noise; it serves a purpose. It just so happens that many integrations overuse 'related-to' everywhere.

Which relationship would you choose for a text like 'Malware A avoids infecting Russia, Belarus, and other CIS countries'? This might lead you to conclude that the authors of the malware are possibly from the CIS, but in doing so, you lose the connection to the named countries and the fact that the malware does not attack these particular entities in this region. I'm sure you can think of several other cases where neither 'originated-from' nor 'targets' would be appropriate. Also, 'targets' and 'originated-from' are basic types. We will further expand our engine capabilities to automatically extract relationships like 'authored-by', 'variant-of', 'downloads', 'drops', etc., but for now, we have recorded them as 'related-to'. Additionally, the STIX 2.1 taxonomy cannot fully cover all types of relationships, so 'related-to' serves as a fallback in some cases. We could create custom types (and the standard allows for that), but then we would face interoperability issues, as different Threat Intel Platforms would need to interpret these new 'custom' relationships. It would be nice to get your view on this.

Our RST Report Hub engine isn't perfect, as it is a machine that parses nearly all public threat intelligence articles and reports, and there are fluctuations. However, if the same task were assigned to 5 analysts, I doubt they would avoid making any mistakes or complete the task at the same speed (if they could even complete it at all, given there are more than 40,000 articles on the topic annually). Additionally, having the machine pre-parse the data is about 30 times cheaper than hiring 5 people. So, we currently aim to handle the heavy lifting and leave the really detailed refinements to the user.

Regarding point 3, I think we can add a configuration option for the connector: create_observables: True|False in the next release. It hasn't been a priority for our clients so far, but it’s cheap to add, and I believe there are cases where people may want it.
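Such an option could be read along these lines. This is a sketch under stated assumptions: the environment variable name `RST_REPORT_HUB_CREATE_OBSERVABLES` and both helper functions are hypothetical, and the released connector may expose the setting differently.

```python
import os


def read_bool_setting(name: str, default: bool = False) -> bool:
    """Parse a boolean setting from an environment variable (hypothetical helper)."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in ("1", "true", "yes", "on")


def select_objects(indicators: list, observables: list, create_observables: bool) -> list:
    """Keep the Observables only when the option is enabled."""
    if create_observables:
        return list(indicators) + list(observables)
    return list(indicators)


# Hypothetical variable name, used for illustration only.
create_observables = read_bool_setting("RST_REPORT_HUB_CREATE_OBSERVABLES", default=False)
```

Defaulting to `False` in this sketch keeps the current behavior unchanged for existing deployments.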

@Lhorus6
Contributor Author

Lhorus6 commented Oct 23, 2024

It just so happens that many integrations overuse 'related-to' everywhere.

Indeed, 'related-to' is a kind of garbage relationship, as it exists between any type of entity.

Which relationship would you choose for a text like 'Malware A avoids infecting Russia, Belarus, and other CIS countries'?

My opinion would be: no relation at all.

for now, we have recorded them as 'related-to'

My opinion is that if you choose to represent different kinds of information using the same type of relationship, these relationships will be useless. When you see these relations, you'll have no idea what they mean, as they can represent many different things, and statistics on them won't make sense. So my advice would be not to create them.

What do you think?
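On the consuming side, one way to apply this advice is to drop the generic relationships from a bundle before it is ingested. The sketch below assumes the bundle objects are plain dicts; `drop_generic_relationships` is a hypothetical helper and is not something the connector provides.

```python
def drop_generic_relationships(objects: list, generic_types: tuple = ("related-to",)) -> list:
    """Return the objects without relationships of a generic type."""
    return [
        obj
        for obj in objects
        if not (
            obj.get("type") == "relationship"
            and obj.get("relationship_type") in generic_types
        )
    ]


# Placeholder objects, for illustration only.
sample = [
    {"type": "malware", "id": "malware--a"},
    {"type": "relationship", "id": "relationship--b", "relationship_type": "targets"},
    {"type": "relationship", "id": "relationship--c", "relationship_type": "related-to"},
]
kept = drop_generic_relationships(sample)
```

Note that a Report listing a dropped relationship in its `object_refs` would also need that reference removed; the sketch leaves this out.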

We could create custom types (and the standard allows for that), but then we would face interoperability issues, as different Threat Intel Platforms would need to interpret these new 'custom' relationships.

I couldn't agree with you more!!

Our RST Report Hub engine isn't perfect, as it is a machine that parses near all public threat intelligence articles and reports, and there are fluctuations

That is not in question here. I'm not familiar with your tool, but judging by the data that comes in, it does seem effective. I'm only talking about modeling choices here, not the efficiency of your tool.

Regarding point 3, I think we can add a configuration option for the connector: create_observables: True|False in the next release.

That would be perfect. Many connectors leave the choice to the user like this. The option would cover both the creation of the Observable and the relationship "Indicator -> Based on -> Observable".

@helene-nguyen helene-nguyen self-assigned this Jan 16, 2025
@helene-nguyen helene-nguyen added this to the Release 6.4.8 milestone Jan 16, 2025
@helene-nguyen helene-nguyen added the solved use to identify issue that has been solved (must be linked to the solving PR) label Jan 16, 2025
4 participants