[Import Document AI] Implement Import Document AI connector #3466
@Cnstant Thanks for your work!
Your contribution follows the current connector template, and the implementation and documentation are clear.
Could you have a look at the comments on example files, error handling, and file storage, please?
I also left a few (optional) suggestions to make the logging tools and message formatting more consistent.
It'd be nice if you could provide a fake web service response JSON example. I'd then be able to quickly test the connector by simulating an API server.
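A minimal sketch of such a simulated API server, using only the Python standard library. The payload shape, the handler and helper names, and the POST-only behavior are assumptions for illustration; the real AI web service schema is not shown in this thread.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical payload: replace it with a real response.json() sample.
FAKE_RESPONSE = {"entities": [{"type": "Malware", "value": "ExampleMalware"}]}


class FakeWebServiceHandler(BaseHTTPRequestHandler):
    """Answers every POST with the canned JSON payload."""

    def do_POST(self):
        # Drain the uploaded body so the client is not left waiting.
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)
        body = json.dumps(FAKE_RESPONSE).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Keep test output quiet.
        pass


def start_fake_server(port=0):
    """Start the fake service in a background thread (port 0 = any free port)."""
    server = HTTPServer(("127.0.0.1", port), FakeWebServiceHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def post_document(url, data):
    """POST raw bytes and return (status, decoded JSON), bypassing any proxy."""
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    request = urllib.request.Request(url, data=data, method="POST")
    with opener.open(request, timeout=5) as response:
        return response.status, json.loads(response.read())
```

Pointing web_service_url at the address returned by `start_fake_server()` would then let the connector run without the real service.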
I suggest you explicitly list the packages you use directly rather than relying only on pycti's sub-dependencies:
requests
stix2
yaml
What about versions? Should I stick to the versions pinned in the package, or specify no version at all to keep it more flexible?
Thank you for your response.
As the end user of the library, I think you must specify the exact (compatible) versions you need and have tested with your use case. But as you pointed out, pycti currently does not provide loose version requirements for its dependencies. This is something we are correcting step by step (see OpenCTI-Platform/client-python#859).
I think you can pin your versions to:
requests==2.32.3
stix2==3.0.1
PyYAML==6.0.2
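As a quick sanity check, the pins above could be compared against the installed environment. This is only a sketch: the `check_pins` helper and its injectable `get_version` parameter are hypothetical names introduced here, not part of the connector.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins suggested in this review comment.
PINS = {"requests": "2.32.3", "stix2": "3.0.1", "PyYAML": "6.0.2"}


def check_pins(pins, get_version=version):
    """Return {package: installed_version_or_None} for every pin that is not met."""
    mismatches = {}
    for name, expected in pins.items():
        try:
            installed = get_version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != expected:
            mismatches[name] = installed
    return mismatches
```

Calling `check_pins(PINS)` in the connector's environment returns an empty dict when all three pins are satisfied.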
Here are all the possible JSON responses at the moment. The return status is in each file name, and the content is the actual response.json() output. Let me know if that's enough.
Please find my last remarks below. Thank you for the response examples you provided; I will get back to you as soon as I have tested them.
version: '3'
services:
  connector-import-document-ai:
    image: connector-template:6.3.3
Sorry, I was not clear in my previous comment. I think you should replace connector-template with the future image name:
- image: connector-template:6.3.3
+ image: opencti/connector-import-document-ai:6.3.3
# On Windows, the invalid characters are different, so the behavior is not the same as Linux
# It only happens with free text on local setup running on windows. Never on prod.

os_system = os.name

# If windows detection, replacement of invalid characters
if os_system == "nt":
    file_name = re.sub(r'[\\/:*?"<>|]', "_", file_name)
I think this is not necessary and you could remove it as you are not saving the file in the filesystem anymore.
- # On Windows, the invalid characters are different, so the behavior is not the same as Linux
- # It only happens with free text on local setup running on windows. Never on prod.
- os_system = os.name
- # If windows detection, replacement of invalid characters
- if os_system == "nt":
-     file_name = re.sub(r'[\\/:*?"<>|]', "_", file_name)
for entity_id in entities_ids:
    # Incident attributed-to Threats
    if (
        entity_id.startswith("threat-actor")
Note: this branch seems unused for now, as the AI web service does not extract threat actors.
if (
    entity_id.startswith("threat-actor")
    or entity_id.startswith("intrusion-set")
    or entity_id.startswith("campaign")
Note: this branch seems unused for now, as the AI web service does not extract campaigns.
Would you say it is correct to represent connector behavior like this:
Case 1: from the Data/Import UI
graph LR
    subgraph FileStorage
        direction LR
        pdf["PDF file"]
        text["Text file"]
        html["HTML file"]
        md["MD file"]
    end
    subgraph AIWebService
    end
    subgraph OpenCTI
        direction LR
        subgraph CreatedReport
            subgraph AddedDomainObjects
                direction LR
                OpenCTICountry[Country]
                OpenCTIMalware[Malware]
                OpenCTIIntrusionSet[Intrusion Set]
                OpenCTIVulnerability[Vulnerability]
                OpenCTIAttackPattern[Attack Pattern]
            end
            subgraph AddedObservables
                direction LR
                OpenCTIAutonomousSystem[Autonomous System]
                OpenCTIDomainName[Domain Name]
                OpenCTIEmailAddress[Email Address]
                OpenCTIPv4Address[IP V4 Address]
                OpenCTIPv6Address[IP V6 Address]
                OpenCTIMACAddress[MAC Address]
                OpenCTIURL[URL]
                OpenCTIFile[File]
                OpenCTIWindowsRegistryKey[Windows Registry Key]
            end
        end
    end
    FileStorage ==> AIWebService
    AIWebService ==> AddedDomainObjects & AddedObservables
Case 2: from a Container UI (Report, Task, Case-Incident, etc.)
graph LR
    subgraph FileStorage
    end
    subgraph AIWebService
    end
    subgraph OpenCTI
        direction LR
        subgraph TriggeringContainer[Triggering Container]
            subgraph AddedDomainObjects[Added Domain Objects]
            end
            subgraph AddedObservables[Added Observables]
            end
        end
    end
    TriggeringContainer -->|Triggers| FileStorage
    FileStorage ==> AIWebService
    AIWebService ==> AddedDomainObjects & AddedObservables
Case 3: from an OpenCTI Entity UI, with Incident and Threat Actor as special sub-cases
graph LR
    subgraph FileStorage
    end
    subgraph AIWebService
    end
    subgraph OpenCTI
        direction LR
        subgraph TriggeringEntity[Triggering Entity]
            TriggeringIncident[Incident]
            TriggeringThreatActor[Threat Actor]
            TriggeringOthers["..."]
        end
        subgraph AddedDomainObjects[Added Domain Objects]
            OpenCTIIntrusionSet[Intrusion Set]
            OpenCTIVulnerability[Vulnerability]
            OpenCTIAttackPattern[Attack Pattern]
            OpenCTIOthers["..."]
        end
        subgraph AddedObservables[Added Observables]
        end
    end
    TriggeringEntity -.->|Triggers| FileStorage
    FileStorage ==> AIWebService
    AIWebService ==> AddedDomainObjects & AddedObservables
    AddedObservables -->|Related To| TriggeringEntity
    TriggeringIncident-->|Attributed To| OpenCTIIntrusionSet
    TriggeringIncident & TriggeringThreatActor -->|Targets| OpenCTIVulnerability
    TriggeringIncident & TriggeringThreatActor -->|Uses| OpenCTIAttackPattern
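The relationships drawn in Case 3 could be expressed as a small lookup table. This is a sketch of the diagram only, not of the code in core.py (which, per the excerpts above, also handles threat-actor and campaign targets for incidents); `RELATIONSHIP_RULES`, `stix_type` and `plan_relationships` are hypothetical names.

```python
# (triggering entity type, extracted object type) -> relationship type,
# transcribed from the edges of the Case 3 diagram.
RELATIONSHIP_RULES = {
    ("incident", "intrusion-set"): "attributed-to",
    ("incident", "vulnerability"): "targets",
    ("threat-actor", "vulnerability"): "targets",
    ("incident", "attack-pattern"): "uses",
    ("threat-actor", "attack-pattern"): "uses",
}


def stix_type(stix_id):
    """Return the type prefix of a STIX id such as 'incident--<uuid>'."""
    return stix_id.split("--", 1)[0]


def plan_relationships(trigger_id, domain_object_ids, observable_ids):
    """List (source, relationship_type, target) tuples implied by the diagram."""
    trigger_type = stix_type(trigger_id)
    planned = []
    for object_id in domain_object_ids:
        rel = RELATIONSHIP_RULES.get((trigger_type, stix_type(object_id)))
        if rel is not None:
            planned.append((trigger_id, rel, object_id))
    # Every extracted observable is related to the triggering entity.
    for observable_id in observable_ids:
        planned.append((observable_id, "related-to", trigger_id))
    return planned
```

Domain objects with no matching rule (the "..." nodes in the diagram) produce no relationship in this sketch.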
validate_before_import: true
log_level: 'info'
web_service_url: 'http://localhost:PORT'
licence_key_pem:|
- licence_key_pem:|
+ licence_key_pem: |