[URLhaus] Change of format from URLhaus #308

Closed
srilumpa opened this issue Jul 20, 2018 · 6 comments
Labels
category:bug (Issue is related to a bug), scope:analyzer (Issue is analyzer related)

@srilumpa
Contributor

Request Type

Bug

Work Environment

Question                  Answer
OS version (server)       Debian
Cortex Analyzer Name      URLhaus
Cortex Analyzer Version   1.0
Cortex Version            2.0.4

Description

The URLhaus analyzer fails to extract data from the queried page when analyzing a URL. This is probably due to an update of the URLhaus website that changed its layout.

Steps to Reproduce

  1. Analyze a URL with the URLhaus 1.0 analyzer
  2. The analyzer fails

Possible Solutions

Fix the query result parsing.

Complementary information

Here is the analyzer output when analyzing a URL:

Invalid output
Traceback (most recent call last):
  File "URLhaus/URLhaus_analyzer.py", line 50, in <module>
    URLhausAnalyzer().run()
  File "URLhaus/URLhaus_analyzer.py", line 23, in run
    'results': self.search(self.get_data())
  File "URLhaus/URLhaus_analyzer.py", line 17, in search
    return URLhaus(indicator).search()
  File "/opt/cortex-analyzers/analyzers/URLhaus/URLhaus.py", line 42, in search
    return self.parse(res)
  File "/opt/cortex-analyzers/analyzers/URLhaus/URLhaus.py", line 52, in parse
    rows = table.find_all("tr")[1:]
AttributeError: 'NoneType' object has no attribute 'find_all'

If I have some time today, I will try to find out exactly why it crashes and fix it.
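
The traceback above shows `parse` calling `find_all` on a `table` that is `None`, which is what an HTML lookup returns when the expected element is missing (for example after a layout change or when the site serves an error page). The actual `URLhaus.py` source is not shown in this thread, so the following is only a minimal sketch of a guard, assuming `table` is the result of something like BeautifulSoup's `soup.find("table")`. The names `extract_rows` and `UnexpectedPageError` are hypothetical.

```python
from typing import Any, List, Optional


class UnexpectedPageError(Exception):
    """Raised when the fetched page has no results table (hypothetical name)."""


def extract_rows(table: Optional[Any]) -> List[Any]:
    """Return the data rows of a results table, skipping the header row.

    `table` is assumed to be whatever the HTML lookup returned, for example
    the result of BeautifulSoup's `soup.find("table")`, which is None when
    no matching element exists.
    """
    if table is None:
        # Fail with a readable message instead of an AttributeError.
        raise UnexpectedPageError(
            "no results table found; the site may be down or its layout changed"
        )
    # Same slicing as in the traceback: drop the first (header) row.
    return table.find_all("tr")[1:]
```

With a guard like this, the analyzer could report a clear error to the user rather than "Invalid output" followed by a raw traceback.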

@3c7
Contributor

3c7 commented Jul 20, 2018

Hey @ninoseki, do you want to implement the changes as it's "your" analyzer? :)

3c7 added the category:bug and scope:analyzer labels on Jul 20, 2018
3c7 added this to the 1.12.0 milestone on Jul 20, 2018

@ninoseki
Contributor

@3c7 Thank you for letting me know. 👍
OK, I'll work on this issue.

@ninoseki
Contributor

@srilumpa I confirmed that the DOM structure of the URLhaus website is the same as before and that the analyzer works correctly.
[Image: Imgur]
Could you tell me the URL you tried to analyze? (For debugging purposes.)

@srilumpa
Contributor Author

Sorry, I should have started with this.

A few URLs as examples:

  • hxxp://tckkitchen[.]com:80/bst/tasks[.]php
  • hxxp://redirector[.]gvt1[.]com:80/edgedl/release2/chrome/A84nRG1q1w8_67[.]0[.]3396[.]99/67[.]0[.]3396[.]99_chrome_installer[.]exe
  • hxxp://cacerts[.]digicert[.]com:80/DigiCertSHA2HighAssuranceServerCA[.]crt
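
The URLs above are written in the usual "defanged" form (`hxxp` for `http`, `[.]` for `.`) so they are not clickable. To reproduce the analysis, they have to be converted back first. A minimal sketch, assuming only those two substitutions are used; `refang` is a hypothetical helper, not part of the analyzer:

```python
def refang(indicator: str) -> str:
    """Turn a defanged URL back into a normal one.

    Only handles the two conventions seen in this thread:
    a leading "hxxp" scheme and "[.]" in place of dots.
    """
    if indicator.startswith("hxxp"):
        # Covers both hxxp:// and hxxps://
        indicator = "http" + indicator[len("hxxp"):]
    return indicator.replace("[.]", ".")


print(refang("hxxp://cacerts[.]digicert[.]com:80/DigiCertSHA2HighAssuranceServerCA[.]crt"))
# prints http://cacerts.digicert.com:80/DigiCertSHA2HighAssuranceServerCA.crt
```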

Maybe this was only temporary (some sort of service outage): I haven't seen any errors since Saturday.

@rolinh

rolinh commented Jul 23, 2018

Maybe this was only temporary (some sort of service outage)

Look no further: URLhaus had issues last Friday.

@srilumpa
Contributor Author

Sorry for the false alarm, then.


4 participants