Add URLHaus analyzer #271

Merged: 7 commits, Jul 2, 2018

Conversation

3c7 (Contributor) commented Jun 6, 2018

This is the URLhaus analyzer provided by @ninoseki (#227). It needs some refactoring: its dependency requests-html appears to require Python 3.6, while Ubuntu 16.04 LTS ships with Python 3.5, and a large part of the user base runs Ubuntu 16.04.

Possible measures:

  • use another module to parse the HTML site
  • use the provided CSV export from URLHaus (favoured)
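
As an illustration of the first measure, here is a minimal sketch that extracts table rows using only the standard-library `html.parser`, which is available on Python 3.5. The class name `TableCellParser`, the helper `parse_rows`, and the sample markup are hypothetical; the real URLhaus page layout is not shown in this thread and would need to be checked.

```python
from html.parser import HTMLParser


class TableCellParser(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:  # header rows contain only <th>, so they stay empty
                self.rows.append(self._row)
            self._row = None


# Hypothetical markup, NOT the real URLhaus page structure.
SAMPLE_HTML = """
<table>
  <tr><th>Date</th><th>URL</th><th>Status</th></tr>
  <tr><td>2018-06-06</td><td>http://example.test/a.exe</td><td>online</td></tr>
  <tr><td>2018-06-05</td><td>http://example.test/b.doc</td><td>offline</td></tr>
</table>
"""


def parse_rows(html):
    """Return the data rows of all tables found in the given HTML string."""
    parser = TableCellParser()
    parser.feed(html)
    parser.close()
    return parser.rows
```

The sketch avoids f-strings and other 3.6-only syntax so that it stays within what Ubuntu 16.04 provides.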

If you want to participate, @ninoseki, you can base PRs on the feature/urlhaus-analyzer branch, since your original PR was already merged and cannot be reopened.

3c7 added the category:enhancement and scope:analyzer labels Jun 6, 2018
3c7 self-assigned this Jun 6, 2018
3c7 mentioned this pull request Jun 6, 2018
ninoseki (Contributor) commented Jun 8, 2018

I prefer scraping to the CSV export, because URLhaus's CSV export doesn't contain hash values.

@3c7 Does that make sense?
If so, I'll implement a BeautifulSoup version of this analyzer.

3c7 (Contributor, Author) commented Jun 8, 2018

Hey @ninoseki, there are two CSV dumps, one containing the URLs and one containing payload hashes.

So it would be possible to use both and combine them in the result.
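
A rough sketch of how the two dumps might be combined, using only the standard library. The column names (`id`, `url`, `url_status`, `sha256`), the `#` comment lines, and the sample data are assumptions made for illustration; the actual dump formats would need to be checked before relying on this.

```python
import csv

# Hypothetical sample data, NOT the real URLhaus dump layout.
URLS_CSV = """# hypothetical URL dump
id,url,url_status
1,http://example.test/a.exe,online
2,http://example.test/b.doc,offline
"""

PAYLOADS_CSV = """# hypothetical payload dump
url,sha256
http://example.test/a.exe,aaaa
http://example.test/a.exe,bbbb
"""


def read_rows(text):
    """Parse CSV text into dicts, skipping blank lines and '#' comment lines."""
    lines = (line for line in text.splitlines()
             if line and not line.startswith("#"))
    return list(csv.DictReader(lines))


def combine(urls_text, payloads_text):
    """Attach the list of payload hashes to each URL entry, joined on 'url'."""
    hashes = {}
    for row in read_rows(payloads_text):
        hashes.setdefault(row["url"], []).append(row["sha256"])
    return [dict(row, sha256=hashes.get(row["url"], []))
            for row in read_rows(urls_text)]
```

Calling `combine(URLS_CSV, PAYLOADS_CSV)` yields one dict per URL entry, each carrying a `sha256` list that is empty when no payload is known for that URL.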

ninoseki mentioned this pull request Jun 9, 2018
jeromeleonard added this to the 1.11.0 milestone Jun 26, 2018
3c7 (Contributor, Author) commented Jul 2, 2018

Thanks for your contribution @ninoseki. :) I've made some changes to the report to display colored tags instead of text:

[Screenshot from 2018-07-02 10-08-03]

3c7 merged commit 562f273 into develop Jul 2, 2018
dadokkio deleted the feature/urlhaus-analyzer branch February 26, 2021 07:30