-
Notifications
You must be signed in to change notification settings - Fork 297
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Shadowserver 2022-01 update #2143
Conversation
I'm not sure if all of this PR was created on purpose. It contains PR#2134 and 6991597 (not as commit, but as a separate change?), but also tons of other changes in the shadowserver parser which I don't understand and some updates to the shadowserver collector, which need to be reviewed. Many updates to the tests, also new test which I appreciate highly! All in all you changes 142 files in just one commit. Can you please split that enormous commit into smaller ones? E.g. the shadowserver collector, update to the parser and test fixes, new tests or so. |
Codecov Report
@@ Coverage Diff @@
## develop #2143 +/- ##
===========================================
+ Coverage 76.30% 76.43% +0.13%
===========================================
Files 441 456 +15
Lines 23649 23937 +288
Branches 3739 3454 -285
===========================================
+ Hits 18046 18297 +251
- Misses 4864 4894 +30
- Partials 739 746 +7
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First of all, thank you for this PR, mus have been a lot of work!
There lots of changes which I am unable to review properly. Someone with access to the data and the API needs to cover these parts.
Maybe @monoidic or @waldbauer-certat ?
There are a few changes which should be documented in the NEWS file:
- change of the identifier of the blocklist, honeypot feeds, ip spoofer, MS sinkhole, ..... That will cause troubles for the users and should therefore be documented very well, e.g. with a table
- And I don't see a good reason to change "open" and "vulnerable" to "scan". "Scan" describes they way how the information was retrieved, but not what the event is about. The IoCs do not describe, that the address do scanning, but that they are vulnerable.
- change of
extra.naics
->extra.source.naics
From the formal side a summary of the changes is necessary to communicate to the users what they need to adapt in their workflows.
intelmq/tests/bots/parsers/shadowserver/test_event4_sinkhole_http_referer.py
Show resolved
Hide resolved
intelmq/tests/bots/parsers/shadowserver/testdata/scan_ssl_freak.csv
Outdated
Show resolved
Hide resolved
intelmq/tests/bots/parsers/shadowserver/testdata/scan_ssl_poodle.csv
Outdated
Show resolved
Hide resolved
I made a number of changes to address your comments. Please let me know if further changes are required. |
I have been trying to go through this massive PR and tbh I got interrupted frequently due to work. It's a bit ... huge. |
document the old parameter `country` and its status warn if used adapt the test
Are the incompatible configurations the changes to just the classification.identifier or does it include changes to the classification.type also? |
You are mixing two topics here. One is the runtime configuration of IntelMQ. You changed the first column of The other topic is a bigger issue. You have changed various mapped values, which means the resulting, parsed data is then different to what it was before. Until now, we have been very hesitant to do this, as it causes lots of work for the users. They need to check and adapt all processing steps after the shadowserver bots. Not only the bot's configurations (e.g. filtering, rules, notifications, ...), but also any further processing scripts outside the IntelMQ Core (and many users have a lot of them) and including changes to their databases to have consistent values for statistics etc. That's a huge effort for everyone. To at least aid in this last step, we have provided UPDATE-statements in the past. Just have a look a the NEWS file, you'll find some examples there. |
I am working on an revision that will set the classification.identifier to match the current Once that is done, the only substantial change to the mapped values impacts the configurations where extra.naics, extra.sector, and extra.sic were used instead of extra.source.naics, extra.source.sector, and extra.source.sic. This change should make it easier for users as the fields are consistent across all report types. I suppose another option would be to configure both the non-source and source versions for those three fields. Please let me know your preference. |
I was not sure what release version to use in I think the only outstanding issue is how to handle the source.naics/sic/sector changes. |
I completed the compatibility updates. Here is a diff showing the changes for each feed: I believe this PR is ready to be merged. Please let me know if you have any suggestions or concerns. |
According to @aaronkaplan, @elsif2 himself is responsible for this PR. |
Update to existing test cases to match current report types. New tests for added report types. pycodestyle fixes add testdata licenses pycodestyle fix Added reports parameter Suggested changes to the parser Proposed details for the release Test script updates for suggested changes Test input updates Realign columns Update compromised_website.csv Update scan_adb.csv Update scan_adb.csv Update scan_ftp.csv Update scan_ipp.csv Update scan_snmp.csv Realign columns Remove duplicates Changed malware.name to extra.infection Updated SPDX-FileCopyrightText shadowserver api: document and warn on old parameter document the old parameter `country` and its status warn if used adapt the test DOC: fix NEWS entry of PR#2143 Added the sector field to scan_amqp, scan_cwmp, and scan_vnc. Copyright and raw field updates Added the sector field to scan_amqp, scan_cwmp, and scan_vnc. Copyright updates Added phish_url and scan_modbus reports. Update source.url and source.fqdn for phish_url and malware_url reports. Update classification.taxonomy and classification.type for scan_modbus report. * additional field type validation changes * added count, bytes, duration, avg_pps, and max_pps fields to event_honeypot_ddos_amp * added 'protocol.application': 'https' to scan_ssl, scan_ssl_freak, and scan_ssl_poodle * added 'extra.tag' to scan_* and device_id Replaced scan_modbus with scan_ics Addeed event4_honeypot_ddos, event4_honeypot_ddos_target, scan_dvr_dhcpdiscover, and scan_socks. Tests for event4_honeypot_ddos. Tests for event4_honeypot_ddos_target. Tests for scan_dvr_dhcpdiscover. Tests for scan_socks. Rename file Rename file update:scan_mdns, scan_smb, and special; add:scan_ddos_middle_box cleanup renamed license files updated scan_mdns test files updated scan_smb test files updated special test files add scan_ddos_middlebox test files add scan_ddos_middlebox test updated schema Updated scan_smb tests Updated scan_ntp tests Updated scan_snmp tests New scan_docker test New scan_kubernetes test New scan_mysql test Updated report schema for June 2022 Added scan_epmd test Revert "Added scan_epmd test" This reverts commit 01edea1. Revert: Fix for recover_line method as commited in #2192 Added scan_couchdb Test case for scan_couchdb Added scan6_rpd Added/updated README with maintainer details Restored feed names and classification.identifiers to minimize upgrade impact. Merge repair pycodestyle repairs codespell fixes license compliance fixes pycodestyle fixes Feed configuration updates for compatibility with the original. Added scan_postgres test Added additional IPv6 aliases Fix for recover_line method as commited in #2192
I am not authorized to merge a pull request. I created a clean branch for this PR as shadowserver-202207. Please advise on how you would like me to proceed. |
I'm caught between two stools, as I don't know the details of your agreement with @aaronkaplan. Can you please check with him? I'd like to do a release of IntelMQ (3.1.0), and the shadowserver updates would be the major feature of the new release. |
Replaced by #2227 |
This commit is to update the Shadowserver bots to support all current report types.
Test Results