-
Notifications
You must be signed in to change notification settings - Fork 297
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Sieve Filter expert #1083
Sieve Filter expert #1083
Conversation
…nd forwards message.
I fixed the codestyle checks in https://github.com/antoinet/intelmq/commit/17c0d4d1d70903ff08f00a33ab6cb1a3276dc220 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Issues pointed out by the tests:
- don't use tabs in BOTS
- fix the codestyle:
- intelmq/bots/experts/sieve/expert.py:222:1: E305 expected 2 blank lines after class or function definition, found 1
- intelmq/bots/experts/sieve/validator.py:46:1: E305 expected 2 blank lines after class or function definition, found 1
- intelmq/bots/experts/sieve/validator.py:54:36: W292 no newline at end of file
- do not "rely" on the textx package during the module loading, have a look at e.g. the asn lookup: https://github.com/certtools/intelmq/blob/master/intelmq/bots/experts/asn_lookup/expert.py#L6 and https://github.com/certtools/intelmq/blob/master/intelmq/bots/experts/asn_lookup/expert.py#L15
- add a
test.skip_exotic()
decorator, so the textx won't be required for a test run if not explicitly requested like here: https://github.com/certtools/intelmq/blob/master/intelmq/tests/bots/experts/asn_lookup/test_expert.py#L42 - ipaddress is already in cpython >= 3.3, we do not support anything lower, so remove this requirement
Will look at the code once these issues are resolvend.
@antoinet Thanks. If you push to antoinet:develop, the commits are part of this PR |
See: #1083 (review) * removed tabs in BOTS * safely import textx package * added `skip_exotic` decorator to sievebot test class * removed package `ipaddress` from REQUIREMENTS.txt Other changes: * removed `print` statement from expert.py
* Fixed BOTS format * replaced re.fullsearch with re.search (method unavailable in python < 3.4) * removed Enum definition (unavailable in python < 3.4)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
failing tests for cymru can be ignored
Looks good overall.
I guess there is some room for optimization as some rules are evaulated every time.
See also https://github.com/antoinet/intelmq/pull/15 fixing some of the issues.
I am not very happy about just another configuration language in intelmq but I don't have better solution currently.
intelmq/bots/experts/sieve/expert.py
Outdated
Parameters: | ||
file: string | ||
""" | ||
from __future__ import unicode_literals |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We support python3 only, this is obsolete. Also, you may find isort useful.
intelmq/bots/BOTS
Outdated
@@ -572,6 +572,13 @@ | |||
"module": "intelmq.bots.experts.taxonomy.expert", | |||
"parameters": {} | |||
}, | |||
"Sievebot": { | |||
"description": "Sievebot is the bot responsible to filter and modify intelMQ events.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please be more verbose, say that the sieve language is used. E.g. 'This bot filters and modifies events based on a sieve-based language.'
intelmq/bots/experts/sieve/expert.py
Outdated
from intelmq.lib.bot import Bot | ||
|
||
try: | ||
import textx |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
imported but unused. You can shorten it by doing:
try:
from textx.metamodel import metamodel_from_file
from textx.exceptions import TextXError, TextXSemanticError
except ImportError:
metamodel_from_file = None
and then
if metamodel_from_file is None:
raise ...
intelmq/bots/experts/sieve/expert.py
Outdated
except TextXError as e: | ||
self.logger.error('Could not process sieve grammar file. Error in (%d, %d).', e.line, e.col) | ||
self.logger.error(str(e)) | ||
self.stop() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of these three lines simply do raise ValueError('...')
, everything else is handled by the Bot-class.
@@ -0,0 +1,2 @@ | |||
*.dot |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
because they might have used graphviz :) Anyway, yes. Not needed here.
intelmq/bots/experts/sieve/expert.py
Outdated
return self.process_ip_range_match(match.key, match.range, event) | ||
elif match.__class__.__name__ == 'Expression': | ||
return self.match_expression(match, event) | ||
pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unnecessary
|
||
try: | ||
addr = ipaddress.ip_address(event[key]) | ||
except ValueError: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did this happen in your tests? It should never, actually
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the event type is not enforced (any key can be specified as the left hand side of the <<
operator), the value is validated here. We did discuss whether or not to restrict this operator to keys with a type corresponding to an IP address (https://github.com/antoinet/intelmq/issues/11), and I just realized I didn't implement it the suggested way.
intelmq/bots/BOTS
Outdated
@@ -572,6 +572,13 @@ | |||
"module": "intelmq.bots.experts.taxonomy.expert", | |||
"parameters": {} | |||
}, | |||
"Sievebot": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The bot is called sieve everywhere else.
@@ -0,0 +1,3 @@ | |||
if comment == 'add field' { | |||
add destination.ip="150.50.50.10" | |||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
missing newlines everywhere
intelmq/bots/experts/sieve/expert.py
Outdated
return False | ||
|
||
def process_numeric_operator(self, lhs, op, rhs): | ||
return eval(str(lhs) + op + str(rhs)) # TODO graceful error handling |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible without eval? At least a strict checking of the arguments should be done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
Fix some issues in sieve expert
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had quite a few comments after more careful review. but overall this is a fantastic addition!
Most welcome enhancement would be to be able to specify the output pipeline with this sieve filter.
Then you could do a type of switch-case statement where the data should be flowing to.
@@ -0,0 +1,2 @@ | |||
*.dot |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
because they might have used graphviz :) Anyway, yes. Not needed here.
rule's conditions, the corresponding actions are performed. Actions can specify | ||
whether the event should be kept or dropped in the pipeline (filtering actions) | ||
or if keys and values should be changed (modification actions). | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be possible to specify a specific outgoing pipeline as an action?
Example:
if $condition
then
send-to "pipeline-A";
fi
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not supported by the core.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In our instance, we are already (ab)using a newly introduced key called keep
(boolean), which can be set anywhere in the pipeline. Downstream, events with keep==False
are dropped.
This can be generalized by introducing a new key, e.g. route
, with values that describe a specific destination. But this requires to put filter bots upstream of every destination such that only the corresponding events are kept.
If a routing model was provided in the core, we would certainly have use-cases for it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder why no issue existed for this which and created #1088 for it
intelmq/bots/experts/sieve/README.md
Outdated
## Parameters | ||
|
||
The sieve bot takes only one parameter: | ||
* `file` - filesystem path the the sieve file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
grammar: file system path of the sieve file
intelmq/bots/experts/sieve/README.md
Outdated
drop // aborts processing of subsequent rules and drops the event. | ||
} | ||
|
||
if source.ip << 192.0.0.0/24 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
according to the docs below, IP addresses must be quoted.
|
||
```if feed.name != 'acme-security' || feed.accuracy == 100 { ... }``` | ||
|
||
* `:contains` matches on substrings. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wish list: if we have contains, out of reasons of symmetry , may we have :notcontains
?
|
||
* `keep` marks the event to be forwarded to the next bot in the pipeline | ||
(same as the default behaviour), but in addition the sieve file processing is | ||
interrupted upon reaching this action. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So implicitly here you are saying that the default behaviour without keep
is to contineue with the matching ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the keep
and drop
actions explicitly interrupt the processing of the rules.
Without a mention of keep
or drop
, processing of subsequent rules continues. Afaik this is the same behavior as mail sieve.
|
||
### Comments | ||
|
||
Comments may be used in the sieve file: all characters after `//` and until the end of the line will be ignored. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wish-list: //
or #
as a comment character.
|
||
Use the following command to validate your sieve files: | ||
``` | ||
$ intelmq.bots.experts.sieve.validator |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should be called implicitly when the intelmqctl configtest
gets called.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is intelmqctl check
intelmq/bots/experts/sieve/expert.py
Outdated
return False | ||
|
||
def process_numeric_operator(self, lhs, op, rhs): | ||
return eval(str(lhs) + op + str(rhs)) # TODO graceful error handling |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
@@ -0,0 +1,3 @@ | |||
if source.ip << ['192.0.0.0/24', '169.254.0.0/16'] { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this does not seem to be documented in the docu. I mean that the <<
operator also works on lists.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, every operator works on lists. I will point that out.
I move that we merge this soon! |
- Remove .gitignore for .dot files - Typo and erroneous syntax in README.md - codestyle in validator.py
Codecov Report
@@ Coverage Diff @@
## develop #1083 +/- ##
===========================================
+ Coverage 73.93% 76.31% +2.38%
===========================================
Files 217 220 +3
Lines 9417 10338 +921
Branches 1283 1363 +80
===========================================
+ Hits 6962 7889 +927
+ Misses 2176 2142 -34
- Partials 279 307 +28
|
@wagner-certat and @aaronkaplan: I think I addressed all review points so far up to the
|
Concerning the If we one day support multiple source IP addresses or similar (with the IDEA format or whatever) everything blows up anyway... So I am in favour of just using the currently implemented behaviour with @aaronkaplan What do you think? |
I understand, its well pointed @wagner-certat . However, I think we are far way from that day and also, the day we change to multi source IP addresses we need to change multiple experts bots. IMHO, we should keep with our principle of using one key = value, therefore, the presented code is fine. |
This is for the Sievebot which was written in collaboration with @antoinet and @helderfernandes1279.
The sieve bot is used to filter and/or modify events based on a set of rules. The rules are specified in an external configuration file and with a syntax similar to the Sieve language used for mail filtering.
Each rule defines a set of matching conditions on received events. Events can be matched based on keys and values in the event. If the processed event matches a rule's conditions, the corresponding actions are performed. Actions can specify whether the event should be kept or dropped in the pipeline (filtering actions) or if keys and values should be changed (modification actions).
For more information, see README.md under "intelmq/bots/experts/sieve/README.md"