Added parent domain field to source, destination and url. #531
The parent domain field holds the domain with any sub-domain removed. The existing domain field is exact-match, so a search on it cannot find all connections to a domain when sub-domains are involved. The new field lets users store domains normalized with the public suffix list.
I made a feature request to add a TLD extract filter to Logstash.
@mbudge Thanks for creating this PR. I think there is wide agreement in the community for adding a field to capture the parent/higher level/registered domain. However, there is not agreement about what to call it :-). Please see Issue #84 for a further discussion. There is some support (#84 (comment)) for creating a field I have three suggestions:
@mbudge A related question please: Would you also propose to add related ECS fields for the subdomain portion of the domain? (i.e. the part left over after extracting the parent/higher level/registered part) Previous sentiment on adding such fields has been mixed, with some votes against it (#84 (comment)) but would love your thoughts.
Thanks for submitting this PR! I responded to you over on #84 before seeing this pull request :-) I would suggest only adding the new field for the parent / registered domain in this PR, for now. I'm not sure extracting "only the subdomain" to another field is as valuable. In the list of possible place where we need the new field, I would actually exclude those under Since we already went with the name
It's true, parent child traverses the DNS hierarchy. Submitted a new pull request to add registered_domain.
Great, thanks for creating the new PR!
The parent domain field is the domain without any sub-domain. The domain field is an exact-match keyword field, which means it is not possible to search for all connections to a domain when a sub-domain is involved. The parent domain field will allow users to store normalized domains using the public suffix list.
For example, the registered domain for "foo.malware.com" is "malware.com".
This value can be determined precisely with a list like the public suffix list (http://publicsuffix.org). Approximating it by simply taking the last two labels does not work well for TLDs such as "co.uk". If the parent domain normalization process fails, users should store the original domain, including the sub-domain, in the parent domain field. Punycode domains are more likely to fail with common TLD extract libraries that use the public suffix list to get the parent domain, so a best-effort approach means users can still search one field to find network connections more accurately.
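To make the suffix-list lookup and the best-effort fallback concrete, here is a minimal Python sketch. It is not the implementation proposed in this PR: the function name `registered_domain` and the hard-coded `SAMPLE_SUFFIXES` set are hypothetical, and the set is only a tiny stand-in for the real public suffix list. Wildcard and exception rules from the real list are ignored here.

```python
# Hypothetical, tiny stand-in for the public suffix list. A real pipeline
# would load the full list published at http://publicsuffix.org instead.
SAMPLE_SUFFIXES = {"com", "org", "uk", "co.uk"}


def registered_domain(domain, suffixes=SAMPLE_SUFFIXES):
    """Return the registered (parent) domain, or the normalized input on failure."""
    name = domain.strip().rstrip(".").lower()
    labels = name.split(".")
    # Walk from the longest candidate suffix to the shortest, so that
    # "co.uk" is matched before the shorter "uk".
    for i in range(1, len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in suffixes:
            # Keep exactly one label to the left of the matched suffix.
            return ".".join(labels[i - 1:])
    # Best effort: no known suffix matched, so keep the full domain
    # (sub-domain included) rather than dropping the value.
    return name


print(registered_domain("foo.malware.com"))
print(registered_domain("a.b.example.co.uk"))
print(registered_domain("host.internal"))
```

A naive "last two labels" rule would reduce "a.b.example.co.uk" to "co.uk", which is why the lookup prefers the longest matching suffix. The final `return name` is the fallback described above: when normalization fails, the value is still stored so a single field stays searchable.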
This is important in SIEM and log management functions, because users need to be able to find all logs when searching for a known-bad IOC domain. Users could index domains into an extra text field in their schema, but this is slow and expensive when searching many TB of data in Elasticsearch.