Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Timeouts for some collectors using "requests" #859

Merged
16 commits merged into from
May 29, 2017

Conversation

dmth
Copy link
Contributor

@dmth dmth commented Jan 25, 2017

This PR adds the capability of setting a timeout to the collector_http and collector_mail_url.

If a timeout occurs, the connection is tested again for two times. If the request fails three time in a row, the collector will start processing again, when the rate_limit interval was exceeded.

This PR is in conflict with #835 (comment)
This PR fixes #854 and #666

Dustin Demuth and others added 4 commits January 23, 2017 15:48
Signed-off-by: Sebastian Wagner <[email protected]>

Conflicts:
	docs/Bots.md
	intelmq/bots/collectors/http/collector_http.py
	intelmq/bots/collectors/mail/collector_mail_url.py
@codecov-io
Copy link

codecov-io commented Jan 26, 2017

Codecov Report

Merging #859 into master will decrease coverage by 0.23%.
The diff coverage is 11.11%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #859      +/-   ##
==========================================
- Coverage   78.21%   77.98%   -0.24%     
==========================================
  Files         221      221              
  Lines        9015     9047      +32     
==========================================
+ Hits         7051     7055       +4     
- Misses       1964     1992      +28
Impacted Files Coverage Δ
...telmq/bots/experts/ripencc_abuse_contact/expert.py 88.88% <ø> (ø) ⬆️
...elmq/bots/collectors/http/collector_http_stream.py 30% <ø> (ø) ⬆️
intelmq/bots/outputs/restapi/output.py 91.3% <ø> (ø) ⬆️
intelmq/bots/collectors/rt/collector_rt.py 15.21% <ø> (ø) ⬆️
intelmq/bots/collectors/mail/collector_mail_url.py 20% <0%> (-6.83%) ⬇️
intelmq/bots/collectors/http/collector_http.py 23.68% <0%> (-8.46%) ⬇️
intelmq/lib/bot.py 69.33% <44.44%> (-0.6%) ⬇️
intelmq/bots/experts/cymru_whois/lib.py 87.23% <0%> (+1.06%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b8af21d...9ba056e. Read the comment docs.

ghost
ghost previously requested changes Jan 26, 2017
timeoutretries += 1
self.logger.warn("Timeout whilst downloading the report.")

if timeoutretries >= 3:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to have this configurable?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be reasonable.

# The download timed out too often, leave the Loop.
continue

self.logger.debug(resp.content)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this left from debugging?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

uuuuh yes.

@ghost ghost requested a review from SYNchroACK January 26, 2017 10:26
@ghost ghost dismissed their stale review January 26, 2017 13:01

fixed

@dmth
Copy link
Contributor Author

dmth commented Jan 26, 2017

To Merge this PR intelmq/bots/collectors/http/collector_http_stream.py and intelmq/bots/collectors/rt/collector_rt.py need to be adapted to the new parameters

@ghost ghost added this to the v1.0 Stable Release milestone Jan 30, 2017
@ghost ghost self-assigned this Jan 30, 2017
ghost
ghost previously requested changes Jan 30, 2017
Copy link

@ghost ghost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

http_timeout_max_tries is not documented
no changelog entry

timeoutretries = 0
resp = None

while timeoutretries < self.http_timeout_max_tries and resp is None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wagner-certat can you validate this with me:

Is the Error Handling features documented here capable to do the same thing as this pull request want to do? I'm not 100% sure but it seems the same.

Copy link

@ghost ghost Jan 30, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, except that this is specific to timeouts. However, as the error_max_retries can be set to 3 for theses both collectors, I actually think that keeping the number of configuration parameters low is more important than handling the errors differently. So maybe just let the request.get raise the timeouterror?

Copy link
Member

@aaronkaplan aaronkaplan Apr 26, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1. I agree that it's good practice to keep the number of config parameters low. if possible.

Copy link
Contributor

@SYNchroACK SYNchroACK left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dmth to fix the timeout issue there is only need to add 3 lines as far I can see:

try:
    resp = requests.get(url=self.parameters.http_url, auth=self.auth,
                        proxies=self.proxy, headers=self.http_header,
                        verify=self.http_verify_cert,
                        cert=self.ssl_client_cert,
                        timeout=self.http_timeout)
except requests.exceptions.Timeout:
    raise requests.exceptions.Timeout('Timeout whilst downloading the report.')

as a User, you go to configuration and change as you want according to this Error Handling documentation

Let me know if there is anything that I miss on my analysis of the problem. Thank you for raising this one. :)

@ghost
Copy link

ghost commented Jan 30, 2017

to fix the timeout issue there is only need to add 3 lines as far I can see:

Why do you catch and raise the exact same error at all?

@SYNchroACK
Copy link
Contributor

Why do you catch and raise the exact same error at all?

:D:D:D:D heh I only added that just to have a specific message, if it's important but in practice, I agree with you @wagner-certat , there is no need to fix for that

@dmth
Copy link
Contributor Author

dmth commented Jan 31, 2017

From my point of view it's better to have an explicit timeout behaviour than using the built in error handling.
error_max_retries depicts a behaviour which is occurring when the bot fails due to a most likely internal error or because some data-format has changed.
http_timeout_max_tries on the other hand clearly states a behaviour which only occurs when talking to an external service.

Based on my experience users, and administrators as well, might want to differentiate this behaviour.

What might be a step to simplify configuration?
If http_timeout_max_tries is not set, use error_max_retries as fallback.

BTW: The Guide https://github.com/certtools/intelmq/blob/master/docs/User-Guide.md#error-handling is buggy:

error_max_retries - in case of an error and the value of the error_procedure option is retry

But: error_procedure does only list stop and pass

ghost pushed a commit that referenced this pull request Jan 31, 2017
there's not retry mode, that's standard behavior

Thanks @dmth
#859 (comment)
related: #859

Signed-off-by: Sebastian Wagner <[email protected]>
@ghost
Copy link

ghost commented Jan 31, 2017

BTW: The Guide https://github.com/certtools/intelmq/blob/master/docs/User-Guide.md#error-handling is buggy:

error_max_retries - in case of an error and the value of the error_procedure option is retry

But: error_procedure does only list stop and pass

Thanks, fixed: 1f51da9

@swilde
Copy link
Contributor

swilde commented Feb 3, 2017

FWIW I'm with @dmth here:
problems preventing a download from finishing are quite common and often related to infrastructure not under control of the intelmq-admin. Handling of such situations should be configurable with out much side effects. Networking related errors are always something to expect, in contrast I'd consider error_max_retries a configuration option for handling of less expected errors.

@ghost ghost added component: bots feature Indicates new feature requests or new features needs: discussion labels Feb 15, 2017
Dustin Demuth added 2 commits March 1, 2017 11:58
@dmth
Copy link
Contributor Author

dmth commented Mar 1, 2017

I've upgraded the documentation for the http_max_retries parameter. In addition I renamed the parameter for the stream collector in order to use the same name.
However, the stream collector does not honor the max_retries parameter.
This should satisfy #859 (review)

@ghost ghost requested a review from aaronkaplan March 1, 2017 11:47
@ghost ghost dismissed their stale review March 1, 2017 11:48

updated code

@ghost
Copy link

ghost commented Mar 1, 2017

This PR now proposes a mix of http_timeout and http_timeout_sec:

> grep http_timeout * -r
bots/collectors/http/collector_http_stream.py:http_timeout: tuple of two floats or float
bots/collectors/http/collector_http_stream.py:                               timeout=self.http_timeout)
bots/collectors/http/collector_http.py:http_timeout: tuple of two floats or float
bots/collectors/http/collector_http.py:                                    timeout=self.http_timeout_sec)
Binary file bots/collectors/http/__pycache__/collector_http.cpython-34.pyc matches
Binary file bots/collectors/http/__pycache__/collector_http_stream.cpython-34.pyc matches
Binary file bots/collectors/rt/__pycache__/collector_rt.cpython-34.pyc matches
bots/collectors/rt/collector_rt.py:                                    timeout=self.http_timeout)
Binary file bots/collectors/mail/__pycache__/collector_mail_url.cpython-34.pyc matches
bots/collectors/mail/collector_mail_url.py:                                                    timeout=self.http_timeout_sec)
etc/defaults.conf:    "http_timeout_sec": 30,
lib/bot.py:        self.http_timeout_sec = getattr(self.parameters, 'http_timeout_sec', None)

more clear parameter http_timeout_sec instead of
http_timeout
@dmth
Copy link
Contributor Author

dmth commented Mar 1, 2017

This PR now proposes a mix of http_timeout and http_timeout_sec:

The mix is not intentional. I updated the RT collector and documentation correspondingly.

@dmth
Copy link
Contributor Author

dmth commented Mar 22, 2017

@aaronkaplan I really wonder why this cannot be merged...

  1. It provides a more clear naming for the timeout parameters (http_timeout_sec instead of http_timeout)
  2. It enables an administrator to control how often downloads are retried (http_timeout_max_tries)
  3. Experienced administrators told us, this is a reasonable way to go. (Timeouts for some collectors using "requests" #859 (comment))

From reviewing the latest commits, it looks like intelmq/bots/experts/ripencc_abuse_contact/expert.py (a1a09ac) and intelmq/bots/outputs/restapi/output.py (a918ec7) would also require such renaming now. Right now I'm not sure if I want to keep this PR up to date. It's a little bit wearing....

grep -r http_timeout **/*.py 
intelmq/bots/collectors/http/collector_http.py:    http_timeout_sec: tuple of two floats or float
intelmq/bots/collectors/http/collector_http.py:    http_timeout_max_tries: an integer depicting how often a connection attempt is retried
intelmq/bots/collectors/http/collector_http.py:        while timeoutretries < self.http_timeout_max_tries and resp is None:
intelmq/bots/collectors/http/collector_http.py:                                    timeout=self.http_timeout_sec)
intelmq/bots/collectors/http/collector_http.py:        if resp is None and timeoutretries >= self.http_timeout_max_tries:
intelmq/bots/collectors/http/collector_http_stream.py:    http_timeout_sec: tuple of two floats or float
intelmq/bots/collectors/http/collector_http_stream.py:                               timeout=self.http_timeout_sec)
intelmq/bots/collectors/mail/collector_mail_url.py:                        while timeoutretries < self.http_timeout_max_tries and resp is None:
intelmq/bots/collectors/mail/collector_mail_url.py:                                                    timeout=self.http_timeout_sec)
intelmq/bots/collectors/mail/collector_mail_url.py:                        if resp is None and timeoutretries >= self.http_timeout_max_tries:
intelmq/bots/collectors/rt/collector_rt.py:                                    timeout=self.http_timeout_sec)
intelmq/bots/experts/ripencc_abuse_contact/expert.py:                                timeout=self.http_timeout)
intelmq/bots/experts/ripencc_abuse_contact/expert.py:                                timeout=self.http_timeout)
intelmq/bots/experts/ripencc_abuse_contact/expert.py:                                timeout=self.http_timeout)
intelmq/bots/outputs/restapi/output.py:                              timeout=self.http_timeout,
intelmq/lib/bot.py:        self.http_timeout_sec = getattr(self.parameters, 'http_timeout_sec', None)
intelmq/lib/bot.py:        self.http_timeout_max_tries = getattr(self.parameters, 'http_timeout_max_tries', 1)
intelmq/lib/bot.py:        self.http_timeout_max_tries = self.http_timeout_max_tries if self.http_timeout_max_tries >= 1 else 1

@ghost ghost removed the needs: discussion label May 15, 2017
@ghost
Copy link

ghost commented May 15, 2017

This PR fixes #854 and #666

This PR will not fix #666

@ghost
Copy link

ghost commented May 15, 2017

Would be great if someone from @Intevation could check my changes. I'll then merge it.

@ghost ghost force-pushed the dev-collector-timeout branch from 55535ff to 002654a Compare May 15, 2017 15:24
@ghost ghost force-pushed the dev-collector-timeout branch from 002654a to 9ba056e Compare May 15, 2017 15:24
@dmth
Copy link
Contributor Author

dmth commented May 24, 2017

Would be great if someone from @Intevation could check my changes. I'll then merge it.

Thanks for enhancing the code. Looks good, please go ahead.

@ghost ghost merged commit 9ba056e into certtools:master May 29, 2017
@dmth dmth deleted the dev-collector-timeout branch June 29, 2017 13:21
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component: bots feature Indicates new feature requests or new features
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants