try loading trusted certs from a list of fallbacks #633
src/OpenSSL/SSL.py
Outdated
""" | ||
for cafile in _CERTIFICATE_FILE_LOCATIONS: | ||
if os.path.isfile(cafile): | ||
self.load_verify_locations(cafile) |
If the file exists but is invalid this will currently raise an exception. Do we want to catch this and silently swallow it?
Welcome to the catch-22 of this approach.
If you throw an exception then users on, say, Gentoo, are going to be real mad that they can no longer use the system trust store just because they put some random data at "/etc/pki/tls/cacert.pem", a path that has no special meaning on their system. If you don't throw an exception then you're silently swallowing potentially important errors. Maybe log?
Fun follow-on: should you load all certs at all these places? Does this open the user up to risks about having unexpected or malicious root certs elsewhere in the system suddenly be trusted by Python but nothing else? Doesn't this lead to PyOpenSSL behaving fundamentally differently to their native applications? Does this fundamentally mean that we should all abandon PyOpenSSL and just use PEP 543 instead? ARE WE NOT ALL DOOMED TO DIE ALONE?
I don't like the approach, either. I could be persuaded if you detect the distribution first (e.g. by parsing /etc/os-release and using ID and ID_LIKE, https://www.freedesktop.org/software/systemd/man/os-release.html) and only handle distribution-specific paths for cafile and capath.
@Lukasa these are all system directories where I wouldn't expect users to be dropping random files very often. And, if they do, the same problem applies to Go, which honestly seems like a reasonable defense.
Maybe raising a UserWarning if a file is found but is invalid? Or we could continue the loop and look for something else.
@tiran It's impractical to enumerate every possible distribution. What makes you think Go's approach doesn't work? It's not like Go binaries are unusual things and every TLS client connection made by one is doing trust roots in this fashion.
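The "warn and keep looking" option mentioned above could look roughly like the sketch below. It is not the code in this PR: the function name load_first_usable_cafile and its loader parameter are hypothetical, and the standard library ssl module stands in for pyOpenSSL so the example runs on its own.

```python
import os
import ssl
import warnings


def load_first_usable_cafile(candidates, loader):
    """Try each candidate bundle in order and return the first that loads.

    `loader` is any callable that raises on an invalid bundle, for example
    the bound method `ctx.load_verify_locations`. Both names are
    hypothetical and exist only for this sketch.
    """
    for cafile in candidates:
        if not os.path.isfile(cafile):
            continue
        try:
            loader(cafile)
        except (ssl.SSLError, OSError) as exc:
            # Surface the problem instead of swallowing it, then keep looking.
            warnings.warn(
                "ignoring unusable CA bundle %s: %s" % (cafile, exc),
                UserWarning,
            )
            continue
        return cafile
    return None
```

With a real context this would be called as load_first_usable_cafile(paths, ctx.load_verify_locations), where ctx is an ssl.SSLContext. Whether a warning is the right severity is exactly the open question in this thread.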
Go does more work to find the actual location. It takes the env vars into account and stops looking for certs when it finds a directory that contains at least one cert. https://github.com/golang/go/blob/master/src/crypto/x509/root_unix.go#L63
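A rough Python sketch of the behavior described above (env var first, then stop at the first directory that has something in it). This is an illustration, not a port of the Go code: pick_cert_dir is a hypothetical name, the env var is assumed to be OpenSSL's usual SSL_CERT_DIR, and "contains at least one cert" is approximated as "is not empty".

```python
import os


def pick_cert_dir(candidate_dirs, environ=None, dir_env_var="SSL_CERT_DIR"):
    """Return the directory to use for trust roots, or None.

    An explicit env var override wins. Otherwise the first candidate
    directory that can be listed and is non-empty is chosen, and the
    search stops there.
    """
    environ = os.environ if environ is None else environ
    override = environ.get(dir_env_var)
    if override:
        return override
    for directory in candidate_dirs:
        try:
            entries = os.listdir(directory)
        except OSError:
            # Missing or unreadable: try the next candidate.
            continue
        if entries:
            return directory
    return None
```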
Maybe related: I do something similar in Portable PyPy, which ships OpenSSL as well: https://github.com/squeaky-pl/portable-pypy/blob/master/ssl3.py.patch
30,000 downloads after this change and nobody complained, which of course doesn't mean it's the right thing to do.
/cc @dstufft
src/OpenSSL/SSL.py
Outdated
@@ -701,6 +711,35 @@ def set_default_verify_paths(self): | |||
""" | |||
set_result = _lib.SSL_CTX_set_default_verify_paths(self._context) | |||
_openssl_assert(set_result == 1) | |||
num = self._check_num_store_objects() |
SSL_CTX uses lazy loading for the capath directory. The check will always return 0 when a system has only a capath and not a cafile.
Ah, that's definitely a problem!
You have to detect capath yourself. https://github.com/python/cpython/blob/master/Lib/ssl.py#L335 takes the env vars into account. These four functions return the hard-coded paths and the names of the env vars:
X509_get_default_cert_file_env()
X509_get_default_cert_file()
X509_get_default_cert_dir_env()
X509_get_default_cert_dir()
The names of the files in capath match the pattern [0-9a-f]{8}\.[0-9]
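For reference, CPython exposes the values of those four functions through ssl.get_default_verify_paths(). The sketch below reads them with the standard library rather than pyOpenSSL's _lib bindings; describe_default_verify_paths and the dictionary keys are hypothetical names chosen for this example.

```python
import os
import ssl


def describe_default_verify_paths(environ=None):
    """Report OpenSSL's compiled-in trust paths and whether env vars override them."""
    environ = os.environ if environ is None else environ
    paths = ssl.get_default_verify_paths()
    return {
        # Names of the env vars OpenSSL consults.
        "file_env_var": paths.openssl_cafile_env,
        "dir_env_var": paths.openssl_capath_env,
        # Paths hard-coded at OpenSSL build time.
        "compiled_cafile": paths.openssl_cafile,
        "compiled_capath": paths.openssl_capath,
        # True if the user has pointed OpenSSL somewhere explicitly.
        "env_override": (
            paths.openssl_cafile_env in environ
            or paths.openssl_capath_env in environ
        ),
    }
```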
The env var behavior is still present and functional because we call SSL_CTX_set_default_verify_paths before we try the fallback, but we need to fix the lazy loading issue.
Do you know of a way to force the X509_STORE to load the certs? That seems like an easier path (if possible) than trying to replicate the logic.
So I tried this with pip. Some of the problems there won't apply here, because you're able to pass both a cafile and a capath (which we couldn't, because requests won't allow it) but some are still going to matter:
Honestly, I suggest just depending on certifi.
My personal suggestion is that we hew as closely to what Go does as possible: https://github.com/golang/go/blob/master/src/crypto/x509/root_unix.go (with the addition of trying whatever's compiled into OpenSSL first) and if that doesn't work it's the user's fault for having a weirdass computer.
(Notably what Go is doing does not match what we're doing here)
@dstufft do you remember what systems only had CAPath? And wouldn't the CAPath and CAFile requirement be an OpenSSL-specific bug? We're shipping our own OpenSSL so bugs like that wouldn't affect us.
We don't need to care about non-Linux because we won't be shipping a wheel that causes this problem on those platforms.
It looks like
are potentially relevant from Go's list.
On Fedora and RHEL, the
When I did a similar thing for Portable PyPy I found https://www.happyassassin.net/2015/01/12/a-note-about-ssltls-trusted-certificate-stores-and-platforms/ very helpful.
Codecov Report
@@ Coverage Diff @@
## master #633 +/- ##
==========================================
+ Coverage 96.78% 97.02% +0.24%
==========================================
Files 18 18
Lines 5625 6084 +459
Branches 390 497 +107
==========================================
+ Hits 5444 5903 +459
- Misses 121 122 +1
+ Partials 60 59 -1
Continue to review full report at Codecov.
Thanks for the info @squeaky-pl!
pyca/cryptography will shortly begin shipping a wheel. Since SSL_CTX_set_default_verify_paths uses a hardcoded path compiled into the library, this will start failing to load the proper certificates for users on many Linux distributions. To avoid this we can use the Go solution of iterating over a list of potential candidates and loading the first one found.
This now checks whether the env vars are set and whether the dir exists and has valid certs in it. If either of those is true (or the number of certs is > 0) it won't load the fallback. If it does fall back, it will also attempt to load certs from a dir as a final fallback.
@@ -130,6 +132,19 @@ class _buffer(object): | |||
SSL_CB_HANDSHAKE_START = _lib.SSL_CB_HANDSHAKE_START | |||
SSL_CB_HANDSHAKE_DONE = _lib.SSL_CB_HANDSHAKE_DONE | |||
|
|||
# Taken from https://golang.org/src/crypto/x509/root_linux.go |
Do we care about the BSD, Plan9 (lol), or Solaris values?
We won't ship a precompiled binary for those platforms so we shouldn't need to care.
src/OpenSSL/SSL.py
Outdated
file_env_var = _ffi.string( | ||
_lib.X509_get_default_cert_file_env() | ||
).decode("ascii") | ||
if not self._verify_env_vars_set(dir_env_var, file_env_var): |
Please name this check_env_vars_set.
src/OpenSSL/SSL.py
Outdated
@@ -699,8 +714,98 @@ def set_default_verify_paths(self): | |||
|
|||
:return: None | |||
""" | |||
# This function will attempt to load certs from both a cafile and | |||
# capath that are set at compile time. However, it will first check | |||
# environment variables and, if present, load those paths instead |
This comment feels like it's in the wrong place. It checks the default paths before manually doing the env vars.
SSL_CTX_set_default_verify_paths itself checks env vars. It's part of OpenSSL itself. The logic here should look like this:
- Attempt to load via SSL_CTX_set_default_verify_paths. This will use the env vars specified by OpenSSL preferentially, but if they're not set (as they are not in most cases) then it will load the CA file and CA dir that were set at compile time.
- If env vars are set we do not attempt to load any fallbacks, but if they are not we go to the fallback path.
Ah, ok, I see the ambiguity: "This function" refers to SSL_CTX_set_default_verify_paths but I thought it meant "the function we are in". Can you change the comment?
src/OpenSSL/SSL.py
Outdated
# objects are present. However, the cert directory (capath) is | ||
# lazily loaded and num will always be zero so we need to check if | ||
# the dir exists and has valid file names in it to cover that case. | ||
num = self._check_num_store_objects() |
What happens if you load your own roots and then call set_default? What should happen?
Prior to this patch it would load the system trust store properly even if you already had certs added. With this patch it would fail to do so. That is not what we want. Ideas? We could potentially track calls to load_verify_locations and run the fallback if it hasn't been called previously, but that is pretty ugly.
What if, instead of checking the number of certs in the store, we checked to see if the default cert file path and default cert dir path don't exist? If neither is present then we'd use fallbacks.
Count the number of certs at the start, and instead of checking if the number after calling SSL_CTX_set... is zero, check if it's the same as at the start. (And add a test for it!)
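The before/after comparison suggested above might look like the sketch below. It is written against the standard library's ssl.SSLContext (cert_store_stats and set_default_verify_paths) rather than pyOpenSSL's Context, and defaults_added_certs is a hypothetical name. Note the caveat raised earlier in this review still applies: a capath directory is loaded lazily, so on a capath-only system the count will not change.

```python
def defaults_added_certs(ctx):
    """Return True if loading the default verify paths grew the cert store.

    `ctx` is expected to behave like ssl.SSLContext. Comparing against the
    starting count (instead of against zero) keeps the check meaningful
    when the caller has already loaded its own roots.
    """
    before = ctx.cert_store_stats()["x509"]
    ctx.set_default_verify_paths()
    after = ctx.cert_store_stats()["x509"]
    return after != before
```

A real call would pass something like ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT).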
That also works, although I'm wondering if it'd make more sense to just say if the file is there then we shouldn't need fallbacks.
Another idea would be that we could compile our OpenSSL such that the default dir and file are unique. Like /pyca/cryptography/openssl/cert.pem. Then our entire fallback check could just be "is the default dir/file this value", because that would be a guard value to tell us we're a manylinux1 build.
So you're saying dump most of this logic and instead check "if this is a pyca/cryptography OpenSSL, then try these known distro paths"?
Yeah, I didn't think of that until just now so I'm not sure I've thought through the implications, but it seems like rather than trying to detect if a system has successfully loaded roots maybe we're better off only modifying our behavior when we know we're running under a manylinux1 cryptography build.
src/OpenSSL/SSL.py
Outdated
:return: bool | ||
""" | ||
return ( | ||
os.environ.get(file_env_var, None) is not None or |
You can drop the None arg to get.
src/OpenSSL/SSL.py
Outdated
return any( | ||
[re.match(b'^[0-9a-f]{8}\.[0-9]', x) is not None for x in l] | ||
) | ||
except OSError: |
Only the os.listdir should be wrapped in the except, and only ENOENT should be caught; anything else can be reraised. Maybe EPERM as well, not sure.
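A sketch of the narrower error handling requested above: only the os.listdir call sits inside the try, only ENOENT is treated as "no certs here", and everything else propagates. The function name dir_has_hashed_certs is hypothetical, and this version matches str file names rather than the bytes names used in the diff.

```python
import errno
import os
import re

# OpenSSL's hashed-directory naming scheme, as quoted earlier in this review.
_HASHED_NAME = re.compile(r"^[0-9a-f]{8}\.[0-9]")


def dir_has_hashed_certs(path):
    """Return True if `path` contains at least one hashed cert file name."""
    try:
        names = os.listdir(path)
    except OSError as exc:
        # A missing directory simply means "nothing to load"; any other
        # error (permissions, not a directory, ...) is re-raised.
        if exc.errno != errno.ENOENT:
            raise
        return False
    return any(_HASHED_NAME.match(name) for name in names)
```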
tests/test_ssl.py
Outdated
try: | ||
dir_var = "CUSTOM_DIR_VAR" | ||
file_var = "CUSTOM_FILE_VAR" | ||
os.environ[dir_var] = "value" |
Use monkeypatch for fucking with the env.
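For illustration, the test could be restructured around pytest's monkeypatch fixture along these lines, so the environment is restored automatically after each test. The helper env_vars_set is a standalone stand-in for the method under review (it also drops the redundant None default), and the test names are hypothetical.

```python
import os


def env_vars_set(dir_env_var, file_env_var):
    """Stand-in for the helper under review: is either env var present?"""
    return (
        os.environ.get(file_env_var) is not None
        or os.environ.get(dir_env_var) is not None
    )


def test_env_vars_set(monkeypatch):
    # monkeypatch undoes these changes when the test finishes.
    monkeypatch.setenv("CUSTOM_DIR_VAR", "value")
    monkeypatch.delenv("CUSTOM_FILE_VAR", raising=False)
    assert env_vars_set("CUSTOM_DIR_VAR", "CUSTOM_FILE_VAR")


def test_env_vars_unset(monkeypatch):
    monkeypatch.delenv("CUSTOM_DIR_VAR", raising=False)
    monkeypatch.delenv("CUSTOM_FILE_VAR", raising=False)
    assert not env_vars_set("CUSTOM_DIR_VAR", "CUSTOM_FILE_VAR")
```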
👍
I kind of think that's a reasonable idea. Would like feedback from other wise people like @dstufft.
src/OpenSSL/SSL.py
Outdated
"/etc/ssl/certs", # SLES10/SLES11 | ||
] | ||
|
||
_CRYPTOGRAPHY_MANYLINUX1_CA_DIR = "/pyca/cryptography/openssl/certs" |
These values will need to be set by using --openssldir=/pyca/cryptography/openssl --prefix=/pyca/cryptography/openssl as args to the configure script in pyca/infra#98
Is the point of these values just to act as a sigil to detect if we're inside a manylinux1 wheel or not? Do we expect people to ever put anything inside of these directories?
I think the answer is it's a sigil, we don't expect people to put stuff there, but as an intermediate hack it's maybe handy that someone could put something there in a pinch.
If we expect people to maybe put something there in a pinch, maybe it should conform to a more standard location like /opt/pyca/cryptography/openssl/ or something? Not a big deal either way though, but it feels weird to have a top level /pyca/ path.
I'll go ahead and make the change. Hopefully users will never need to put something in that path, but I guess it's not crazy to want the path to be sane so that it could be done.
Looks good to me. Would also like a review from @dstufft or someone else, and I think we should hold off on merging until we finalize the manylinux1 side of this.
pyca/infra#98 has landed and pyca/cryptography#3736 is now in review. This should no longer be blocked.
This looks good to me, but I'd like additional reviews from other folks!
pyca/cryptography will shortly begin shipping a wheel. Since SSL_CTX_set_default_verify_paths uses a hardcoded path compiled into the library, this will start failing to load the proper certificates for users on many linux distributions. To avoid this we can use the Go solution of iterating over a list of potential candidates and loading it when found.
fixes #632
Update: The approach has been modified to use the default cert file and default cert dir to detect whether the installed cryptography is sourced from a manylinux1 wheel. If it is (and the OpenSSL env vars that override default dirs are not set) then we load the fallbacks. This should address most if not all of the previous concerns that have been raised.
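The decision described in the update can be sketched as below. This is an approximation of the idea, not the merged code: should_load_fallbacks is a hypothetical name, and the two sentinel paths are assumptions based on the /opt/pyca/cryptography/openssl/ location proposed in review.

```python
import os

# Assumed sentinel defaults compiled into the manylinux1 OpenSSL build.
# The exact values are an assumption for this sketch.
MANYLINUX1_CA_FILE = "/opt/pyca/cryptography/openssl/cert.pem"
MANYLINUX1_CA_DIR = "/opt/pyca/cryptography/openssl/certs"


def should_load_fallbacks(default_file, default_dir,
                          file_env_var, dir_env_var, environ=None):
    """Decide whether to try the distro fallback paths.

    `default_file` and `default_dir` are the paths compiled into OpenSSL;
    `file_env_var` and `dir_env_var` are the names of the env vars that
    override them.
    """
    environ = os.environ if environ is None else environ
    # If the user set either override, respect it and do nothing else.
    if (environ.get(file_env_var) is not None
            or environ.get(dir_env_var) is not None):
        return False
    # Only change behavior when the sentinel paths identify a wheel build.
    return (default_file == MANYLINUX1_CA_FILE
            and default_dir == MANYLINUX1_CA_DIR)
```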