Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

try loading trusted certs from a list of fallbacks #633

Merged
merged 17 commits into from
Jun 29, 2017

Conversation

reaperhulk
Copy link
Member

@reaperhulk reaperhulk commented Jun 8, 2017

pyca/cryptography will shortly begin shipping a wheel. Since SSL_CTX_set_default_verify_paths uses a hardcoded path compiled into the library, this will start failing to load the proper certificates for users on many linux distributions. To avoid this we can use the Go solution of iterating over a list of potential candidates and loading it when found.

fixes #632

Update: The approach has been modified to use the default cert file and default cert dir to detect whether the installed cryptography is sourced from a manylinux1 wheel. If it is (and the OpenSSL env vars that override default dirs are not set) then we load the fallbacks. This should address most if not all of the previous concerns that have been raised.

"""
for cafile in _CERTIFICATE_FILE_LOCATIONS:
if os.path.isfile(cafile):
self.load_verify_locations(cafile)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the file exists but is invalid this will currently raise an exception. Do we want to catch this and silently swallow it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Welcome to the catch-22 of this approach.

If you throw an exception then users on, say, Gentoo, are going to be real mad that they can no longer use the system trust store just because they put some random data at "/etc/pki/tls/cacert.pem", a path that has no special meaning on their system. If you don't throw an exception then you're silently swallowing potentially important errors. Maybe log?

Fun follow-on: should you load all certs at all these places? Does this open the user up to risks about having unexpected or malicious root certs elsewhere in the system suddenly be trusted by Python but nothing else? Doesn't this lead to PyOpenSSL behaving fundamentally differently to their native applications? Does this fundamentally mean that we should all abandon PyOpenSSL and just use PEP 543 instead? ARE WE NOT ALL DOOMED TO DIE ALONE?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like the approach, too. I could be persuaded if you detect the distribution first (e.g. by parsing /etc/os-release and using ID and ID_LIKE, https://www.freedesktop.org/software/systemd/man/os-release.html) and only handle distribution specific paths for cafile and capath.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Lukasa these are all system directories where I wouldn't expect users to be dropping random files very often. And, if they do, the same problem applies to Go, which honestly seems like a reasonable defense.

Maybe raising a UserWarning if a file is found but is invalid? Or we could continue the loop and look for something else.

@tiran It's impractical to enumerate every possible distribution. What makes you think Go's approach doesn't work? It's not like Go binaries are unusual things and every TLS client connection made by one is doing trust roots in this fashion.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Go does more work to find the actual location. It takes the env vars into account and stops looking for certs when it finds a directory that contains at least on cert. https://github.com/golang/go/blob/master/src/crypto/x509/root_unix.go#L63

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe related, I do something similar to in Portable PyPy which ships OpenSSL as well: https://github.com/squeaky-pl/portable-pypy/blob/master/ssl3.py.patch

30.000 downloads after this change and nobody complained, which of course doesn't mean it's a right thing to do.

@Lukasa
Copy link
Member

Lukasa commented Jun 8, 2017

/cc @dstufft

@Lukasa Lukasa requested a review from dstufft June 8, 2017 08:18
@@ -701,6 +711,35 @@ def set_default_verify_paths(self):
"""
set_result = _lib.SSL_CTX_set_default_verify_paths(self._context)
_openssl_assert(set_result == 1)
num = self._check_num_store_objects()
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SSL_CTX uses lazy loading for capath directory. The check will always return 0 when a system has only capath and not a cafile.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, that's definitely a problem!

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You have to detect capath yourself. https://github.com/python/cpython/blob/master/Lib/ssl.py#L335 take the env vars into account. The four function return the hard-coded paths and the names of the env vars:

X509_get_default_cert_file_env()
X509_get_default_cert_file()
X509_get_default_cert_dir_env()
X509_get_default_cert_dir()

The names of the files in capath match the pattern [0-9a-f]{8}\.[0-9]

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The env var behavior is still present and functional because we call SSL_CTX_set_default_verify_paths before we try the fallback, but we need to fix the lazy loading issue.

Do you know of a way to force the X509_STORE to load the certs? That seems like an easier path (if possible) than trying to replicate the logic.

"""
for cafile in _CERTIFICATE_FILE_LOCATIONS:
if os.path.isfile(cafile):
self.load_verify_locations(cafile)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like the approach, too. I could be persuaded if you detect the distribution first (e.g. by parsing /etc/os-release and using ID and ID_LIKE, https://www.freedesktop.org/software/systemd/man/os-release.html) and only handle distribution specific paths for cafile and capath.

@dstufft
Copy link
Member

dstufft commented Jun 8, 2017

So I tried this with pip. Some of the problems there won't apply here, because you're able to pass both a cafile and a capath (which we couldn't, because requests won't allow it) but some are still going to matter:

  • Detecting by distribution doesn't work, because there are like a billion distributions.
  • People are going to have random files at locations that other platforms use, if you're lucky these will be symlinks, if you're unlucky they'll be entirely unrelated files or completely unmaintained copies of a ca bundle.
  • Some systems don't use the CAFile at all and they only use a CAPath, it looks like this only looks at CAFile?
  • Some systems use both CAFile and CAPath and expect you to load both of them.

Honestly, I suggest just depending on certifi.

@alex
Copy link
Member

alex commented Jun 8, 2017

My personal suggestion is that we hew as closely to what go does as possible: https://github.com/golang/go/blob/master/src/crypto/x509/root_unix.go (with the addition of trying whatever's compiled into OpenSSL first) and if that doesn't work it's ths users fault for having a weirdass computer.

@dstufft
Copy link
Member

dstufft commented Jun 8, 2017

(Notably what Go is doing does not match what we're doing here)

@reaperhulk
Copy link
Member Author

@dstufft do you remember what systems only had CAPath? And wouldn't the CAPath and CAFile requirement be an OpenSSL specific bug? We're shipping our own OpenSSL so bugs like that wouldn't affect us.

@reaperhulk
Copy link
Member Author

We don't need to care about non-linux because we won't be shipping a wheel that causes this problem on those platforms.

@reaperhulk
Copy link
Member Author

It looks like

	"/etc/ssl/certs",               // SLES10/SLES11, https://golang.org/issue/12139
	"/etc/pki/tls/certs",           // Fedora/RHEL

are potentially relevant from Go's list.

@tiran
Copy link

tiran commented Jun 8, 2017

On Fedora and RHEL, the /etc/pki/tls/certs directory does not contain any files that match the pattern for capath certs.

@squeaky-pl
Copy link

When I did a similar thing for Portable PyPy I found https://www.happyassassin.net/2015/01/12/a-note-about-ssltls-trusted-certificate-stores-and-platforms/ very helpful.

@codecov
Copy link

codecov bot commented Jun 8, 2017

Codecov Report

Merging #633 into master will increase coverage by 0.24%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #633      +/-   ##
==========================================
+ Coverage   96.78%   97.02%   +0.24%     
==========================================
  Files          18       18              
  Lines        5625     6084     +459     
  Branches      390      497     +107     
==========================================
+ Hits         5444     5903     +459     
- Misses        121      122       +1     
+ Partials       60       59       -1
Impacted Files Coverage Δ
src/OpenSSL/SSL.py 94.9% <100%> (+0.13%) ⬆️
tests/test_ssl.py 99.11% <100%> (+0.01%) ⬆️
tests/test_crypto.py 98.57% <0%> (ø) ⬆️
src/OpenSSL/crypto.py 97.34% <0%> (+0.95%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c9280c5...1959382. Read the comment docs.

@reaperhulk
Copy link
Member Author

Thanks for the info @squeaky-pl !

@reaperhulk
Copy link
Member Author

I have significantly modified the approach (and added a lot of comments to explain the logic). Coverage will be abysmal at the moment but I'd like to see if @tiran and @Lukasa are more comfortable with this.

@reaperhulk reaperhulk force-pushed the default-path-fun branch 2 times, most recently from d014394 to 72f2ab9 Compare June 9, 2017 16:51
pyca/cryptography will shortly begin shipping a wheel. Since
SSL_CTX_set_default_verify_paths uses a hardcoded path compiled into the
library, this will start failing to load the proper certificates for
users on many linux distributions. To avoid this we can use the Go
solution of iterating over a list of potential candidates and loading
it when found.
This now checks to see if env vars are set as well as seeing if the
dir exists and has valid certs in it. If either of those are true (or
the number of certs is > 0) it won't load the fallback. If it does do
the fallback it will also attempt to load certs from a dir as a final
fallback
@@ -130,6 +132,19 @@ class _buffer(object):
SSL_CB_HANDSHAKE_START = _lib.SSL_CB_HANDSHAKE_START
SSL_CB_HANDSHAKE_DONE = _lib.SSL_CB_HANDSHAKE_DONE

# Taken from https://golang.org/src/crypto/x509/root_linux.go
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we care about the BSD, Plan9 (lol), or Solaris values?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We won't ship a precompiled binary for those platforms so we shouldn't need to care.

file_env_var = _ffi.string(
_lib.X509_get_default_cert_file_env()
).decode("ascii")
if not self._verify_env_vars_set(dir_env_var, file_env_var):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please name this check_env_vars_set.

@@ -699,8 +714,98 @@ def set_default_verify_paths(self):

:return: None
"""
# This function will attempt to load certs from both a cafile and
# capath that are set at compile time. However, it will first check
# environment variables and, if present, load those paths instead
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment feels like it's in the wrong place. It checks the default paths before manualyl doing the env vars.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SSL_CTX_set_default_verify_paths itself checks env vars. It's part of OpenSSL itself. The logic here should look like this:

  1. Attempt to load via SSL_CTX_set_default_verify_paths. This will use the env vars specified by OpenSSL preferentially, but if they're not set (as they are not in most cases) then it will load the CA file and CA dir that were set at compile time.
  2. If env vars are set we do not attempt to load any fallbacks, but if they are not we go to the fallback path.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, ok, I see the ambiguity, "This function" refers to SSL_CTX_set_default_verify_paths but I thought it meant "the function we are in". Can you change the comment?

# objects are present. However, the cert directory (capath) is
# lazily loaded and num will always be zero so we need to check if
# the dir exists and has valid file names in it to cover that case.
num = self._check_num_store_objects()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if you load your own roots and then call set_default? What should happen?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prior to this patch it would load the system trust store properly even if you already had certs added. With this patch it would fail to do so. That is not what we want. Ideas? We could potentially track calls to load_verify_locations and run the fallback if it hasn't been called previously, but that is pretty ugly.

What if, instead of checking the number of certs in the store, we checked to see if the default cert file path and default cert dir path don't exist. If both are not present then we'd use fallbacks.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Count the number of certs at the start, and instead of checking if the number after calling SSL_CTX_set... is zero, check if it's the same as at the start. (And add a test for it!)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That also works, although I'm wondering if it'd make more sense to just say if the file is there then we shouldn't need fallbacks.

Another idea would be that we could compile our OpenSSL such that the default dir and file is very unique. Like /pyca/cryptography/openssl/cert.pem. Then our entire fallback check could just be "is the default dir/file this value", because that would be a guard value to tell us we're a manylinux1 build.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you're saying dump most of this logic and instead check "is this is a pyca/cryptography OpenSSL, then try these known distro paths"?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I didn't think of that until just now so I'm not sure I've thought through the implications, but it seems like rather than trying to detect if a system has successfully loaded roots maybe we're better off only modifying our behavior when we know we're running under a manylinux1 cryptography build.

:return: bool
"""
return (
os.environ.get(file_env_var, None) is not None or
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can drop the None arg to get

return any(
[re.match(b'^[0-9a-f]{8}\.[0-9]', x) is not None for x in l]
)
except OSError:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only the os.listdir should be wrapped in the except, and only ENOENT should be caught, anyhting else can be reraised. Maybe EPERM as well, not sure.

try:
dir_var = "CUSTOM_DIR_VAR"
file_var = "CUSTOM_FILE_VAR"
os.environ[dir_var] = "value"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use monkeypatch for fucking with teh env.

@alex
Copy link
Member

alex commented Jun 21, 2017 via email

@alex
Copy link
Member

alex commented Jun 21, 2017 via email

"/etc/ssl/certs", # SLES10/SLES11
]

_CRYPTOGRAPHY_MANYLINUX1_CA_DIR = "/pyca/cryptography/openssl/certs"
Copy link
Member Author

@reaperhulk reaperhulk Jun 21, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These values will need to be set by using --openssldir=/pyca/cryptography/openssl --prefix=/pyca/cryptography/openssl as args to the configure script in pyca/infra#98

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the point of these values just to act as a sigil to detect if we're inside a manylinux1 wheel or not? Do we expect people to ever put anything inside of these directories?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the answer is it's a sigil, we don't expect people to put stuff there, but as an intermediate hack it's maybe handy that someone could put something there in a pinch.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we expect people to maybe put something there in a pinch, maybe it should conform to a more standard location like /opt/pyca/cryptograpy/openssl/ or something? Not a big deal either way though, but it feels weird to have a top level /pyca/ path.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll go ahead and make the change. Hopefully users will never need to put something in that path, but I guess it's not crazy to want the path to be sane so that it could be done.

alex
alex previously approved these changes Jun 21, 2017
Copy link
Member

@alex alex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. Would also like a review from @dstufft or someone else, and I think we should hold off on merging until we finalize the manylinux1 side of this.

@reaperhulk
Copy link
Member Author

reaperhulk commented Jun 29, 2017

pyca/infra#98 has landed and pyca/cryptography#3736 is now in review. This should no longer be blocked.

Copy link
Member

@alex alex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me, but I'd like additional reviews from other folks!

@alex alex added this to the 17.0.0 milestone Jun 29, 2017
@reaperhulk reaperhulk modified the milestones: 17.0.0, 17.1.0 Jun 29, 2017
@alex alex merged commit 55fb341 into pyca:master Jun 29, 2017
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 17, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Development

Successfully merging this pull request may close these issues.

Call SSL_CTX_load_verify_locations by default when initializing an SSL_CTX
6 participants