Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

try loading trusted certs from a list of fallbacks #633

Merged
merged 17 commits into from
Jun 29, 2017
Merged
105 changes: 105 additions & 0 deletions src/OpenSSL/SSL.py
Original file line number Diff line number Diff line change
@@ -1,3 +1,5 @@
import os
import re
import socket
from sys import platform
from functools import wraps, partial
Expand Down Expand Up @@ -130,6 +132,19 @@ class _buffer(object):
SSL_CB_HANDSHAKE_START = _lib.SSL_CB_HANDSHAKE_START
SSL_CB_HANDSHAKE_DONE = _lib.SSL_CB_HANDSHAKE_DONE

# Taken from https://golang.org/src/crypto/x509/root_linux.go
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we care about the BSD, Plan9 (lol), or Solaris values?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We won't ship a precompiled binary for those platforms so we shouldn't need to care.

_CERTIFICATE_FILE_LOCATIONS = [
"/etc/ssl/certs/ca-certificates.crt", # Debian/Ubuntu/Gentoo etc.
"/etc/pki/tls/certs/ca-bundle.crt", # Fedora/RHEL 6
"/etc/ssl/ca-bundle.pem", # OpenSUSE
"/etc/pki/tls/cacert.pem", # OpenELEC
"/etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem", # CentOS/RHEL 7
]

_CERTIFICATE_PATH_LOCATIONS = [
"/etc/ssl/certs", # SLES10/SLES11
]


class Error(Exception):
"""
Expand Down Expand Up @@ -699,8 +714,98 @@ def set_default_verify_paths(self):

:return: None
"""
# This function will attempt to load certs from both a cafile and
# capath that are set at compile time. However, it will first check
# environment variables and, if present, load those paths instead
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment feels like it's in the wrong place. It checks the default paths before manualyl doing the env vars.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SSL_CTX_set_default_verify_paths itself checks env vars. It's part of OpenSSL itself. The logic here should look like this:

  1. Attempt to load via SSL_CTX_set_default_verify_paths. This will use the env vars specified by OpenSSL preferentially, but if they're not set (as they are not in most cases) then it will load the CA file and CA dir that were set at compile time.
  2. If env vars are set we do not attempt to load any fallbacks, but if they are not we go to the fallback path.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, ok, I see the ambiguity, "This function" refers to SSL_CTX_set_default_verify_paths but I thought it meant "the function we are in". Can you change the comment?

set_result = _lib.SSL_CTX_set_default_verify_paths(self._context)
_openssl_assert(set_result == 1)
# After attempting to set default_verify_paths we need to know whether
# to go down the fallback path.
# First we'll check to see if any env vars have been set. If so,
# we won't try to do anything else because the user has set the path
# themselves.
dir_env_var = _ffi.string(
_lib.X509_get_default_cert_dir_env()
).decode("ascii")
file_env_var = _ffi.string(
_lib.X509_get_default_cert_file_env()
).decode("ascii")
if not self._verify_env_vars_set(dir_env_var, file_env_var):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please name this check_env_vars_set.

# If no env vars are set next we want to see if any certs were
# loaded. For a cafile this is simple and we can just ask how many
# objects are present. However, the cert directory (capath) is
# lazily loaded and num will always be zero so we need to check if
# the dir exists and has valid file names in it to cover that case.
num = self._check_num_store_objects()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if you load your own roots and then call set_default? What should happen?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prior to this patch it would load the system trust store properly even if you already had certs added. With this patch it would fail to do so. That is not what we want. Ideas? We could potentially track calls to load_verify_locations and run the fallback if it hasn't been called previously, but that is pretty ugly.

What if, instead of checking the number of certs in the store, we checked to see if the default cert file path and default cert dir path don't exist. If both are not present then we'd use fallbacks.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Count the number of certs at the start, and instead of checking if the number after calling SSL_CTX_set... is zero, check if it's the same as at the start. (And add a test for it!)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That also works, although I'm wondering if it'd make more sense to just say if the file is there then we shouldn't need fallbacks.

Another idea would be that we could compile our OpenSSL such that the default dir and file is very unique. Like /pyca/cryptography/openssl/cert.pem. Then our entire fallback check could just be "is the default dir/file this value", because that would be a guard value to tell us we're a manylinux1 build.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you're saying dump most of this logic and instead check "is this is a pyca/cryptography OpenSSL, then try these known distro paths"?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I didn't think of that until just now so I'm not sure I've thought through the implications, but it seems like rather than trying to detect if a system has successfully loaded roots maybe we're better off only modifying our behavior when we know we're running under a manylinux1 cryptography build.

default_dir = _ffi.string(_lib.X509_get_default_cert_dir())
if num == 0 and not self._default_dir_exists(default_dir):
# No certs and no default dir, let's load our fallbacks
self._fallback_default_verify_paths(
_CERTIFICATE_FILE_LOCATIONS,
_CERTIFICATE_PATH_LOCATIONS
)

def _verify_env_vars_set(self, dir_env_var, file_env_var):
"""
Check to see if the default cert dir/file environment vars are present.

:return: bool
"""
return (
os.environ.get(file_env_var, None) is not None or
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can drop the None arg to get

os.environ.get(dir_env_var, None) is not None
)

def _default_dir_exists(self, default_dir):
"""
Check to see if the default cert dir exists and has filenames in a
valid form.

:return: bool
"""
try:
l = os.listdir(default_dir)
# the dir exists, but we need to know if there are any valid
# certs in there. OpenSSL requires a hashed naming scheme of the
# form: [0-9a-f]{8}\.[0-9]
# Arguably this is overkill and we could just assume that if the
# dir exists it's fine.
return any(
[re.match(b'^[0-9a-f]{8}\.[0-9]', x) is not None for x in l]
)
except OSError:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only the os.listdir should be wrapped in the except, and only ENOENT should be caught, anyhting else can be reraised. Maybe EPERM as well, not sure.

return False

def _check_num_store_objects(self):
"""
Checks how many objects are present in the Context's X509Store.

:return: int
"""
store = self.get_cert_store()
sk_obj = _lib.X509_STORE_get0_objects(store._store)
_openssl_assert(sk_obj != _ffi.NULL)
return _lib.sk_X509_OBJECT_num(sk_obj)

def _fallback_default_verify_paths(self, file_path, dir_path):
"""
Default verify paths are based on the compiled version of OpenSSL.
However, when pyca/cryptography is compiled as a manylinux1 wheel
that compiled location can potentially be wrong. So, like Go, we
will try a predefined set of paths and attempt to load roots
from there.

:return: None
"""
for cafile in file_path:
if os.path.isfile(cafile):
self.load_verify_locations(cafile)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the file exists but is invalid this will currently raise an exception. Do we want to catch this and silently swallow it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Welcome to the catch-22 of this approach.

If you throw an exception then users on, say, Gentoo, are going to be real mad that they can no longer use the system trust store just because they put some random data at "/etc/pki/tls/cacert.pem", a path that has no special meaning on their system. If you don't throw an exception then you're silently swallowing potentially important errors. Maybe log?

Fun follow-on: should you load all certs at all these places? Does this open the user up to risks about having unexpected or malicious root certs elsewhere in the system suddenly be trusted by Python but nothing else? Doesn't this lead to PyOpenSSL behaving fundamentally differently to their native applications? Does this fundamentally mean that we should all abandon PyOpenSSL and just use PEP 543 instead? ARE WE NOT ALL DOOMED TO DIE ALONE?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like the approach, too. I could be persuaded if you detect the distribution first (e.g. by parsing /etc/os-release and using ID and ID_LIKE, https://www.freedesktop.org/software/systemd/man/os-release.html) and only handle distribution specific paths for cafile and capath.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Lukasa these are all system directories where I wouldn't expect users to be dropping random files very often. And, if they do, the same problem applies to Go, which honestly seems like a reasonable defense.

Maybe raising a UserWarning if a file is found but is invalid? Or we could continue the loop and look for something else.

@tiran It's impractical to enumerate every possible distribution. What makes you think Go's approach doesn't work? It's not like Go binaries are unusual things and every TLS client connection made by one is doing trust roots in this fashion.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Go does more work to find the actual location. It takes the env vars into account and stops looking for certs when it finds a directory that contains at least on cert. https://github.com/golang/go/blob/master/src/crypto/x509/root_unix.go#L63

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe related, I do something similar to in Portable PyPy which ships OpenSSL as well: https://github.com/squeaky-pl/portable-pypy/blob/master/ssl3.py.patch

30.000 downloads after this change and nobody complained, which of course doesn't mean it's a right thing to do.

break

for capath in dir_path:
if os.path.isdir(capath):
self.load_verify_locations(None, capath)
break

def use_certificate_chain_file(self, certfile):
"""
Expand Down
92 changes: 91 additions & 1 deletion tests/test_ssl.py
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,7 @@
"""

import datetime
import os
import sys
import uuid

Expand Down Expand Up @@ -57,7 +58,7 @@
Context, ContextType, Session, Connection, ConnectionType, SSLeay_version)
from OpenSSL.SSL import _make_requires

from OpenSSL._util import lib as _lib
from OpenSSL._util import ffi as _ffi, lib as _lib

from OpenSSL.SSL import (
OP_NO_QUERY_MTU, OP_COOKIE_EXCHANGE, OP_NO_TICKET, OP_NO_COMPRESSION,
Expand Down Expand Up @@ -1108,6 +1109,95 @@ def test_load_verify_locations_wrong_args(self):
with pytest.raises(TypeError):
context.load_verify_locations(object(), object())

@pytest.mark.skipif(
platform != "linux",
reason="Loading fallback paths is a linux-specific behavior to "
"accommodate pyca/cryptography manylinux1 wheels"
)
def test_fallback_default_verify_paths(self, monkeypatch):
"""
Test that we load certificates successfully on linux from the fallback
path.
"""
context = Context(TLSv1_METHOD)
monkeypatch.setattr(
_lib, "SSL_CTX_set_default_verify_paths", lambda x: 1
)
monkeypatch.setattr(context, "_default_dir_exists", lambda x: False)
context.set_default_verify_paths()
num = context._check_num_store_objects()
assert num != 0

def test_verify_env_vars(self):
"""
Test that we return True/False appropriately if the env vars are set.
"""
context = Context(TLSv1_METHOD)
original = dict(os.environ)
try:
dir_var = "CUSTOM_DIR_VAR"
file_var = "CUSTOM_FILE_VAR"
os.environ[dir_var] = "value"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use monkeypatch for fucking with teh env.

assert context._verify_env_vars_set(dir_var, file_var) is True
os.environ[file_var] = "value"
assert context._verify_env_vars_set(dir_var, file_var) is True
del os.environ[dir_var]
del os.environ[file_var]
assert context._verify_env_vars_set(dir_var, file_var) is False
finally:
os.environ.clear()
os.environ.update(original)

def test_verify_no_fallback_if_env_vars_set(self, monkeypatch):
"""
Test that we don't use the fallback path if env vars are set.
"""
context = Context(TLSv1_METHOD)
monkeypatch.setattr(
_lib, "SSL_CTX_set_default_verify_paths", lambda x: 1
)
original = dict(os.environ)
dir_env_var = _ffi.string(
_lib.X509_get_default_cert_dir_env()
).decode("ascii")
file_env_var = _ffi.string(
_lib.X509_get_default_cert_file_env()
).decode("ascii")
try:
os.environ[file_env_var] = "nonsense-value"
os.environ[dir_env_var] = "nonsense-value"
context.set_default_verify_paths()
num = context._check_num_store_objects()
assert num == 0
finally:
os.environ.clear()
os.environ.update(original)

def _fallback_path_is_not_file_or_dir(self, monkeypatch):
"""
Test that when passed empty arrays or paths that do not exist no
errors are raised.
"""
context = Context(TLSv1_METHOD)
context._fallback_default_verify_paths([], [])
context._fallback_default_verify_paths(
["/not/a/file"], ["/not/a/dir"]
)

def test_default_dir_doesnt_exist(self):
"""
Test that we raise catch OSError for both non-existent dirs and paths
that are not directories.
"""
context = Context(TLSv1_METHOD)
assert context._default_dir_exists(
b"/nonexistent/path/probably"
) is False

assert context._default_dir_exists(
os.path.dirname(os.path.abspath(__file__)).encode("ascii")
) is False

@pytest.mark.skipif(
platform == "win32",
reason="set_default_verify_paths appears not to work on Windows. "
Expand Down