
3 October 2016

How is TLS powered by certmonger being done

by Juan Antonio Osorio Robles

I’ve been working on getting TLS everywhere for TripleO. And, while not everything has merged into the project yet, this is an overview of how things are being done, which I hope will help reviewers check it out more easily, and help me sanity-check the approach.

So what’s this about?

The point is to get all the endpoints listening for TLS connections everywhere it’s possible to do so. This will include the OpenStack services (which means their HAProxy endpoints and the actual services on the internal network), RabbitMQ, MySQL (database and replication traffic) and MongoDB, to begin with.

Now, as we know, to set up TLS for a service, the service needs a TLS certificate and key pair. This certificate needs to be issued by a certificate authority (CA) that the entities communicating with that service trust. Also, the installation (provisioning of the certificate and keys) needs to happen in a secure manner. Furthermore, the certificate will expire, so it needs to be renewed when that happens, and the renewal also needs to happen in a secure manner.

While we could just attempt to inject the certificates/keys into the overcloud nodes (as we are able to do for the public-facing certificate of HAProxy), we have to consider that for TLS-everywhere we also want to secure the services on the internal network, which listen on interfaces dedicated to each specific node. So we need certificates both for the virtual interfaces we listen on and for the actual interfaces of the services, which requires a certificate/key per node.

We also have to consider that our deployments can have several networks (with most services listening on the internal-api network, but some services actually listening on the control-plane network, and others on the storage network), and there are even plans to make these networks more flexible. Given this, we need a certificate per network on each of the nodes. So, let’s say we have 3 controllers, and services listening on three networks (internal-api, ctlplane and storage); this means we’ll need 9 certificates that are node-specific, and given that we have HAProxy listening on the VIPs, we need 3 more for HAProxy (plus the public certificate too).

Now, we could use a wildcard certificate to address this, which would make deployment much easier. However, this comes with several concerns. To begin with, on the security side, the compromise of just one node or sub-domain would end up compromising all nodes. Additionally, when revoking a wildcard certificate, all nodes will need a new certificate, and given the number of nodes and endpoints we have, this can mean a lot of work.

So, instead, we will actually go for single-domain certificates. However, we’ll rely on certmonger to handle them automatically for us, since it will do the certificate requests, handle renewals, and even run the pre-save and post-save commands we will need (such as getting the HAProxy certificates into an appropriate format or changing file ownerships). It needs, however, to communicate with a real CA. So, to get a real CA for certmonger to communicate with, we can go with FreeIPA, which offers a lot more functionality that we can use in TripleO besides just certificate issuance. However, certmonger can work with other CAs, so we’re not limited on that side.

Note on FreeIPA

For FreeIPA to be able to issue certificates for a node or service, both the node and the service need to have a principal in FreeIPA, and to get this, we need to enroll the node. This enrollment lets us trust FreeIPA as a CA (all registered nodes would trust it, which is what we want anyway) and provides the node with a kerberos keytab, which it can use to authenticate and subsequently make requests to FreeIPA.

Enrollment can be a big problem by itself, because we want to do it in a secure manner too. For this, my colleagues are working on an OpenStack-friendly approach which involves adding hooks to nova. However, since that approach is not available yet, I’m using ExtraConfigPre hooks to pass the necessary data for nodes to do the enrollment. This data is pretty much just the FreeIPA DNS name, an OTP which the node uses to authenticate during the enrollment phase, and the kerberos realm.
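
For illustration, here’s a minimal sketch of what such an environment file could look like. The hook template path and the parameter names (IpaServer, IpaOtp, IpaRealm) are hypothetical placeholders, not the final interface:

resource_registry:
  # Illustrative pre-deployment hook; the actual template is not shown here
  OS::TripleO::NodeExtraConfig: /home/stack/templates/ipa-enrollment.yaml

parameter_defaults:
  IpaServer: ipa.example.com  # FreeIPA DNS name (hypothetical value)
  IpaOtp: MySecretOTP         # OTP the node uses to authenticate during enrollment
  IpaRealm: EXAMPLE.COM       # kerberos realm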

FreeIPA does not issue certificates for IP addresses, so for the overcloud we will need to use FQDNs for each of the endpoints, and these will be used in both the CNs and the SubjectAltNames of the certificates.

Approach

HAProxy

The undercloud’s HAProxy can already use certmonger for getting the public certificate. Setting this up in practice is a matter of setting the generate_service_certificates flag in hiera to true, specifying the specs for the certificate to generate using the certificates_specs parameter of the haproxy profile (which one can set via hiera with the tripleo::profile::base::haproxy::certificates_specs key), telling certmonger which CA to talk to with the certmonger_ca hiera key, and finally, telling HAProxy where the relevant PEM file is located via the service_certificate parameter of the haproxy manifest (which is manifests/haproxy.pp and not the service profile).
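
As a rough sketch (the hostname, paths and CA value here are illustrative), the undercloud hieradata for this could look like the following:

generate_service_certificates: true
certmonger_ca: IPA  # which CA certmonger should talk to
tripleo::haproxy::service_certificate: '/etc/pki/tls/certs/undercloud-haproxy.pem'
tripleo::profile::base::haproxy::certificates_specs:
  undercloud-haproxy-public:
    service_pem: '/etc/pki/tls/certs/undercloud-haproxy.pem'
    service_certificate: '/etc/pki/tls/certs/undercloud-haproxy.crt'
    service_key: '/etc/pki/tls/private/undercloud-haproxy.key'
    hostname: 'undercloud.example.com'
    principal: 'haproxy/undercloud.example.com'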

Since the profile already takes a hash for auto-generating the certificate, we can use this to generate the other certificates we need for the internal/admin endpoints (it’s more than one extra certificate, since not all services listen on the internal-api network). The format of the certificates_specs hash for HAProxy is as follows:

haproxy-<NETWORK name>:
  service_pem: '/etc/pki/tls/certs/overcloud-haproxy-<NETWORK name>.pem'
  service_certificate: '/etc/pki/tls/certs/overcloud-haproxy-<NETWORK name>.crt'
  service_key: '/etc/pki/tls/private/overcloud-haproxy-<NETWORK name>.key'
  hostname: "%{hiera('cloud_name_<NETWORK name>')}"
  postsave_cmd: "" # TODO
  principal: "haproxy/%{hiera('cloud_name_<NETWORK name>')}"

Where the network name is the one defined in tripleo-heat-templates. With this in mind, the spec for the internal-api network would look like this:

haproxy-internal_api:
  service_pem: '/etc/pki/tls/certs/overcloud-haproxy-internal_api.pem'
  service_certificate: '/etc/pki/tls/certs/overcloud-haproxy-internal_api.crt'
  service_key: '/etc/pki/tls/private/overcloud-haproxy-internal_api.key'
  hostname: "%{hiera('cloud_name_internal_api')}"
  postsave_cmd: "" # TODO
  principal: "haproxy/%{hiera('cloud_name_internal_api')}"

We can make this more automatic by iterating the networks that the services listen on in tripleo-heat-templates.

Now, HAProxy takes the certificate and key in PEM format, but requires them to be appended together. This is why we have a field for the certificate, the key, and the appended PEM that will actually be read by HAProxy. And fortunately, we have entries in hiera to get the FQDN that’s assigned to a VIP for each of the networks, so we’ll use those.

So, having these specs for the certificates, we’ll just let puppet execute ensure_resources using that hash, and it’ll end up calling certmonger to do the hard work for us.

For HAProxy we can currently only pass the path of the PEM file for the public endpoints, and, given that we already have the paths for the certificates in the spec, I thought it would be a good idea to pass the spec itself to the haproxy manifest and get the paths from there. This way, for each service we can choose an appropriate certificate depending on the network it’s listening on, thus reducing the complexity of this.

Finally, we have to remember to set the endpoints for the services to https, which we do with an environment that sets the values of the EndpointMap to https for all endpoints, and uses CLOUDNAME for all of them (since we need to use FQDNs and not IP addresses).
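
An abbreviated sketch of such an environment (only a few entries shown; the real EndpointMap covers every service endpoint):

parameter_defaults:
  EndpointMap:
    KeystoneInternal: {protocol: 'https', port: '5000', host: 'CLOUDNAME'}
    KeystoneAdmin: {protocol: 'https', port: '35357', host: 'CLOUDNAME'}
    NovaInternal: {protocol: 'https', port: '8774', host: 'CLOUDNAME'}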

OpenStack services

Some services are already running over Apache HTTPd, so these we can easily start running with TLS enabled. However, we don’t want to run cryptographic operations in Python, so we’ll handle the remaining services separately.

Services running over Apache HTTPd

Taking the approach we used for HAProxy as a reference, we’ll do something similar here. We’ll re-use the generate_service_certificates flag and base the certificate provisioning on hashes which we refer to as specs. However, we’ll also add a flag that tells the profiles whether or not to get the paths of the certificates and pass those to the services. We’ll call this flag enable_internal_tls and pass it via hiera.
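
In hieradata, turning this on would then be a matter of setting both flags (a sketch, using the keys introduced above):

enable_internal_tls: true           # pass certificate paths to the services
generate_service_certificates: true # have puppet call certmonger via the specs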

Now, since httpd takes separate files for the certificate and the key, the specs don’t need the service_pem key. So our spec for the certificates will look like the following:

httpd-<NETWORK name>:
  service_certificate: '/etc/pki/tls/certs/httpd-<NETWORK name>.crt'
  service_key: '/etc/pki/tls/private/httpd-<NETWORK name>.key'
  hostname: "%{::fqdn_<NETWORK name>}"
  principal: "HTTP/%{::fqdn_<NETWORK name>}"

Note that we need a certificate per network, since some OpenStack services also listen on networks other than internal-api. Thankfully, we do have facts in puppet to get the hostname of the node for each network, so we use those in the specs (we’ll change this to hiera in the near future).
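
For instance, for a service listening on the internal-api network, the spec would expand to:

httpd-internal_api:
  service_certificate: '/etc/pki/tls/certs/httpd-internal_api.crt'
  service_key: '/etc/pki/tls/private/httpd-internal_api.key'
  hostname: "%{::fqdn_internal_api}"
  principal: "HTTP/%{::fqdn_internal_api}"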

We already have mod_ssl installed on the overcloud nodes (since it’s part of the image), so enabling TLS with the paths that come from the specs is just a matter of passing those paths to the vhost resource, and puppet will do its work.

Note on the advantage of using specs

Unfortunately, not everyone will want to use certmonger for requesting and managing certificates. The “specs”-based approach that I described above has a side-effect that can address this. If a deployer REALLY doesn’t want to use certmonger, it’s possible to pass the certificates and keys manually using an ExtraConfigPre hook in TripleO that copies them to the appropriate locations (which could be implemented in a similar fashion to what we do in TripleO for the public certificate for HAProxy). Then, the specs that tell the services where to find those certificates and keys can be passed via hieradata. Finally, we need to be sure that we’re not setting the generate_service_certificates flag, since that would cause puppet to try to use those specs to call certmonger. Keep in mind that a specific format needs to be used for the specs depending on the service and the network, and that you now have to run the overcloud deploy every time you need to update the certificates, since they won’t be managed by certmonger.
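
As a sketch, the hieradata for such a manual setup might look like the following (paths illustrative). Note that generate_service_certificates is left disabled, and the certmonger-specific keys (hostname, principal, postsave_cmd) can presumably be omitted, since nothing will be requesting certificates:

generate_service_certificates: false  # do NOT let puppet call certmonger
enable_internal_tls: true
tripleo::profile::base::haproxy::certificates_specs:
  haproxy-internal_api:
    # paths where the ExtraConfigPre hook dropped the pre-provisioned files
    service_pem: '/etc/pki/tls/certs/overcloud-haproxy-internal_api.pem'
    service_certificate: '/etc/pki/tls/certs/overcloud-haproxy-internal_api.crt'
    service_key: '/etc/pki/tls/private/overcloud-haproxy-internal_api.key'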

tags: tripleo - openstack
