SELinux as a resource in Kubernetes
by Juan Antonio Osorio Robles
It’s nothing new that you can (and have been able to, for a while now) configure the SELinux labels for your containers in Kubernetes. And while that has been really good, I haven’t seen it in use very often.
On the other hand, what I have seen in use is the usage of privileged containers… All over the place.
While the promise of containers was to get some isolation, and in some cases even extra security for your workloads, that doesn’t seem to trickle down to certain types of components. For instance, anything that has to do with infrastructure, e.g. log-forwarding components such as fluentd, tends to run as a privileged container, just so the pod can bind-mount the /var/log directory and read the logs from it.
So, basically the pod would have permission to do anything with whatever is bind-mounted into it… Which is not great for security!
One reason people keep using privileged pods, as opposed to creating SELinux policies for their workloads, is that it’s easier to do so, and SELinux has always been viewed as hard. Let alone writing your own policies!
I liked the udica project (which generates SELinux policies for containers) so much that I even tried to integrate it better with Kubernetes. A solution which I described in yet another blog post.
As mentioned in that blog post, for the sake of trying to automate and make it easier for folks to start generating their policies, I started experimenting with what I called the selinux-policy-helper-operator. This operator had the small task of detecting if a pod was scheduled with a certain annotation, detecting the node where it’s scheduled, and generating an appropriate policy for it. That policy would then be pushed as a ConfigMap for the deployer to check out and apply.
This was a nice Proof-Of-Concept, and it was really fun to code! However, this didn’t fully solve all the issues.
Generating a policy is one step of the whole process. And while it’s very useful and addresses a big gap in getting people to use SELinux to better contain their workloads, there are other gaps that still need to be filled.
How do I install my module?
OK! So Udica helped us out to generate a policy for our containers… How do we install it in the cluster?
My initial approach was to simply spawn a pod on a specific node that would run semodule -i, which installs the module. Since the module was generated in a ConfigMap, it was fairly simple to just mount it on the pod, and it would get installed.
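As a rough sketch, such an installer pod could look like the following (all names and the image are hypothetical; note that semodule operates on the host’s policy store, so the host’s SELinux directories have to be bind-mounted in):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: policy-installer              # hypothetical name
spec:
  nodeName: worker-0                  # the node we want the policy on
  restartPolicy: Never
  containers:
  - name: installer
    image: example.com/selinux-tools:latest  # hypothetical image shipping semodule
    command: ["semodule", "-i", "/tmp/policy/readvarlog.cil"]
    securityContext:
      privileged: true                # the very thing we'd like to avoid...
    volumeMounts:
    - name: policy
      mountPath: /tmp/policy
    - name: selinux-etc
      mountPath: /etc/selinux
    - name: selinux-var
      mountPath: /var/lib/selinux
  volumes:
  - name: policy
    configMap:
      name: readvarlog-policy         # the generated policy module
  - name: selinux-etc
    hostPath:
      path: /etc/selinux
  - name: selinux-var
    hostPath:
      path: /var/lib/selinux
```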
This was easy enough… But it’s tedious and not a great developer experience. It also has several other shortcomings.
What modules have I installed?
While normally you would keep track of some set of modules via packages in your distribution (e.g. there is an RPM for the OpenStack SELinux modules), with this approach you can easily lose track of the modules you installed… And there is nothing accounting for or keeping track of them.
Of course, you can still run semodule -l on each node in your cluster to list the installed modules, and subsequently run semodule -r to uninstall a module that’s no longer needed. But in a world where we want to automate everything, and where we want to see the cluster as one big entity as opposed to working at the host level, this approach is also not ideal. It also doesn’t tie in very well with the Kubernetes way of doing things.
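On each node, that bookkeeping would look something like this (run as root on the host; the readvarlog module name is just an example):

```shell
# List the modules currently installed on this node
semodule -l

# Remove a module that's no longer needed
semodule -r readvarlog
```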
Who can use the module?
Another big issue is security…
Imagine that you create an SELinux module, named readvarlog, that allows reading and writing under /var/log. You happily install your module in the cluster, you deploy your workload, and done! You’re ready to go and happy! Right…?
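For illustration, taking the readvarlog policy into use could look something like this (a sketch; the .process suffix follows udica’s naming convention, and the image and names are hypothetical):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: log-reader
  namespace: logging
spec:
  containers:
  - name: reader
    image: example.com/log-reader:latest
    securityContext:
      seLinuxOptions:
        type: readvarlog.process   # the type from our module
    volumeMounts:
    - name: varlog
      mountPath: /var/log
  volumes:
  - name: varlog
    hostPath:
      path: /var/log
```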
Well! Unfortunately there is no guarantee that the SELinux context will only be used by that one workload, or that one namespace. Anybody can take it into use! Which is not exactly what you want… So, you could just use it in another namespace like this:
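For example (names hypothetical again), nothing stops a pod in a completely different namespace from requesting the very same type:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: sneaky-pod
  namespace: some-other-namespace
spec:
  containers:
  - name: sneaky
    image: example.com/not-a-log-reader:latest
    securityContext:
      seLinuxOptions:
        type: readvarlog.process   # same type, entirely different workload
    volumeMounts:
    - name: varlog
      mountPath: /var/log
  volumes:
  - name: varlog
    hostPath:
      path: /var/log
```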
So now you have a pod that has permission to tamper with the logs.
Wouldn’t it be nice to be able to limit who’s able to use a specific module? Or at least limit the usage of the module to a specific namespace?
So, with the purpose of trying to solve all the aforementioned problems, I started coding what I called the selinux-operator.
The goal was to try to integrate SELinux as much as possible into Kubernetes. And what better way to do that than making the policies objects tracked directly by Kubernetes!
SELinux modules can be represented by the SelinuxPolicy Custom Resource, which would look as follows:
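Here is a sketch of such a CR; the apiVersion and field names are my best recollection of the operator’s CRD, so take them as illustrative, and the policy body is CIL as generated by udica:

```yaml
apiVersion: selinux.openshift.io/v1alpha1   # as defined by the operator's CRD
kind: SelinuxPolicy
metadata:
  name: readvarlog
  namespace: logging
spec:
  apply: true     # policies are only installed when this is set
  policy: |
    (blockinherit container)
    (allow process var_log_t (dir (open read getattr lock search ioctl)))
    (allow process var_log_t (file (getattr read write append ioctl lock open)))
```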
When running the operator, if it detects that a SelinuxPolicy CR is created, it’ll spawn a pod per node in the cluster and install the policy on them. The pods will stay running (though inactive), and will remove the policy from the nodes if the CR is deleted.
Note that only policies with the flag apply set to true will be installed. This ensures that folks have a chance to audit and review their policies before deploying them. By default (without the apply flag) policies are not installed.
The CR is also namespaced, so the policy will only be visible and usable by folks operating in the same namespace where the policy exists, giving a little more limitation and isolation.
You would be able to see what modules are available in your namespace simply by doing:
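Assuming the CRD’s plural name is selinuxpolicies, that would be:

```shell
kubectl get selinuxpolicies -n <your-namespace>
```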
Given that the policies are namespaced, to avoid name collisions, I did enforce the namespace to be embedded in the resulting policy name. So, to take the above policy into use in your workload, you would need to call it as follows: the format is <name of the selinux policy object>_<namespace>.process.
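So, if the readvarlog policy lives in the logging namespace (hypothetical names), the workload would reference it like so:

```yaml
spec:
  containers:
  - name: reader
    image: example.com/log-reader:latest
    securityContext:
      seLinuxOptions:
        type: readvarlog_logging.process   # <policy>_<namespace>.process
```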
There is also a small validating webhook running in the operator which you can deploy, and it’ll validate that the policy you’re trying to use exists, and that it’s available and usable in the namespace the pod belongs to. So, it’ll check the name format of the policy, and that the namespace of the policy itself and the namespace of the pod match.
Back to the selinux-policy-helper-operator
Having this in mind, I decided to go back to the policy helper operator and rewrite it to generate the SelinuxPolicy Custom Resource instead of a ConfigMap.
I also took the opportunity to rewrite the operator using operator-sdk (of which I’m a big fan) instead of kubebuilder, which I used originally.
And the work is there now!
The same annotations work as before, and the operator will aid you in generating policies!
More fine-grained RBAC
While we can already limit who’s allowed to do CRUD operations on the policies in the cluster, the operator and webhook still don’t have the ability to tell who’s able to use a policy. It would be great to be able to limit the usage of a policy to one specific service account, instead of the whole namespace.
Who installs the installer?
While these operators allow you to generate your own policies and use them on your workloads… These pods are privileged containers too! I still don’t like that.
One option would be to include the policy as part of the cluster installation (thus allowing us to run the operator using that policy). But… That would allow any other pod to use that policy.
What if we could use this same concept to bootstrap the operator itself?
One idea would be to spawn a “bootstrap” privileged pod that installs the policy that the operator needs, and subsequently the operator would spawn pods using that specific policy.
Make it easily available!
Right now you have to download the repos and manually deploy the manifests for the operators. The idea would be to include the operators in the marketplace in order to make them widely available for folks! Of course if there’s enough interest :)
tags: openshift - kubernetes - udica - selinux