HTTP Routing and TLS#
Subsystem Goal#
This subsystem is responsible for the routing of all HTTP-based traffic, ensuring requests end up going to the correct application. By providing automatic certificate provisioning, we can require HTTPS on all requests.
Components in Use#
While working on this subsystem, we will introduce the following components:
- Traefik - a cloud-native proxy that can be configured to watch Ingress objects to update its routing config
- Cert Manager - a Kubernetes-based certificate manager that can leverage LetsEncrypt (or other ACME providers) or use an in-cluster CA to provide certificates
Background#
Understanding Ingress#
In Kubernetes, Ingress is the standardized method to define HTTP routing rules. These rules provide the ability to route based on hostname or path and provide TLS configuration. Below is an example:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: hello-world
spec:
  rules:
    - host: docs-getting-started.tenants.platform.it.vt.edu
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: hello-world
                port:
                  number: 3000
  tls:
    - hosts:
        - docs-getting-started.tenants.platform.it.vt.edu
      secretName: hello-world-tls-cert
This object indicates that requests sent to docs-getting-started.tenants.platform.it.vt.edu should be forwarded to a Service named hello-world on its port 3000. It also indicates that the TLS key and certificate for the same host are stored in a secret called hello-world-tls-cert (more on that in a moment).
Ingress Controllers#
It's important to note that defining an Ingress does nothing on its own, as these are simply objects defining desired state. In order to actually perform routing, you must deploy an ingress controller. These controllers watch the Ingress objects and update their routing config to match. They are typically exposed as a LoadBalancer or NodePort Service and serve as the entry point for all HTTP requests into the cluster. This is where Traefik plugs in.
Traefik is configured to watch all of the Ingress objects and update its routing rules appropriately. It is the single reverse proxy that sits at the edge of the cluster and performs routing to all of the applications.
Certificate Management#
As seen earlier, the Ingress object allows TLS configuration to be defined and simply references a Kubernetes secret that contains a private key and certificate. That secret must be of type kubernetes.io/tls and have the following structure:
apiVersion: v1
kind: Secret
metadata:
  name: secret-tls
type: kubernetes.io/tls
data:
  # the data is abbreviated in this example
  tls.crt: |
    MIIC2DCCAcCgAwIBAgIBATANBgkqh ...
  tls.key: |
    MIIEpgIBAAKCAQEA7yn3bRHQ5FHMQ ...
While tenants are certainly welcome to provide their own SSL credentials, we wanted to remove the burden of provisioning and renewing certificates. Fortunately, the ACME protocol makes it possible to automate this process (read this blog post to better understand the protocol). By plugging in an ACME client, we can automatically provision and renew certificates and store them as secrets where an ingress controller can use them.
Cert Manager provides a Kubernetes-based approach to managing certificates. It provides the ability to define a Certificate, which serves as a request for a TLS certificate. By describing the various ways certs can be issued (through ClusterIssuer objects), we can support LetsEncrypt or a variety of other issuers. Once a Certificate is defined, the user never needs to worry about expired certs, as cert-manager will automatically renew soon-to-be-expiring certificates!
Deploying it Yourself#
In this example, we're going to deploy Traefik onto our local Docker Desktop environment and deploy two simple applications.
Configuring minikube ingress#
- Enable the ingress addon.
- Add the hostname to /etc/hosts.
- You will need to be running a minikube tunnel in order to connect to things. You will need to leave it running.
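The steps above are listed without their commands. A minimal sketch, assuming a stock minikube install and assuming the two hostnames used later in this walkthrough (app1.localhost and app2.localhost) should point at 127.0.0.1. The loopback address is an assumption that fits a tunnel bound to localhost; adjust it if your tunnel reports a different address.

```shell
# Assumed values: the hostnames come from later steps in this guide, and the
# loopback address is an assumption that may differ in your environment.
HOSTS_LINE="127.0.0.1 app1.localhost app2.localhost"

# 1. Enable minikube's bundled ingress addon.
minikube addons enable ingress

# 2. Add the hostnames to /etc/hosts (skipped if an entry is already present).
grep -q "app1.localhost" /etc/hosts || echo "$HOSTS_LINE" | sudo tee -a /etc/hosts

# 3. Start the tunnel. It runs in the foreground, so start it in a separate
#    terminal and leave it running for the rest of the walkthrough.
minikube tunnel
```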
Deploying an Ingress Controller#
- Deploy Traefik by installing the Helm chart.
  helm repo add traefik https://helm.traefik.io/traefik
  helm repo update
  helm install traefik traefik/traefik \
    --namespace=platform-traefik \
    --create-namespace \
    --set ports.websecure.tls.enabled=true \
    --set 'additionalArguments[0]=--serverstransport.insecureskipverify'
  By specifying ports.websecure.tls.enabled=true, we enable TLS on all routes. Otherwise, every Ingress will need an additional annotation to enable TLS.
  The --serverstransport.insecureskipverify argument will let us use self-signed certs in communication from Traefik to the pod (which will be important in later subsystems).
- After a moment, you should see the Traefik pod running in the platform-traefik namespace.
- If you want to open the Traefik dashboard, you can use port-forwarding! And then open your browser to http://localhost:9000/dashboard/.
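The commands for checking the pod and forwarding the dashboard port are not shown above. A sketch, assuming the release name traefik and the platform-traefik namespace from the Helm install, and assuming the chart serves the dashboard on container port 9000 (suggested by the URL above; other chart versions may use a different port):

```shell
# Namespace taken from the Helm install above.
TRAEFIK_NAMESPACE="platform-traefik"

# Check that the Traefik pod has reached the Running state.
kubectl get pods --namespace "$TRAEFIK_NAMESPACE"

# Forward local port 9000 to the Traefik deployment. This runs in the
# foreground; stop it with Ctrl+C when you are done with the dashboard.
kubectl port-forward --namespace "$TRAEFIK_NAMESPACE" deployment/traefik 9000:9000
```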
Deploying a Sample Application#
- Now, let's create a namespace to run our simple apps.
- Let's deploy an app with a Service! In this case, we'll simply use nginx.
- Let's deploy another app. This time, we'll use a slightly different image: nginxdemos/hello
  cat <<EOF | kubectl apply -f -
  apiVersion: v1
  kind: Pod
  metadata:
    name: app2
    namespace: ingress-demo
    labels:
      app: nginx-hello-world
  spec:
    containers:
      - name: nginx
        image: nginxdemos/hello
  ---
  apiVersion: v1
  kind: Service
  metadata:
    name: app2
    namespace: ingress-demo
  spec:
    selector:
      app: nginx-hello-world
    ports:
      - port: 80
  EOF
- Now, let's define an Ingress object for each of the two apps. We'll put the first app at app1.localhost and the second at app2.localhost.
  cat <<EOF | kubectl apply -f -
  apiVersion: networking.k8s.io/v1
  kind: Ingress
  metadata:
    name: app1
    namespace: ingress-demo
  spec:
    rules:
      - host: app1.localhost
        http:
          paths:
            - path: /
              pathType: Prefix
              backend:
                service:
                  name: app1
                  port:
                    number: 80
  ---
  apiVersion: networking.k8s.io/v1
  kind: Ingress
  metadata:
    name: app2
    namespace: ingress-demo
  spec:
    rules:
      - host: app2.localhost
        http:
          paths:
            - path: /
              pathType: Prefix
              backend:
                service:
                  name: app2
                  port:
                    number: 80
  EOF
- Now, open your browser to http://app1.localhost. You should see the default nginx landing page. But if you open http://app2.localhost, you should see a different app! That's because Traefik observed the routing rules and sent the traffic to the correct pod based on the Host header.
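The commands for the first two steps (creating the namespace and deploying the first app) are not shown above. A sketch that mirrors the app2 manifest: the Pod and Service names, namespace, and port come from the Ingress for app1, while the label value nginx and the file name app1.yaml are assumptions, and the stock nginx image is inferred from the default landing page mentioned above. Run it before applying the app2 and Ingress manifests.

```shell
# Create the namespace used by both sample apps.
kubectl create namespace ingress-demo

# Write the app1 Pod and Service to a file (file name is an assumption).
cat > app1.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: app1
  namespace: ingress-demo
  labels:
    app: nginx   # assumed label; it only needs to match the Service selector
spec:
  containers:
    - name: nginx
      image: nginx
---
apiVersion: v1
kind: Service
metadata:
  name: app1
  namespace: ingress-demo
spec:
  selector:
    app: nginx
  ports:
    - port: 80
EOF

# Apply both objects to the cluster.
kubectl apply -f app1.yaml
```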
Deploying a Self-Signed In-Cluster CA#
Now that we have two applications up and running, let's deploy Cert Manager and configure it to use an in-cluster CA (since we can't easily satisfy LetsEncrypt challenges locally).
- Install Cert Manager by installing its Helm chart:
  helm repo add jetstack https://charts.jetstack.io
  helm repo update
  helm install cert-manager jetstack/cert-manager \
    --namespace platform-cert-manager \
    --create-namespace \
    --set installCRDs=true
  After a moment, you should see the cert-manager pods running in the platform-cert-manager namespace.
- To issue certificates, we need to define a ClusterIssuer. The issuer provides the configuration needed to satisfy a certificate request. For example, we can have one issuer that uses an in-cluster CA while another uses LetsEncrypt.
  Run the following to create a self-signed root issuer, a CA certificate signed by it, and a ClusterIssuer that issues certificates from that CA:
  cat <<EOF | kubectl apply -f -
  apiVersion: cert-manager.io/v1
  kind: ClusterIssuer
  metadata:
    name: platform-internal-root
  spec:
    selfSigned: {}
  ---
  apiVersion: cert-manager.io/v1
  kind: Certificate
  metadata:
    name: platform-internal-ca
    namespace: platform-cert-manager
  spec:
    secretName: platform-internal-ca
    duration: 43800h # 5y
    issuerRef:
      kind: ClusterIssuer
      name: platform-internal-root
    commonName: "ca.platform.cert-manager"
    isCA: true
  ---
  apiVersion: cert-manager.io/v1
  kind: ClusterIssuer
  metadata:
    name: platform-internal-ca
  spec:
    ca:
      secretName: platform-internal-ca
  EOF
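Before requesting certificates, it can help to confirm that the pieces above came up. A small sketch using standard kubectl queries; the resource names and namespace are the ones defined in the manifests above, and the exact column layout of the output depends on your cert-manager version.

```shell
# Namespace taken from the Helm install above.
CERT_MANAGER_NAMESPACE="platform-cert-manager"

# The cert-manager pods should all be Running.
kubectl get pods --namespace "$CERT_MANAGER_NAMESPACE"

# Both ClusterIssuers (platform-internal-root and platform-internal-ca)
# should be listed and report that they are ready.
kubectl get clusterissuers

# The CA certificate itself should also report that it is ready.
kubectl get certificate platform-internal-ca --namespace "$CERT_MANAGER_NAMESPACE"
```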
Creating Certificates for our Apps#
Now that we have a ClusterIssuer defined, let's request some Certificates and use them for our two sample apps!
- Define a Certificate for app1. Note that we indicate the TLS details should be stored in a secret named app1-tls.
- After a few seconds, the Certificate should report that it is ready for use. And if we look at the app1-tls secret, we'll see it both exists and is populated, with output like this:
  apiVersion: v1
  data:
    ca.crt: LS0tLS...
    tls.crt: LS0tLS...
    tls.key: LS0tLS...
  kind: Secret
  metadata:
    annotations:
      cert-manager.io/alt-names: app1.localhost
      cert-manager.io/certificate-name: app1
      cert-manager.io/common-name: app1.localhost
      cert-manager.io/ip-sans: ""
      cert-manager.io/issuer-group: ""
      cert-manager.io/issuer-kind: ClusterIssuer
      cert-manager.io/issuer-name: platform-internal-ca
      cert-manager.io/uri-sans: ""
    creationTimestamp: "2022-02-11T14:59:41Z"
    name: app1-tls
    namespace: ingress-demo
    resourceVersion: "1031225"
    uid: 9069ad14-234d-4ca0-99de-8bf59bd16ca9
  type: kubernetes.io/tls
- Now, we're going to update our Ingress object for app1 to include the TLS configuration. We simply need to point it to the correct secret.
  cat <<EOF | kubectl apply -f -
  apiVersion: networking.k8s.io/v1
  kind: Ingress
  metadata:
    name: app1
    namespace: ingress-demo
  spec:
    rules:
      - host: app1.localhost
        http:
          paths:
            - path: /
              pathType: Prefix
              backend:
                service:
                  name: app1
                  port:
                    number: 80
    tls:
      - hosts: ["app1.localhost"]
        secretName: app1-tls
  EOF
- You can try to open your browser to https://app1.localhost, but it might fail because Traefik uses HSTS, which prevents the browser from letting you click through the warning for an untrusted certificate.
  So, we can just use cURL! If you fetch the site, you should see that the certificate was issued by our internal CA.
- If you want, you can do the same thing for app2!
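The commands for requesting and checking the certificate are not shown above. A sketch reconstructed from the annotations on the app1-tls secret (certificate name, common name, DNS name, issuer kind, and issuer name); the file name app1-certificate.yaml is an assumption, and this may not match the original manifest field for field.

```shell
# Certificate request for app1. Values mirror the annotations on the
# app1-tls secret; the file name is an assumption.
cat > app1-certificate.yaml <<'EOF'
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: app1
  namespace: ingress-demo
spec:
  secretName: app1-tls
  commonName: app1.localhost
  dnsNames:
    - app1.localhost
  issuerRef:
    kind: ClusterIssuer
    name: platform-internal-ca
EOF
kubectl apply -f app1-certificate.yaml

# Check the Certificate; its READY column should become True.
kubectl get certificate app1 --namespace ingress-demo

# Inspect the secret that cert-manager populated.
kubectl get secret app1-tls --namespace ingress-demo -o yaml

# Once the Ingress references the secret, fetch the site. -k skips trust
# verification (the internal CA is not in the local trust store) and -v
# prints TLS handshake details, which on most curl builds include the
# certificate's subject and issuer.
curl -kv https://app1.localhost
```

To repeat this for app2, the same manifest should work with app1 replaced by app2 in the names, hostnames, and secret name.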
What's next?#
Now that we have an ingress controller and certificate management deployed, let's make it so we can deploy an actual application using a tenant workflow!
Go to the GitOps subsystem now!
Common Troubleshooting Notes#
Why is my certificate not provisioning?
The cert-manager tool has a troubleshooting guide, which is a good starting step. There are also additional steps when troubleshooting ACME certificates.
Beyond those, we've also run into a few experiences and observations.
- Ensure DNS is resolving to the platform's name. If the name for the requested certificate doesn't resolve to the platform, the ACME challenge can't be satisfied and a certificate won't be issued.
- DNS caching. If a DNS change was recently made, there's a chance the LetsEncrypt challenge verifier still has the old record cached. To avoid this, make the DNS change before requesting the certificate. If the certificate was already requested, you'll simply need to wait for the DNS TTL to expire.
- Certificate is just stuck. On rare occasions, we've been able to validate the Order was completed, but the Certificate is simply never provisioned. It's unclear why, but in every situation, manually removing the Certificate and re-requesting it has worked.
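The checks above can be sketched with standard kubectl and DNS queries. All names here are hypothetical placeholders (the certificate name, namespace, hostname, and manifest file are assumptions for illustration); substitute the values for the certificate you are debugging.

```shell
# Hypothetical placeholders; replace with your own values.
CERT_NAME="app1"
CERT_NAMESPACE="ingress-demo"
CERT_HOST="app.example.com"
CERT_MANIFEST="certificate.yaml"

# Confirm the hostname resolves to the platform (compare against the
# platform's address).
dig +short "$CERT_HOST"

# Show the Certificate's status conditions and recent events.
kubectl describe certificate "$CERT_NAME" --namespace "$CERT_NAMESPACE"

# Inspect the resources cert-manager creates while issuing. Orders and
# Challenges only appear for ACME issuers.
kubectl get certificaterequests,orders,challenges --namespace "$CERT_NAMESPACE"

# If the Certificate is stuck, remove it and request it again by
# re-applying the manifest that defined it.
kubectl delete certificate "$CERT_NAME" --namespace "$CERT_NAMESPACE"
kubectl apply -f "$CERT_MANIFEST"
```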