Cluster Overview#

The platform team currently maintains several Kubernetes clusters to support the development and operations of the platform. This page describes those clusters and the different use cases for each.

AWS EKS Environment#

All clusters running in AWS leverage EKS to run the Kubernetes control plane. This is the default location for all of our tenant workloads.

AWS Network Diagram

Load Balancers#

The AWS clusters use an AWS load balancer provisioned for the ingress created by the Traefik Helm chart.

Networking#

The AWS clusters use Cilium, which runs on top of three VPCs, one per AZ. By default, EKS clusters run Amazon's VPC CNI; it is uninstalled and replaced with Cilium.

Storage#

EBS#

Elastic Block Store (EBS) provides the default storage class on the cluster.

EFS#

EFS is currently provisioned as part of the cluster code, so it must be added individually for each customer that requests it.

S3#

S3 Storage is not currently surfaced to customers as part of the cluster. It is used by the platform in various ways to store information to support internal infrastructure.

Nothing prohibits customers from referencing external S3 buckets, but at the moment we do not provide an object storage class.

Logging#

Splunk#

A Filebeat DaemonSet runs on all nodes, collecting logs from the system and pods and forwarding them to Splunk.

CloudWatch#

Additional logging for Amazon components is available in CloudWatch. CloudWatch logs are not currently exported to Splunk.

DVLP cluster#

PPRD cluster#

PROD cluster#

Configuring local CLI for admin access#

See Running Terraform Locally for now. We will eventually have the same SSO setup for both baseline and non-baseline clusters.

On-prem EKS-A Environment#

All clusters running on-prem leverage EKS-A to run the Kubernetes control plane. On-prem resources are limited by the amount of hardware we can dedicate to it.

On-prem Network Diagram

Load Balancers#

The EKS-A on-premises clusters reside in a private subnet behind the F5, which exposes Traefik for hosted applications.

Node Routing

Outbound traffic from the nodes behind the F5 to the internet is SNATed to the F5's public address. However, traffic to IPs local to VT's network is routed using the private address.

Networking#

EKS Anywhere clusters natively use Cilium, which the team has left in place. The current procedure is to let the version float with the EKS Anywhere lifecycle management tools.

Storage#

vSAN#

vSAN is the default storage class on the cluster. vSAN allows the hypervisors in a cluster to pool their resources into a virtual SAN, reducing cost and keeping storage close to the workloads. vSAN is exposed through a CSI driver.

NFS#

NFS is currently provisioned as part of the landlord. Our current implementation of NFS is supported by the NFS CSI Driver.

Object Storage#

Object storage is not currently supported in our on-premises EKS-A clusters.

Logging#

Splunk#

A Filebeat DaemonSet runs on all nodes, collecting logs from the system and pods and forwarding them to Splunk.

DVLP cluster#

PPRD cluster#

PROD cluster#

Accessing the clusters through platform-gitlab.cc.vt.edu#

On-prem clusters are provisioned and managed through platform-gitlab.cc.vt.edu.

  1. SSH to platform-gitlab.cc.vt.edu.
    ssh <PID>@platform-gitlab.cc.vt.edu
    
  2. Switch to the gitlab-runner user, authenticating with your JWT token or your PID password and 2FA when prompted.
    sudo -u gitlab-runner -i
    
  3. Cluster configurations are kept at /apps/eksa/[dvlp|pprd|prod]/[dvlp|pprd|prod]-workload/[dvlp|pprd|prod]-workload-eks-a-cluster.kubeconfig. For example, to load the kubeconfig for the DVLP cluster:
    export KUBECONFIG=/apps/eksa/dvlp/dvlp-workload/dvlp-workload-eks-a-cluster.kubeconfig