Scaling guide for Terraform Enterprise on Kubernetes
This document describes how to scale Terraform Enterprise Flexible Deployment Option with kubernetes runtime.
Kubernetes cluster hardware requirements
The hardware requirements for your Kubernetes cluster are dependant on the specific workloads it will host. These requirements should be tailored to match the resource needs of your applications, considering factors such as CPU, RAM, storage, and network demands.
Terraform Enterprise pod requirements
Hardware sizing for terraform-enterprise pod
The minimum size would be appropriate for most initial production deployments, or for development/testing environments.
CPU | Memory |
---|---|
2 core | 3 GB RAM |
We tested with a minimum memory of 2.5GB and 0.75 vCPU and scaled up to maximum of 7.5GB and 4 vCPU.
resources: requests: memory: "2500Mi" cpu: "750m" limits: memory: "7500Mi" cpu: "4000m"
Sizing for PostgreSQL
The PostgreSQL Database should be sized according to the anticipated scale of the Terraform Enterprise deployment. A minimum of 16GB memory and 4 vCPU is recommended for most installations.
This sizing is the same as the one used for Terraform Enterprise-Replicated installations.
Type | CPU | Memory | Storage |
---|---|---|---|
Minimum | 4 core | 16 GB RAM | 50 GB |
Scaling | 8 core | 32 GB RAM | 50 GB |
Sizing for Redis cache
We tested with a minimum of 6GB Redis Cache with Multi-AZ enabled and Automated Failover. This sizing is the same as the one used for TFE-Replicated installations.
Scaling Terraform Enterprise
To handle increased computational workload requirements of your organization, you can scale Terraform Enterprise horizontally by increasing the number of terraform-enterprise pods and/or terraform-enterprise agent pods.
How to scale terraform-enterprise pods
When Terraform Enterprise is deployed on the Kubernetes cluster, a deployment object is created which manages the terraform-enterprise pods through a ReplicaSet.
The replicas
field in the spec
section of deployment.yaml controls the desired number of terraform-enterprise pods.
apiVersion: apps/v1kind: Deploymentmetadata: name: terraform-enterprise labels: app: terraform-enterprisespec: replicas: {{ .Values.replicaCount }} selector: matchLabels: app: terraform-enterprise template: metadata: annotations: {{- if .Values.pod.annotations }} {{ toYaml .Values.pod.annotations | indent 8 }} {{ end }} labels: app: terraform-enterprise
To scale the pods, update the value of replicaCount
on the values file on your Helm chart. This will in turn update the number of replicas on spec.replicas
section on the deployment file.
replicaCount: 1 ...
Afterwards, do a helm upgrade
to update the deployment.
The maximum number of terraform-enterprise pods supported is 5.
How to scale Terraform Enterprise runs
Terraform Enterprise executes runs by creating agent jobs in a different namespace, which in turn creates agent pods. Each run executes in its own agent pod. When the run is completed, the agent job and agent pod are automatically cleaned up. Increasing the maximum number of concurrent agent pods can reduce run queue lengths, resulting in reduced wait time for runs to begin execution.
The TFE_CAPACITY_CONCURRENCY
setting determines the maximum number of agent jobs that can be created by the Terraform Enterprise pod at a given time. The default concurrency is set at 10. TFE_CAPACITY_CONCURRENCY
applies to each terraform-enterprise pod. For example, if you have three terraform-enterprise pods, and TFE_CAPACITY_CONCURRENCY
is 10, the maximum number of agent pods for Terraform Enterprise will be 30.
You can also configure memory and cpu limits for individual agent pods. The memory and cpu configuration for agent pods is determined by the TFE_CAPACITY_MEMORY
and TFE_CAPACITY_CPU
values on the helm chart.
env: ... variables: TFE_CAPACITY_CONCURRENCY: "10" # Set the maximum number of concurrent runs, eg: 10 TFE_CAPACITY_CPU: "0" # Set the maximum CPU utilization. "0" equals unlimited. TFE_CAPACITY_MEMORY: "2048" # Set the maximum memory utilization, eg: "2048" equals 2048Mi.
To increase the number of agent pods that can be run concurrently, update the TFE_CAPACITY_CONCURRENCY
value on the values file on the helm chart and do a helm upgrade
to update the deployment.
The maximum value of TFE_CAPACITY_CONCURRENCY
supported is 50.
Impact on external resources
PostgreSQL
In our testing, we observed that CPU utilization and memory usage increased when scaling Terraform Enterprise pods with increasing capacity concurrency. We recommend closely monitoring both CPU utilization and memory usage and be prepared to scale up if necessary.
Redis
In our testing, we observed Redis CPU or memory spikes in some cases. We recommend monitoring your redis server and being prepared to scale up if necessary.