Sizing guidelines
Consul server nodes process all read and write operations from the agents and maintain consensus across the cluster. As a result, they are I/O bound for writes and CPU bound for reads. Take this into account when sizing servers, and put monitoring in place so that you can adjust resources to match the type of workload in the cluster.
Workloads on virtual machines require Consul client agents for service discovery and service mesh, so the following guidelines are based on that requirement.
As a general rule, we recommend a maximum of 5,000 Consul client agents in a single datacenter. This estimate is based on the impact on recovery time, the volume of write and read requests, and other factors. For read-heavy clusters, we recommend deploying read replicas to improve scalability. Some customers have scaled Consul to tens of thousands of agents per cluster, but whether that is feasible depends heavily on the cluster's read and write workloads. As the cluster scales, customers must therefore optimize for stability at the gossip layer. The two main factors that affect gossip stability with client agents are:
- Total size of the gossip pool
- The churn of nodes/agents in the pool
Control plane on VMs
We recommend deploying, at a minimum, the following instance types for the Consul servers, broken down by cluster size. Start with the Initial size and, once adoption grows, vertically scale the servers to one of the larger sizes.
Size | Potential Instance Type | CPU (vCPUs) | Memory (GiB) | Disk Capacity | Disk IO |
---|---|---|---|---|---|
Initial | m5.large | 2 | 8 | min: 100 GB (gp3) | min: 3000 IOPS |
Small | m5.xlarge | 4 | 16 | min: 100 GB (gp3) | min: 3000 IOPS |
Large | m5.2xlarge | 8 | 32 | min: 200 GB (gp3) | min: 7500 IOPS |
Extra-Large | m5.4xlarge | 16 | 64 | min: 200 GB (gp3) | min: 7500 IOPS |
The architecture above supports a large number of agent-based clients. If you provision a single datacenter with this architecture, we strongly recommend monitoring cluster metrics to establish a baseline and set threshold levels.
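One way to turn "establish a baseline and set threshold levels" into practice is sketched below. It is a minimal illustration, not a Consul recommendation: the function names, the sample latency values, and the choice of mean plus three standard deviations are all assumptions made for this example. Which metrics to track depends on your workload.

```python
from statistics import mean, stdev


def baseline_and_threshold(samples, k=3.0):
    """Return (baseline, threshold) for a list of metric samples.

    baseline  = mean of the samples
    threshold = baseline + k * sample standard deviation
    The multiplier k is an illustrative choice, not a Consul recommendation.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples to estimate spread")
    baseline = mean(samples)
    return baseline, baseline + k * stdev(samples)


def breaches(samples, threshold):
    """Return the samples that exceed the threshold."""
    return [s for s in samples if s > threshold]


if __name__ == "__main__":
    # Hypothetical latency readings in milliseconds from a quiet period.
    history = [10.0, 12.0, 11.0, 13.0, 9.0]
    baseline, threshold = baseline_and_threshold(history)
    print(baseline, threshold)
    # Compare new readings against the threshold derived from the history.
    print(breaches([11.0, 40.0], threshold))
```

A statistical threshold like this adapts to each cluster's normal behavior. A fixed limit may be more appropriate for metrics with a known hard ceiling, such as disk capacity.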
Control plane on Kubernetes
Use the CPU and memory recommendations to set resource limits for the Consul pods, and apply the disk recommendations when configuring persistent volumes. Both limits and requests should be set in the Helm chart. Below is an example Helm configuration snippet for deploying a Consul server in a large environment.
server:
  resources: |
    requests:
      memory: "32Gi"
      cpu: "8"
    limits:
      memory: "32Gi"
      cpu: "8"
  storage: 200Gi
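The same pattern applies to the other sizes in the table. The sketch below renders an equivalent `server` block for any size. It assumes the table's memory column is in GiB and reuses the disk capacity as the storage size, as the Large example above does. The `SIZING` dictionary and the `helm_server_values` helper are hypothetical names introduced for illustration and are not part of the Consul Helm chart.

```python
# Sizing tiers copied from the table above: (vCPUs, memory, disk capacity).
# Assumption: memory is in GiB, and disk capacity maps to "Gi" storage,
# as in the Large example snippet.
SIZING = {
    "Initial": (2, 8, 100),
    "Small": (4, 16, 100),
    "Large": (8, 32, 200),
    "Extra-Large": (16, 64, 200),
}


def helm_server_values(size: str) -> str:
    """Render the `server` block of a Helm values file for one sizing tier.

    Requests and limits are set to the same values, mirroring the
    example snippet for the Large size.
    """
    cpu, memory_gib, disk_gib = SIZING[size]
    lines = [
        "server:",
        "  resources: |",
        "    requests:",
        f'      memory: "{memory_gib}Gi"',
        f'      cpu: "{cpu}"',
        "    limits:",
        f'      memory: "{memory_gib}Gi"',
        f'      cpu: "{cpu}"',
        f"  storage: {disk_gib}Gi",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(helm_server_values("Initial"))
```

Keeping the table as data in one place means a later move from Initial to a larger size is a one-word change rather than a hand edit of four values.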
HashiCorp recommends monitoring your production deployment so that you can make data-driven decisions about when to raise server resource limits or vertically scale the VM deployments.