xDS load balancing in Consul
This page explains how enable_xds_load_balancing distributes xDS streams across Consul servers, when to enable it, and when to leave it disabled.
Overview
In a service mesh deployment, each sidecar Envoy (or Consul dataplane) maintains a long-lived xDS stream to the Consul control plane. In multi-server clusters, these streams should be spread across healthy servers so one server does not become a bottleneck.
When enable_xds_load_balancing is enabled, each server enforces a stream limit based on cluster size and connected proxy count. If the server is above its limit, it rejects new streams so clients reconnect to another server.
Why this matters
Without xDS load balancing, one server can receive a disproportionate number of streams. This can cause:
- Uneven resource utilization across servers.
- Higher update latency for impacted proxies.
- Larger blast radius if the overloaded server fails.
With xDS load balancing enabled, stream distribution remains closer to even during steady-state operations and cluster changes.
Configure enable_xds_load_balancing
The default value is false. You must explicitly set it to true to enable this behavior.
VM and bare metal servers
performance {
  enable_xds_load_balancing = true
}
To disable it, either omit the field or set it to false explicitly:
performance {
  enable_xds_load_balancing = false
}
Kubernetes (Helm)
Set the value through server.extraConfig:
server:
  extraConfig: |
    {
      "performance": {
        "enable_xds_load_balancing": true
      }
    }
How balancing works
Each server computes a maximum stream count from:
- Total number of mesh proxies in the catalog.
- Number of healthy Consul servers.
- A 10% buffer to reduce churn during normal fluctuations.
If a new xDS connection arrives when the server is over its limit, the server returns RESOURCE_EXHAUSTED and asks the client to try another server.
The limit updates automatically when:
- Proxy services are added or removed.
- Servers join, leave, or become unhealthy.
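The calculation described above can be sketched as follows. This is a minimal illustration, not Consul's implementation: the function name `ideal_stream_limit`, the rounding direction (up), and the way the 10% buffer is applied to an even per-server share are all assumptions made for the example.

```python
def ideal_stream_limit(total_proxies: int, healthy_servers: int,
                       buffer_percent: int = 10) -> int:
    """Hypothetical per-server limit: an even share of proxies plus a buffer.

    Assumption: the result is rounded up so that the combined capacity of
    all servers is never smaller than the number of proxies.
    """
    if healthy_servers <= 0:
        raise ValueError("at least one healthy server is required")
    # Integer ceiling division avoids floating-point rounding surprises
    # (for example, 300 / 3 * 1.1 is slightly above 110 in floating point).
    numerator = total_proxies * (100 + buffer_percent)
    denominator = healthy_servers * 100
    return -(-numerator // denominator)


if __name__ == "__main__":
    # The limit grows when a server is lost and shrinks when one joins.
    for proxies, servers in [(100, 3), (300, 3), (300, 2)]:
        print(proxies, servers, ideal_stream_limit(proxies, servers))
```

Under these assumptions, 300 proxies across 3 healthy servers give a limit of 110 streams per server. If one server becomes unhealthy, the limit for the remaining two rises to 165.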
Deployment guidance
Recommended: Consul dataplane with server discovery
When dataplanes discover servers dynamically, they can retry another server when they receive a load balancing rejection. In this model, enable xDS load balancing.
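The retry behavior can be illustrated with a small simulation. It does not use gRPC or any Consul API. `SimulatedServer`, `connect_with_retry`, and `simulate` are hypothetical names, a rejection is modeled as a `False` return value standing in for RESOURCE_EXHAUSTED, and the server is assumed to reject once its stream count reaches the limit.

```python
import random


class SimulatedServer:
    """Stand-in for a Consul server that enforces a stream limit."""

    def __init__(self, name, limit):
        self.name = name
        self.limit = limit
        self.streams = 0

    def accept(self):
        # Returning False models a RESOURCE_EXHAUSTED rejection.
        if self.streams >= self.limit:
            return False
        self.streams += 1
        return True


def connect_with_retry(servers, rng):
    """Try the discovered servers in random order until one accepts."""
    for server in rng.sample(servers, len(servers)):
        if server.accept():
            return server.name
    return None  # every known server was at its limit


def simulate(num_proxies, num_servers, limit, seed=0):
    """Return (streams per server, number of proxies left unconnected)."""
    rng = random.Random(seed)
    servers = [SimulatedServer(f"server-{i}", limit) for i in range(num_servers)]
    unplaced = sum(
        connect_with_retry(servers, rng) is None for _ in range(num_proxies)
    )
    return [s.streams for s in servers], unplaced


if __name__ == "__main__":
    print(simulate(num_proxies=90, num_servers=3, limit=33))
    print(simulate(num_proxies=90, num_servers=1, limit=33))
```

In this model, 90 proxies that can reach three servers with a limit of 33 each all connect, and no server exceeds its limit. If the proxies know only one server, 57 of them are left without a stream.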
External layer-4 load balancer in front of Consul servers
If you front servers with an external load balancer (NLB, HAProxy, Envoy Gateway), keep enable_xds_load_balancing = false and let the load balancer distribute connections.
Single static server address (not recommended)
If every dataplane is configured with a single server address and there is no load balancer in front of the servers, enabling xDS load balancing can prevent many dataplanes from establishing streams once that server reaches its limit, because they have no other server to retry. Use server discovery or provide multiple server addresses instead.
Verify the setting
Use the agent self endpoint:
$ curl -s http://localhost:8500/v1/agent/self | jq '.DebugConfig.EnableXDSLoadBalancing'
Monitor xDS balancing
Track the following metrics:
- consul.xds.server.streams: Active xDS streams on this server.
- consul.xds.server.idealStreamsMax: Calculated stream limit for this server.
- consul.xds.server.streamDrained: Streams gracefully drained for redistribution.
$ curl -s 'http://localhost:8500/v1/agent/metrics?format=prometheus' | grep xds
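To compare the active stream count with the calculated limit, the Prometheus output can be parsed with a short script. The sketch below makes several assumptions: metric names appear with dots replaced by underscores (for example `consul_xds_server_streams`), sample lines carry no trailing timestamp, and each metric has a single label set. The helper names and the sample text are made up for illustration and are not real agent output.

```python
def parse_xds_gauges(prometheus_text):
    """Collect xDS-related samples from Prometheus text exposition output."""
    values = {}
    for line in prometheus_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        name_and_labels, _, value = line.rpartition(" ")
        name = name_and_labels.split("{", 1)[0]  # drop any label block
        if "xds" in name:
            values[name] = float(value)
    return values


def headroom(values,
             streams_key="consul_xds_server_streams",
             limit_key="consul_xds_server_idealStreamsMax"):
    """Streams this server could still accept before reaching its limit."""
    return values[limit_key] - values[streams_key]


# Made-up sample in the assumed format, for illustration only.
SAMPLE = """\
# TYPE consul_xds_server_streams gauge
consul_xds_server_streams 31
# TYPE consul_xds_server_idealStreamsMax gauge
consul_xds_server_idealStreamsMax 37
consul_raft_apply 5
"""

if __name__ == "__main__":
    gauges = parse_xds_gauges(SAMPLE)
    print(gauges)
    print("headroom:", headroom(gauges))
```

A server whose stream count stays at or near its limit while other servers have headroom may indicate that clients cannot reach those other servers.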