Consul
Autopilot Operator HTTP API
The /operator/autopilot
endpoints allow for automatic operator-friendly
management of Consul servers including cleanup of dead servers, monitoring
the state of the Raft cluster, and stable server introduction.
Please check the Autopilot tutorial for more details.
Read Configuration
This endpoint retrieves its latest Autopilot configuration.
Method | Path | Produces |
---|---|---|
GET | /operator/autopilot/configuration | application/json |
The table below shows this endpoint's support for blocking queries, consistency modes, agent caching, and required ACLs.
Blocking Queries | Consistency Modes | Agent Caching | ACL Required |
---|---|---|---|
NO | none | none | operator:read |
The corresponding CLI command is consul operator autopilot get-config
.
Parameters
dc
(string: "")
- Specifies the datacenter to query. This will default to the datacenter of the agent being queried. This is specified as part of the URL as a query string.stale
(bool: false)
- If the cluster does not currently have a leader an error will be returned. You can use the?stale
query parameter to read the Raft configuration from any of the Consul servers.
Sample Request
$ curl \
http://127.0.0.1:8500/v1/operator/autopilot/configuration
Sample Response
{
"CleanupDeadServers": true,
"LastContactThreshold": "200ms",
"MaxTrailingLogs": 250,
"ServerStabilizationTime": "10s",
"RedundancyZoneTag": "",
"DisableUpgradeMigration": false,
"UpgradeVersionTag": "",
"CreateIndex": 4,
"ModifyIndex": 4
}
For more information about the Autopilot configuration options, see the agent configuration section.
Update Configuration
This endpoint updates the Autopilot configuration of the cluster.
Method | Path | Produces |
---|---|---|
PUT | /operator/autopilot/configuration | application/json |
The table below shows this endpoint's support for blocking queries, consistency modes, agent caching, and required ACLs.
Blocking Queries | Consistency Modes | Agent Caching | ACL Required |
---|---|---|---|
NO | none | none | operator:write |
The corresponding CLI command is consul operator autopilot set-config
.
Parameters
dc
(string: "")
- Specifies the datacenter to query. This will default to the datacenter of the agent being queried. This is specified as part of the URL as a query string.cas
(int: 0)
- Specifies to use a Check-And-Set operation. The update will only happen if the given index matches theModifyIndex
of the configuration at the time of writing.CleanupDeadServers
(bool: true)
- Specifies automatic removal of dead server nodes periodically and whenever a new server is added to the cluster.LastContactThreshold
(string: "200ms")
- Specifies the maximum amount of time a server can go without contact from the leader before being considered unhealthy. Must be a duration value such as10s
.MaxTrailingLogs
(int: 250)
specifies the maximum number of log entries that a server can trail the leader by before being considered unhealthy.MinQuorum
(int: 0)
- specifies the minimum number of servers needed before Autopilot can prune dead servers.ServerStabilizationTime
(string: "10s")
- Specifies the minimum amount of time a server must be stable in the 'healthy' state before being added to the cluster. Only takes effect if all servers are running Raft protocol version 3 or higher. Must be a duration value such as30s
.RedundancyZoneTag
(string: "")
- Controls the node-meta key to use when Autopilot is separating servers into zones for redundancy. Only one server in each zone can be a voting member at one time. If left blank, this feature will be disabled.DisableUpgradeMigration
(bool: false)
- Disables Autopilot's upgrade migration strategy in Consul Enterprise of waiting until enough newer-versioned servers have been added to the cluster before promoting any of them to voters.UpgradeVersionTag
(string: "")
- Controls the node-meta key to use for version info when performing upgrade migrations. If left blank, the Consul version will be used.
Sample Payload
{
"CleanupDeadServers": true,
"LastContactThreshold": "200ms",
"MaxTrailingLogs": 250,
"MinQuorum": 3,
"ServerStabilizationTime": "10s",
"RedundancyZoneTag": "",
"DisableUpgradeMigration": false,
"UpgradeVersionTag": "",
"CreateIndex": 4,
"ModifyIndex": 4
}
Read Health
This endpoint queries the health of the autopilot status.
Method | Path | Produces |
---|---|---|
GET | /operator/autopilot/health | application/json |
The table below shows this endpoint's support for blocking queries, consistency modes, agent caching, and required ACLs.
Blocking Queries | Consistency Modes | Agent Caching | ACL Required |
---|---|---|---|
NO | none | none | operator:read |
Parameters
dc
(string: "")
- Specifies the datacenter to query. This will default to the datacenter of the agent being queried. This is specified as part of the URL as a query string.
Sample Request
$ curl \
http://127.0.0.1:8500/v1/operator/autopilot/health
Sample response
{
"Healthy": true,
"FailureTolerance": 0,
"Servers": [
{
"ID": "e349749b-3303-3ddf-959c-b5885a0e1f6e",
"Name": "node1",
"Address": "127.0.0.1:8300",
"SerfStatus": "alive",
"Version": "0.7.4",
"Leader": true,
"LastContact": "0s",
"LastTerm": 2,
"LastIndex": 46,
"Healthy": true,
"Voter": true,
"StableSince": "2017-03-06T22:07:51Z"
},
{
"ID": "e36ee410-cc3c-0a0c-c724-63817ab30303",
"Name": "node2",
"Address": "127.0.0.1:8205",
"SerfStatus": "alive",
"Version": "0.7.4",
"Leader": false,
"LastContact": "27.291304ms",
"LastTerm": 2,
"LastIndex": 46,
"Healthy": true,
"Voter": false,
"StableSince": "2017-03-06T22:18:26Z"
}
]
}
Healthy
is whether all the servers are currently healthy.FailureTolerance
is the number of redundant healthy servers that could be fail without causing an outage (this would be 2 in a healthy cluster of 5 servers).Servers
holds detailed health information on each server:ID
is the Raft ID of the server.Name
is the node name of the server.Address
is the address of the server.SerfStatus
is the SerfHealth check status for the server.Version
is the Consul version of the server.Leader
is whether this server is currently the leader.LastContact
is the time elapsed since this server's last contact with the leader.LastTerm
is the server's last known Raft leader term.LastIndex
is the index of the server's last committed Raft log entry.Healthy
is whether the server is healthy according to the current Autopilot configuration.Voter
is whether the server is a voting member of the Raft cluster.StableSince
is the time this server has been in its currentHealthy
state.
The HTTP status code will indicate the health of the cluster. If
Healthy
is true, then a status of 200 will be returned. IfHealthy
is false, then a status of 429 will be returned.
Read the Autopilot State
This endpoint queries the health of the autopilot status.
Method | Path | Produces |
---|---|---|
GET | /operator/autopilot/state | application/json |
The table below shows this endpoint's support for blocking queries, consistency modes, agent caching, and required ACLs.
Blocking Queries | Consistency Modes | Agent Caching | ACL Required |
---|---|---|---|
NO | none | none | operator:read |
The corresponding CLI command is consul operator autopilot state
.
Parameters
dc
(string: "")
- Specifies the datacenter to query. This will default to the datacenter of the agent being queried. This is specified as part of the URL as a query string.
Sample Request
$ curl \
http://127.0.0.1:8500/v1/operator/autopilot/state
Response Format
{
"Healthy": true,
"FailureTolerance": 1,
"OptimisticFailureTolerance": 4,
"Servers": {
"5e26a3af-f4fc-4104-a8bb-4da9f19cb278": {},
"10b71f14-4b08-4ae5-840c-f86d39e7d330": {},
"1fd52e5e-2f72-47d3-8cfc-2af760a0c8c2": {},
"63783741-abd7-48a9-895a-33d01bf7cb30": {},
"6cf04fd0-7582-474f-b408-a830b5471285": {}
},
"Leader": "5e26a3af-f4fc-4104-a8bb-4da9f19cb278",
"Voters": [
"5e26a3af-f4fc-4104-a8bb-4da9f19cb278",
"10b71f14-4b08-4ae5-840c-f86d39e7d330",
"1fd52e5e-2f72-47d3-8cfc-2af760a0c8c2"
],
"RedundancyZones": {
"az1": {},
"az2": {},
"az3": {}
},
"ReadReplicas": [
"63783741-abd7-48a9-895a-33d01bf7cb30",
"6cf04fd0-7582-474f-b408-a830b5471285"
],
"Upgrade": {}
}
Healthy
is whether all the servers are currently healthy.FailureTolerance
is the number of redundant healthy servers that could be fail without causing an outage (this would be 2 in a healthy cluster of 5 servers).OptimisticFailuretolerance
Enterprise is the maximum number of servers that could fail in the right order over the right period of time without causing an outage. This value is only useful when using the Redundancy Zones feature with autopilot.Servers
is a mapping of server ID to an object holding detailed information about that server. The format of the detailed info is documented in its own section.Leader
is the server ID of current leader. This value can be used as an index into theServers
object.Voters
is a list of server IDs that are voters. These values can be used as indexes into theServers
object.RedundancyZones
Enterprise is mapping of redundancy zone name to redundancy zone information. The format of the redundancy zone information is documented in its own section.ReadReplicas
Enterprise is a list of server IDs that autopilot has identified as read replicas. These will never be promoted. These values can be used as indexes into theServers
map.Upgrade
Enterprise is an object holding all the information about any ongoing automated upgrade. The format of this object is detailed in its own section.
Server Response Format
{
"ID": "1c3e3278-3f88-4a97-9f6a-1058584e8058",
"Name": "node1",
"Address": "198.18.0.1:8300",
"NodeStatus": "alive",
"Version": "1.9.0+ent",
"LastContact": "1.321ms",
"LastTerm": 4,
"LastIndex": 42,
"Healthy": true,
"StableSince": "2020-08-12T12:13:14Z",
"RedundancyZone": "az1",
"UpgradeVersion": "1.2.3",
"ReadReplica": false,
"Status": "voter",
"Meta": {
"build": "1.2.3",
"zone": "az1"
},
"NodeType": "redundancy-zone-voter"
}
ID
is the Raft ID of the server.Name
is the node name of the server.Address
is the address of the server.NodeStatus
is the SerfHealth check status for the server.Version
is the Consul version of the server.LastContact
is the time elapsed since this server's last contact with the leader.LastTerm
is the server's last known Raft leader term.LastIndex
is the index of the server's last committed Raft log entry.Healthy
is whether the server is healthy according to the current Autopilot configuration.StableSince
is the time this server has been in its currentHealthy
state.RedundancyZone
Enterprise is the name of the redundancy zone this server is within.UpgradeVersion
Enterprise is the version that will be used for automated upgrade calculations.ReadReplica
Enterprise indicates whether this server is a read replica or not.Status
indicates the current Raft status of this server. Possible values are:leader
,voter
,non-voter
, orstaging
.Meta
is the node metadata of this server. Values within this map are used for determining a server's redundancy zone and upgrade version.NodeType
is the desired type autopilot thinks this server should have. In Consul OSS the only possible value isvoter
as all present servers should having voting rights. In Consul Enterprise the possible values also includeread-replica
,zone-voter
,zone-standby
andzone-extra-voter
.zone-voter
indicates that autopilot wants this server to be the voter for a particular redundancy zone. When a zone has no voter all nodes will be typed as this until one is promoted. When that happens the other non-voters in the zone will be typed aszone-standby
. This indicates that they are currently desired to be standby servers in case the voter from the zone fails. Finally, thezone-extra-voter
status indicates that autopilot wants this server to be a voter due to a failure of all servers in another zone and that when one of the servers in that failed zone are restored, this server will be demoted.
Redundancy Zone Response Format Enterprise
{
"Servers": [
"10b71f14-4b08-4ae5-840c-f86d39e7d330",
"b007061c-6d15-4c90-b3d6-2fef276a0650"
],
"Voters": ["b007061c-6d15-4c90-b3d6-2fef276a0650"],
"FailureTolerance": 1
}
Each zone in the responses RedundancyZones
mapping will have this structure.
Servers
is a list of server IDs of all the servers in this zone. These values can be used as indexes into the top level response'sServers
mapping.Voters
is a list of server IDs of all servers in this zone that have voting rights. Typically this will be a list with 1 value but in some failure scenarios or upgrade scenarios the size could increase. These values can be used as indexes into the top level response'sServers
mapping.FailureTolerance
is the number of servers in this zone that could fail without causing a total zone failure and subsequent promotion of a server from another zone as a fallback.
Upgrade Information Response Format Enterprise
{
"Status": "awaiting-new-servers",
"TargetVersion": "1.9.1+ent",
"TargetVersionVoters": ["f0344689-3e1f-4125-b55d-e888d3abf514"],
"TargetVersionNonVoters": [
"619a4ba6-1a0b-476e-8a1a-28aeee7735a2",
"fd683fe6-541f-4ebf-bc5a-6eae51571ddb"
],
"TargetVersionReadReplicas": ["9f1e27ae-1129-45ef-97dd-6d8c3ec47e6a"],
"OtherVersionVoters": [
"0cbdd493-235f-48f2-98d9-1bf2443b9d72",
"21812bd7-2f21-4565-9892-2fdd3d4e1a99",
"c654ba5c-cc76-4056-a5ca-6e78d95f27ad"
],
"OtherVersionNonVoters": [
"6d973f11-6bdb-4f7d-8a90-c1300066da4c",
"6241ab45-371e-4b2a-a0f1-d847c3b7b1b0"
],
"OtherVersionReadReplicas": ["42d10fc3-581b-4403-832d-945b3a0d8841"]
}
Status
is the automated upgrade status. Possible values are:disabled
indicates that automated upgrades are disabled either from user configuration or due to being unlicensed.idle
indicates that there is no ongoing upgrade and that all servers are running the same Consul version.await-new-voters
indicates that a newer versioned server has been added but that autopilot is waiting for more servers of that version to be added before proceeding with the upgrade.promoting
indicates that enough servers of the target version have been added and autopilot will now promote them to voters.demoting
indicates that autopilot is currently demoting the servers not running the target version.leader-transfer
indicates that autopilot is in the process of transferring leadership to a server running the target version.await-new-servers
indicates that the majority of the upgrade is complete but that more servers running the target version need to be added to completely replace all of the previous servers.await-server-removal
indicates that the upgrade is complete and it is now safe to remove the previous servers.
TargetVersion
is the version that Autopilot is upgrading to. This will be the maximum version of all serversUpgradeVersion
field in the top levelServers
mapping.TargetVersionVoters
is a list of IDs of servers running the target version and that currently have voting rights.TargetVersionNonVoters
is a list of IDs of servers running the target version and that currently do not have voting rights.TargetVersionReadReplicas
is a list of IDs of servers running the target version and are read replicas.OtherVersionVoters
is a list of IDs of servers not running the target version and that currently have voting rights.OtherVersionNonVoters
is a list of IDs of servers not running the target version and that currently do not have voting rights.OtherVersionReadReplicas
is a list of IDs of servers not running the target version and are read replicas.