Autopilot enables automated workflows for managing Raft clusters. The current feature set includes three main features: Server Stabilization, Dead Server Cleanup, and State API. All three were introduced in Vault 1.7.
Server stabilization helps to retain the stability of the Raft cluster by safely
joining new voting nodes to the cluster. When a new voter node is joined to an
existing cluster, Autopilot adds it as a non-voter instead, and waits for a
pre-configured amount of time to monitor its health. If the node remains
healthy for the entire duration of stabilization, then that node is
promoted to a voter. The server stabilization period can be tuned using
server_stabilization_time (see below).
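The promotion rule described above can be modeled in a few lines. This is a minimal sketch, not Vault's implementation: the function name `should_promote` and its arguments are hypothetical, the 10-second window is only an illustrative value for server_stabilization_time, and the sketch assumes that any health failure resets the start of the window.

```python
from datetime import datetime, timedelta
from typing import Optional


def should_promote(healthy_since: Optional[datetime],
                   now: datetime,
                   stabilization_time: timedelta) -> bool:
    """Return True once a non-voter has stayed healthy for the whole window.

    healthy_since is None while the node is unhealthy. Any health failure
    is assumed to reset it, which restarts the stabilization window.
    """
    if healthy_since is None:
        return False
    return now - healthy_since >= stabilization_time


# Illustrative values only.
window = timedelta(seconds=10)
joined = datetime(2024, 1, 1, 12, 0, 0)

# Four seconds in: the node stays a non-voter.
early = should_promote(joined, joined + timedelta(seconds=4), window)
# Ten seconds of uninterrupted health: the node is eligible for promotion.
stable = should_promote(joined, joined + timedelta(seconds=10), window)
print(early, stable)
```

Keeping the node as a non-voter during the window means a flapping node never counts toward quorum, which is the point of the stabilization period.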
Dead server cleanup automatically removes nodes deemed unhealthy from the
Raft cluster, avoiding manual operator intervention. This feature can be
tuned using min_quorum (see below).
The State API provides detailed information about all the nodes in the Raft cluster in a single call. This API can be used to monitor cluster health.
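For monitoring, the state can be fetched over Vault's HTTP API. The sketch below assumes the state endpoint is served at /v1/sys/storage/raft/autopilot/state and that the token is passed in the X-Vault-Token header. The address, token, and helper names are placeholders, and the response is returned as parsed JSON without assuming any of its fields.

```python
import json
import urllib.request

STATE_PATH = "/v1/sys/storage/raft/autopilot/state"  # assumed endpoint path


def build_state_request(vault_addr: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for the Autopilot state."""
    url = vault_addr.rstrip("/") + STATE_PATH
    return urllib.request.Request(
        url, headers={"X-Vault-Token": token}, method="GET"
    )


def fetch_state(vault_addr: str, token: str, timeout: float = 5.0) -> dict:
    """Send the request and return the decoded JSON body."""
    req = build_state_request(vault_addr, token)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)


# Placeholder address and token; nothing is sent here.
req = build_state_request("https://vault.example.com:8200/", "s.placeholder")
print(req.get_method(), req.full_url)
```

Splitting request construction from sending keeps the URL and header logic easy to check without a running cluster.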
Follower node health is determined by two factors:
- Its ability to heartbeat to leader node at regular intervals. Tuned using
- Its ability to keep up with data replication from the leader node. Tuned using
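The two factors combine as a conjunction: a follower counts as healthy only when both hold. The following is a minimal sketch of that check, not Vault's code. The names `contact_limit` and `max_lag` are hypothetical stand-ins for the two tunables, and all numbers are made up for illustration.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class FollowerStatus:
    since_last_contact: timedelta  # time since the last heartbeat with the leader
    last_index: int                # last Raft log index held by the follower


def is_healthy(follower: FollowerStatus,
               leader_last_index: int,
               contact_limit: timedelta,
               max_lag: int) -> bool:
    """Healthy means recent heartbeat AND replication not too far behind."""
    heartbeat_ok = follower.since_last_contact <= contact_limit
    replication_ok = (leader_last_index - follower.last_index) <= max_lag
    return heartbeat_ok and replication_ok


# Illustrative thresholds (hypothetical values).
limit = timedelta(seconds=10)
lag = 1000

caught_up = FollowerStatus(timedelta(seconds=1), last_index=4990)
silent = FollowerStatus(timedelta(seconds=30), last_index=4990)
lagging = FollowerStatus(timedelta(seconds=1), last_index=100)

results = [is_healthy(f, 5000, limit, lag) for f in (caught_up, silent, lagging)]
print(results)
```

Only the first follower passes: the second has gone too long without contact, and the third trails the leader by more log entries than the allowed lag.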
By default, Autopilot is enabled on clusters running Vault 1.7+, although dead server cleanup is not enabled by default. Raft clusters deployed with older versions of Vault also transition to Autopilot automatically when they are upgraded.
Autopilot exposes a configuration API to manage its behavior. Autopilot gets initialized with the following default values.
min_quorum - This has no default and must be set to at least 3 when
cleanup_dead_servers is set to true.
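The min_quorum constraint can be checked on the client side before a configuration is submitted. This is a sketch under stated assumptions: `validate_autopilot_config` is a hypothetical helper, not part of Vault, and an unset min_quorum is represented as None.

```python
from typing import Optional

MIN_ALLOWED_QUORUM = 3  # "at least 3", per the text above


def validate_autopilot_config(cleanup_dead_servers: bool,
                              min_quorum: Optional[int]) -> None:
    """Raise ValueError if dead server cleanup is on without a valid min_quorum."""
    if not cleanup_dead_servers:
        # min_quorum has no default and is only required with cleanup enabled.
        return
    if min_quorum is None or min_quorum < MIN_ALLOWED_QUORUM:
        raise ValueError(
            f"min_quorum must be at least {MIN_ALLOWED_QUORUM} "
            "when cleanup_dead_servers is enabled"
        )


# Cleanup disabled: min_quorum may stay unset.
validate_autopilot_config(cleanup_dead_servers=False, min_quorum=None)

# Cleanup enabled with a valid quorum floor.
validate_autopilot_config(cleanup_dead_servers=True, min_quorum=3)

# Cleanup enabled without min_quorum is rejected.
try:
    validate_autopilot_config(cleanup_dead_servers=True, min_quorum=None)
    rejected = False
except ValueError:
    rejected = True
print(rejected)
```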
Performance secondary clusters have their own Autopilot configuration, managed independently of their primary.
DR secondary clusters will also have their own Autopilot configuration (starting in Vault 1.8.0), managed independently of their primary. The Autopilot API uses DR operation tokens for authorization.
Refer to Integrated Storage Autopilot for a step-by-step tutorial.