The purpose of this tutorial is NOT to walk you through the storage migration steps. Instead, it provides a quick self-check to help you decide whether it is in your best interest to migrate the Vault storage from an external system to Integrated Storage.
You should read this guide if you are currently running a Vault environment backed by an external system, such as HashiCorp Consul, to persist Vault's encrypted data and are considering migrating to Vault's Integrated Storage.
Integrated Storage is an additional storage option introduced in Vault 1.4; it is not a requirement. Vault continues to support the external storage backends it currently supports (e.g. Consul).
- Understand the architectural differences
- Consul vs. Integrated Storage
- Self-check questions
It is important to understand the differences between a Vault cluster with an external storage backend and a cluster using Integrated Storage.
The recommended number of Vault instances in a cluster is 3. The Vault cluster connects to a Consul cluster, which may have 5 or more nodes, as shown in the diagram below. (That is a total of 8 virtual machines to host a Vault HA environment.)
The processing requirements depend on the encryption and messaging workloads. Memory requirements depend on the total size of the secrets held in memory. The Vault server itself has minimal storage requirements, but the Consul nodes should have a relatively high-performance hard disk system.
With Integrated Storage, the recommended number of Vault instances in a cluster is 5. In a single HA cluster, all Vault nodes share the data while an active node holds the lock; therefore, only the active node has write access. To achieve n-2 redundancy (meaning that the cluster can still function after losing 2 nodes), the ideal size for a Vault HA cluster is 5 nodes.
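The redundancy figure above follows from majority quorum, which the Raft consensus protocol behind Integrated Storage relies on. The following is a minimal sketch of that arithmetic, not Vault code; the function names are illustrative.

```python
def quorum_size(nodes: int) -> int:
    """Smallest majority of a cluster with `nodes` voting members."""
    if nodes < 1:
        raise ValueError("a cluster needs at least one node")
    return nodes // 2 + 1


def failure_tolerance(nodes: int) -> int:
    """Number of nodes that can be lost while a majority still remains."""
    return nodes - quorum_size(nodes)


if __name__ == "__main__":
    for n in (3, 5, 7):
        print(f"{n} nodes: quorum={quorum_size(n)}, "
              f"tolerates {failure_tolerance(n)} failure(s)")
```

Under this rule, a 3-node cluster tolerates one failure and a 5-node cluster tolerates two, which matches the n-2 target described above.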
Refer to the Integrated Storage documentation.
Because the data gets persisted on the same host, the Vault server should be hosted on a relatively high-performance hard disk system.
Integrated Storage eliminates the need for external storage; therefore, Vault is the only software you need to stand up a cluster. This means that the host machine must have disk capacity equal to or greater than that of the existing external storage backend.
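One way to sanity-check the capacity requirement is to compare the free space on the intended Vault data volume with the size of the existing backend's data. The sketch below assumes you already know that size in bytes; the `headroom` safety factor is an assumption made for illustration, not a Vault recommendation.

```python
import shutil


def has_enough_disk(path: str, required_bytes: int, headroom: float = 0.2) -> bool:
    """Return True if the filesystem holding `path` has at least
    `required_bytes` free, plus a safety margin.

    `headroom` is an assumed safety factor (20% by default).
    """
    if required_bytes < 0 or headroom < 0:
        raise ValueError("required_bytes and headroom must be non-negative")
    free = shutil.disk_usage(path).free
    return free >= required_bytes * (1 + headroom)
```

For example, `has_enough_disk("/opt/vault/data", 40 * 1024**3)` would check for roughly 48 GiB of free space; the path and the 40 GiB figure are hypothetical.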
The fundamental difference between Vault's Integrated Storage and Consul is that Integrated Storage stores everything on disk, while Consul KV stores everything in memory, which impacts the host's RAM.
It is recommended to avoid hosting Consul on an instance with burstable CPU.
Sizing for Vault with Consul as its storage backend:

| Size | CPU | Memory | Disk | Typical Cloud Instance Types |
|---|---|---|---|---|
| Small | 2 core | 4-8 GB RAM | 25 GB | AWS: m5.large; GCE: n1-standard-2, n1-standard-4 |
| Large | 4-8 core | 16-32 GB RAM | 50 GB | AWS: m5.xlarge, m5.2xlarge; Azure: Standard_D4_v3, Standard_D8_v3; GCE: n1-standard-8, n1-standard-16 |

Sizing for Vault with Integrated Storage:

| Size | CPU | Memory | Disk | Typical Cloud Instance Types |
|---|---|---|---|---|
| Small | 2 core | 8-16 GB RAM | 100 GB | AWS: m5.large, m5.xlarge; Azure: Standard_D2_v3, Standard_D4_v3; GCE: n2-standard-2, n2-standard-4 |
| Large | 4-8 core | 32-64 GB RAM | 200 GB | AWS: m5.2xlarge, m5.4xlarge; Azure: Standard_D8_v3, Standard_D16_v3; GCE: n2-standard-8, n2-standard-16 |

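To compare a candidate host against these figures, the lower bounds of each row can be encoded as data. The sketch below is illustrative only: it assumes the second sizing table above applies to nodes running Integrated Storage, and the names `TIERS` and `matching_tier` are hypothetical.

```python
from typing import Optional

# Lower bounds taken from the second sizing table above:
# (tier name, min CPU cores, min RAM in GB, min disk in GB).
# Ordered from largest to smallest so the largest matching tier wins.
TIERS = [
    ("large", 4, 32, 200),
    ("small", 2, 8, 100),
]


def matching_tier(cores: int, ram_gb: int, disk_gb: int) -> Optional[str]:
    """Return the largest tier whose minimums the host meets, or None."""
    for name, min_cores, min_ram, min_disk in TIERS:
        if cores >= min_cores and ram_gb >= min_ram and disk_gb >= min_disk:
            return name
    return None
```

A host with 2 cores, 4 GB of RAM, and 25 GB of disk returns `None`, which signals that a machine sized for the first table would need more memory and disk before it matches the second.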
If many secrets are generated or rotated frequently, this information must be flushed to disk often. The infrastructure should therefore have a relatively high-performance hard disk system when using Integrated Storage.
Vault's Integrated Storage is disk-bound; therefore, care should be taken when planning storage volume size and performance. For cloud providers, IOPS can be dependent on volume size and/or provisioned IOPS. It is recommended to provision IOPS and avoid burstable IOPS. Monitoring of IOPS performance should be implemented in order to tune the storage volume to the IOPS load.
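As a rough illustration of the monitoring step, IOPS can be derived from two samples of cumulative I/O counters, such as the completed read and write counts that Linux exposes in `/proc/diskstats` or the equivalent cloud provider metrics. The `DiskSample` shape and the 80% alert threshold below are assumptions for this sketch, not values from Vault.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiskSample:
    """Cumulative I/O counters captured at one moment (hypothetical shape)."""
    timestamp: float  # seconds, e.g. from time.monotonic()
    reads: int        # read operations completed so far
    writes: int       # write operations completed so far


def iops(earlier: DiskSample, later: DiskSample) -> float:
    """Average read plus write operations per second between two samples."""
    elapsed = later.timestamp - earlier.timestamp
    if elapsed <= 0:
        raise ValueError("samples must be in chronological order")
    ops = (later.reads - earlier.reads) + (later.writes - earlier.writes)
    return ops / elapsed


def exceeds_budget(measured_iops: float, provisioned_iops: float,
                   threshold: float = 0.8) -> bool:
    """Flag when usage reaches an assumed fraction of provisioned IOPS."""
    return measured_iops >= provisioned_iops * threshold
```

Sampling at a fixed interval and alerting when `exceeds_budget` returns `True` gives an early signal that the volume's provisioned IOPS may need to be raised.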
Because Consul KV is memory-bound, it is necessary to take snapshots frequently. Vault's Integrated Storage persists everything on disk, which eliminates the need for such frequent snapshot operations and reduces the performance cost they introduce. You should still take snapshots to back up the data so that you can restore it in case of data loss.
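A backup snapshot of Integrated Storage can be taken with the `vault operator raft snapshot save` CLI command. The sketch below wraps that command so it can be scheduled; the backup directory, the file naming scheme, and the function names are assumptions, and it presumes the `vault` binary is on the `PATH` with `VAULT_ADDR` and a valid token already set in the environment.

```python
import subprocess
from datetime import datetime, timezone
from pathlib import Path
from typing import List, Optional


def snapshot_command(backup_dir: str, now: Optional[datetime] = None) -> List[str]:
    """Build the CLI invocation that saves a snapshot to a timestamped file."""
    now = now or datetime.now(timezone.utc)
    # File naming scheme is an assumption made for this example.
    filename = f"vault-raft-{now:%Y%m%dT%H%M%SZ}.snap"
    target = Path(backup_dir) / filename
    return ["vault", "operator", "raft", "snapshot", "save", str(target)]


def take_snapshot(backup_dir: str) -> None:
    """Run the snapshot command; raises CalledProcessError on failure."""
    subprocess.run(snapshot_command(backup_dir), check=True)
```

Separating the command construction from its execution keeps the naming logic testable without a running Vault cluster.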
When considering disk performance, note that Vault writes data changes to disk immediately rather than in batched snapshots as Consul does, so it is important to monitor IOPS as well as disk queues to limit storage bottlenecks.
Inspection of Vault data differs considerably from the `consul kv` commands used to inspect Consul's KV store. Consult the Inspect Data in Integrated Storage tutorial to query Vault's Integrated Storage data.
The table below highlights the differences between Consul and Integrated Storage.

| Consideration | Consul as storage backend | Vault Integrated Storage |
|---|---|---|
| System requirement | Memory optimized machine | Storage optimized high IOPS machine |
| Data snapshot | Frequent snapshots | Normal data backup strategy |
| Snapshot automation | Snapshot agent (Consul Enterprise only) | Automatic snapshot (Vault Enterprise v1.6.0 and later) |
| Data inspection | Online, use `consul kv` commands | Offline, requires using recovery mode |
| Autopilot | Supported | Supported (Vault 1.7.0 and later) |

- Where is the product expertise?
  - Do you already have Consul expertise?
  - Are you concerned about lack of Consul knowledge?
- Do you currently experience any technical issue with Consul?
- What motivates the data migration from the current storage backend to the Integrated Storage?
  - Reduce the operational overhead?
  - Reduce the number of machines to run?
  - Reduce the cloud infrastructure cost?
- Do you have a staging environment where you can run production loads and verify that everything works as you expect?
- Have you thought through the storage backup process or workflow after migrating to the Integrated Storage?
- Do you currently rely heavily on using Consul to inspect Vault data?
If you are ready to migrate the current storage backend to Integrated Storage, refer to the Storage Migration Tutorial - Consul to Integrated Storage.
To deploy a new cluster with Integrated Storage, refer to the Vault HA Cluster with Integrated Storage tutorial.