Vault
Performance tuning
This tutorial focuses on tuning your Vault environment for optimal performance. Refer to Vault Limits and Maximums for known upper limits on the size of certain fields and objects, and configurable limits on others.
Vault is a high-performance secrets management and data protection solution capable of handling enterprise-scale workloads. As you scale your usage and adopt broader use cases, it could become necessary to tune Vault, its underlying operating system, and storage backend for optimal performance.
The goal here is to provide guidelines and best practices for tuning the Vault environment necessary to achieve optimal performance, not to document requirements. These are best practice recommendations that should be applied when possible and practical based on your specific environment and requirements, along with some important Vault resource limitations to consider.
Your focus will be on a limited range of tunable parameters, which are grouped as follows:
- Linux OS tuning covers critical OS configuration items for ideal operations.
- Vault tuning details the configuration tuning for Vault itself and primarily includes core configuration items.
- Storage backend tuning contains items of note which are specific to the storage backend in use.
If your aim is to use what you learn here to tune production systems, then you should first become familiar with guidance from the Reference Architecture and Deployment Guide and ensure that your Vault cluster deployment aligns with guidance there before proceeding with this tutorial.
Production Hardening is also an extremely useful resource to learn about hardening your clusters for production.
Table of contents:
- Performance investigation
- Linux OS tuning
- Vault tuning
- Storage backend tuning
- Resource limits & maximums
- Help and reference
Performance investigation
Part of performance tuning involves investigation by observing and measuring the current characteristics of a system. This investigation can be facilitated through numerous methods and tools. One such methodology for analyzing the performance of a system is the Utilization, Saturation, and Errors (USE) method.
This method proposes a technique to use early in performance investigation that involves checking the following for each relevant resource:
- Utilization - did you get an alert about low storage capacity or notice out of memory errors, for example?
- Saturation - are there signs that the storage IOPS are at their allowed maximum, for example?
- Errors - are there errors in the application logs or Vault logs, for example? Are they persistent while performance degrades?
You can apply the USE method to Vault cluster system resources and gain an idea of existing bottlenecks or issues as part of your initial performance investigation.
Elements of this method will be used throughout the tutorial. For example, when investigating the performance of failover in a highly available cluster, certain warnings or errors (the 'E' in USE) can provide feedback on resources which could be tuned to increase performance without errors going forward.
Likewise, you can use features like telemetry to gather metrics and measure the utilization and saturation of resources in your Vault cluster.
Review Monitor Telemetry & Audit Device Log Data with Splunk to learn more about using Vault telemetry and audit device metrics in an environment based on Fluentd, Telegraf, and Splunk.
Tip
When you are able to gather, investigate, and measure data from Vault cluster environments you can also more accurately inform your performance tuning decisions.
Performance investigation tools
The USE Method provides a comprehensive checklist for Linux systems that is great for investigating system level performance, and also details tools used for investigating utilization and saturation aspects of each resource.
Some of the most common tools you can use to help with performance investigation at the physical system or virtual machine level are also listed here for your reference.
Component | Tools | Notes |
---|---|---|
CPU | dstat, htop, lscpu, sar, top, vmstat | dstat does not have a Python 3 implementation; Red Hat users can emulate dstat with Performance Co-Pilot. |
Memory | free, sar, vmstat | |
Storage | df, iostat, sar, swapon | |
Network | ifconfig, netstat | |
For users in containerized and Kubernetes environments, there exist a range of higher level tools to better serve the specific troubleshooting challenges of those environments.
Some solutions in common use include:
- Sysdig Inspect is a powerful open source interface for container troubleshooting.
Linux OS tuning
When the underlying operating system is properly configured and tuned, Vault operations will benefit and issues can be prevented.
In this section you will learn about Linux OS tunable configuration for ideal Vault operations.
User limits
The Linux kernel can impose user limits (also known as ulimits) on a per-user, per-process, or system-wide basis. These limits were historically designed to help prevent any one user or process from consuming available resources on multi-user and multi-process systems. On a contemporary Linux system these ulimits are typically controlled by systemd process properties.
For task-specific systems like Vault servers, which typically host a minimum number of running processes and no multi-user interactive sessions, the default ulimits can be too low and cause issues.
Current limits for a running vault process can always be read from the kernel process table under the relevant process ID (PID). In this example, the pidof command is used to dynamically get the vault PID and insert it into the path to retrieve the correct values.
$ cat /proc/$(pidof vault)/limits
A successful response resembles this example output.
Limit Soft Limit Hard Limit Units
Max cpu time unlimited unlimited seconds
Max file size unlimited unlimited bytes
Max data size unlimited unlimited bytes
Max stack size 8388608 unlimited bytes
Max core file size 0 unlimited bytes
Max resident set unlimited unlimited bytes
Max processes 7724 7724 processes
Max open files 1024 4096 files
Max locked memory 16777216 16777216 bytes
Max address space unlimited unlimited bytes
Max file locks unlimited unlimited locks
Max pending signals 7724 7724 signals
Max msgqueue size 819200 819200 bytes
Max nice priority 0 0
Max realtime priority 0 0
Max realtime timeout unlimited unlimited us
The output shows the limit name and three values:
- Soft Limit is a user configurable value the kernel will enforce that cannot exceed the hard limit.
- Hard Limit is a root user configurable value the kernel will enforce that cannot exceed the system-wide limit.
- Units represent the measurement type for the limit.
While there are 16 distinct limits shown in the output, this tutorial focuses on 2 of them in detail: Max open files and Max processes.
Note
Be cautious when using approaches such as ulimit -a to get user limits, as the limits output from that command are for the current user and do not necessarily match those of the user ID under which your Vault or Consul processes actually execute.
Max open files
A running Vault server consumes file descriptors both for accessing files on a filesystem and for representing connections established to other network hosts as sockets.
The value of maximum open files allowed to the Vault process is a critical user limit that you should appropriately tune for ideal performance.
How to measure usage?
To inspect only the current maximum open files values for the vault process, read them from the kernel process table.
$ cat /proc/$(pidof vault)/limits | awk 'NR==1; /Max open files/'
A successful response includes the heading descriptions and values:
Limit Soft Limit Hard Limit Units
Max open files 1024 4096 files
To get a more verbose picture of open files, you can also use the lsof command like this.
$ sudo lsof -p $(pidof vault)
Example output:
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
vault 14810 vault cwd DIR 253,0 4096 2 /
vault 14810 vault rtd DIR 253,0 4096 2 /
vault 14810 vault txt REG 253,0 138377265 131086 /usr/local/bin/vault
vault 14810 vault 0r CHR 1,3 0t0 6 /dev/null
vault 14810 vault 1u unix 0xffff89e6347f9c00 0t0 41148 type=STREAM
vault 14810 vault 2u unix 0xffff89e6347f9c00 0t0 41148 type=STREAM
vault 14810 vault 3u unix 0xffff89e6347f8800 0t0 41208 type=DGRAM
vault 14810 vault 4u a_inode 0,13 0 9583 [eventpoll]
vault 14810 vault 6u IPv4 40467 0t0 TCP *:8200 (LISTEN)
vault 14810 vault 7u IPv4 41227 0t0 TCP localhost:53766->localhost:8500 (ESTABLISHED)
This is a minimal example taken from a newly unsealed Vault. You can expect much more output in a production Vault with several use cases. The output is helpful for spotting the specific source of open connections, such as numerous sockets to a database secrets engine, for example.
Here, you can observe that the last 2 lines are related to 2 open sockets.
First, file descriptor 6 is open with read and write permission (u), is of type IPv4, and is a TCP socket bound to port 8200 on all network interfaces.
Second, file descriptor 7 represents the same kind of socket, except that it is an outbound connection from Vault on ephemeral port TCP/53766 to the Consul client agent on localhost, which listens on port 8500.
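To put the lsof listing in context, it can help to compare the number of descriptors currently open with the soft limit. The following is a minimal sketch, not an official Vault tool: the function name fd_usage is hypothetical, and it assumes a Linux proc filesystem in which each entry under the process's fd directory is one open descriptor.

```shell
# fd_usage PROC_DIR
# Hypothetical helper: PROC_DIR is a process directory such as /proc/1234.
# Prints the open descriptor count, the soft "Max open files" limit,
# and the integer percentage of the soft limit currently in use.
fd_usage() {
  proc_dir="$1"
  # Each entry in the fd directory is one open file descriptor.
  open=$(ls "$proc_dir/fd" | wc -l | tr -d ' ')
  # Field 4 of the "Max open files" row is the soft limit.
  soft=$(awk '/^Max open files/ {print $4}' "$proc_dir/limits")
  echo "open=$open soft_limit=$soft percent=$(( open * 100 / soft ))"
}
```

You could call it as fd_usage /proc/$(pidof vault), run as root or as the user that owns the Vault process, since reading another user's fd directory requires elevated privileges. A percentage that stays close to 100 suggests saturation of this limit.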
What are common errors?
When the value for maximum open files is not sufficient, Vault will emit errors to its operational logging in the format of this example.
http: Accept error: accept tcp4 0.0.0.0:8200: accept4: too many open files; retrying in 1s
There are several important parts to this log line:
- The Vault http subsystem is the error source (http:).
- Since the error originates from http, the error also relates to exhausting file descriptors in the context of network sockets, not regular files (that is, note accept4() instead of open()).
- The most critical fragment of the message, and the one that explains the root of the immediate issue, is too many open files.
This is both a red alert that there are currently insufficient file descriptors and a sign that something could be excessively consuming them.
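As part of the Errors step of the USE method, you can count how often this message appears over a period of time. This is a minimal sketch that assumes the Vault operational log is available as text on standard input; the function name count_fd_errors is hypothetical.

```shell
# count_fd_errors
# Hypothetical helper: reads log lines on stdin and prints how many
# contain the "too many open files" fragment.
count_fd_errors() {
  # grep -c prints the match count; "|| true" keeps a zero count
  # from being treated as a failure by the calling shell.
  grep -c 'too many open files' || true
}
```

Assuming Vault runs as a systemd unit named vault, one possible invocation is journalctl -u vault --since "1 hour ago" | count_fd_errors. A count that keeps rising while performance degrades points to descriptor exhaustion rather than a one-off spike.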
You should remedy the issue by increasing the maximum open files limit and restarting the Vault service for each affected cluster peer. There are implications and limitations around raising the value that you should be aware of before doing so.
First, there is a system-wide maximum open files limit that is enforced by the kernel and cannot be exceeded by user programs like Vault. Note that this value is dynamically set at boot time and varies depending on the physical computer system characteristics, such as available physical memory.
To check the current system-wide maximum open files value for a given system, read it from the proc filesystem.
$ cat /proc/sys/fs/file-max
A successful response includes only the raw value:
197073
On this example system, it will not be possible to specify a maximum open file limit that exceeds 197073.
Increase limits
In the case of the previous example output, you observed that the maximum open files for the Vault process had a soft limit of 1024 and a hard limit of 4096. These are often the default values for some Linux distributions and you should always increase the value beyond such defaults for using Vault in production.
Once you have determined the system-wide limit, you can appropriately increase the limit for Vault processes. With a contemporary systemd based Linux, you can do so by editing the Vault systemd service unit file, and specifying a value for the LimitNOFILE process property.
The systemd unit file name can vary, but often it is vault.service, and located at the path /etc/systemd/system/vault.service.
Edit the file as the system super user.
$ sudo $EDITOR /etc/systemd/system/vault.service
Then either add the LimitNOFILE process property under [Service] or edit its value if it already exists so that both the soft and hard limits are increased to a reasonable baseline value of 65536.
LimitNOFILE=65536
Save the file, exit your editor.
Any change to the unit requires a daemon reload; go ahead and do that now.
$ sudo systemctl daemon-reload
A successful response should include no output.
The next time the vault service is restarted, the new maximum open files limits will be in effect.
You can restart the service, then examine the process table again to confirm your changes are in place.
Note
You should be careful about this step in production systems as it can trigger a cluster leadership change. Depending on your Vault seal type, restarting the service could mean that you also need to unseal Vault if not using an auto seal type, so be prepared to do so if that is your case.
First, restart the vault service.
$ sudo systemctl restart vault
Once restart successfully completes, check the process table for the new vault process.
$ cat /proc/$(pidof vault)/limits | awk 'NR==1; /Max open files/'
A successful response should include the updated values:
Limit Soft Limit Hard Limit Units
Max open files 65536 65536 files
Tip
For an example Vault systemd unit file that also includes this process property, refer to enable and start the service in the Vault Deployment Guide.
A note about CPU scaling
There can be an expectation that Vault will scale linearly up to 100% CPU usage when tuning specific workloads, such as the Transit or Transform Secrets engine encryption, but this is typically unrealistic.
Part of the reason for this relates to the performance of Go, the programming language that Vault is written in. In Go, there is a notion of goroutines, which are functions or methods that run concurrently with other functions or methods. The more goroutines that are scheduled at once, the more context switching has to be performed by the system, the more interrupts will be sent by the network card, and so on.
This behavior may not represent a substantial toll on the CPU in terms of real CPU utilization, but it can impair I/O because each time a goroutine blocks for I/O (or is preempted due to an interrupt) it can be longer each time before that goroutine gets back into service.
You should keep this in mind whenever tuning CPU heavy workloads in Vault.
Vault tuning
The following sections relate to tuning of the Vault software itself through the use of available configuration parameters, features, or functionality.
Where possible, guidance is given and examples are provided.
Cache size
Vault uses a Least Recently Used (LRU) read cache for the physical storage subsystem with a tunable value, cache_size. The value is the number of entries and the default value is 131072.
The total cache size depends on the size of stored entries.
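Because cache_size counts entries rather than bytes, a rough upper bound on cache memory is the entry count multiplied by an average entry size. The sketch below is illustrative arithmetic only: the function name is hypothetical, the average entry size is something you would have to estimate for your own data, and the calculation ignores any per-entry overhead.

```shell
# cache_upper_bound_bytes ENTRIES AVG_ENTRY_BYTES
# Hypothetical helper: multiplies the number of cache entries by an
# assumed average entry size to estimate an upper bound in bytes.
cache_upper_bound_bytes() {
  entries="$1"
  avg_entry_bytes="$2"
  echo $(( entries * avg_entry_bytes ))
}
```

With the default of 131072 entries and an assumed average of 4096 bytes per entry, cache_upper_bound_bytes 131072 4096 prints 536870912, or 512 MiB.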
Note
LIST operations are not cached.
Maximum request duration
Vault provides two tunable parameters that limit the maximum allowed duration of a request. They are intended for use cases with strict service level agreements around request duration, or other needs for enforcing a request duration of a specific length.
At the server-wide level, there is default_max_request_duration with a default value of 90 seconds (90s). Again, tuning of this value is for very specific use cases and affects every request made against the entire node, so do keep this in mind.
Here is an example minimal Vault configuration that shows the use of an explicit default_max_request_duration setting.
api_addr = "https://127.0.0.1:8200"
default_max_request_duration = "30s"
listener "tcp" {
  address       = "127.0.0.1:8200"
  tls_cert_file = "/etc/pki/vault-server.crt"
  tls_key_file  = "/etc/pki/vault-server.key"
}
storage "consul" {
  address = "127.0.0.1:8500"
  path    = "vault"
}
The second option is to set a similar maximum at the listener level. Vault allows for multiple TCP listeners to be configured. To gain some granularity on the request restriction, you can set max_request_duration within the scope of the listener stanza. The default value is also 90 seconds (90s).
Here is an example minimal Vault configuration that shows the use of an explicit max_request_duration setting in the TCP listener.
api_addr = "https://127.0.0.1:8200"
listener "tcp" {
  address              = "127.0.0.1:8200"
  tls_cert_file        = "/etc/pki/vault-server.crt"
  tls_key_file         = "/etc/pki/vault-server.key"
  max_request_duration = "15s"
}
storage "consul" {
  address = "127.0.0.1:8500"
  path    = "vault"
}
Note
When you set max_request_duration in the TCP listener stanza, the value overrides that of default_max_request_duration.
Maximum request size
Vault enables control of the global hard maximum allowed request size in bytes on a listener through the max_request_size parameter.
The default value is 33554432 bytes (32 MB).
Specifying a number less than or equal to 0 turns off request size limiting altogether.
HTTP timeouts
Each Vault TCP listener can define four HTTP timeouts, which directly map to underlying Go http server parameters as defined in Package http.
http_idle_timeout
The http_idle_timeout parameter is used to configure the maximum amount of time to wait for the next request when keep-alives are enabled. If the value of this parameter is 0, the value of http_read_timeout is used. If both have a 0 value, there is no timeout.
Default value: 5m (5 minutes)
http_read_header_timeout
The http_read_header_timeout parameter is used to configure the amount of time allowed to read request headers. If the value of http_read_header_timeout is 0, the value of http_read_timeout is used. If both are 0, there is no timeout.
Default value: 10s (10 seconds)
http_read_timeout
The http_read_timeout parameter is used to configure the maximum duration for reading the entire HTTP request, including the body.
Default value: 30s (30 seconds)
http_write_timeout
The http_write_timeout parameter is used to configure the maximum duration before timing out writes of the response.
Default value: 0 (zero)
Lease expiration and TTL values
Vault maintains leases for all dynamic secrets and service type authentication tokens.
These leases represent a commitment to do future work in the form of revocation, which involves connecting to external hosts to revoke the credential there as well. In addition, Vault has internal housekeeping to perform in the form of deleting (potentially recursively) expired tokens and leases.
It is important to keep the growth of leases in a production Vault cluster in check. Unbounded lease growth can eventually cause serious issues with the underlying storage backend, and eventually with Vault itself.
By default, Vault will use a time-to-live (TTL) value of 32 days on all leases. You need to be aware of this when defining use cases and try to select the shortest possible TTL value that your use case can tolerate.
Note
If you deploy Vault use cases without specifying explicit TTL and maximum TTL values, you run the risk of generating excessive leases as the long default lifetime allows them to rapidly accumulate, especially when doing bulk or load generation and testing. This is a common pitfall with new Vault users. Review Token Time-To-Live, Periodic Tokens, and Explicit Max TTLs to learn more.
Short TTLs are good
Good for security
- A leaked token with a short lease is likely already expired.
- A failed or destroyed service instance whose token is not revoked immediately is not a big deal if it will expire shortly.
Good for performance
Short TTLs have a load smoothing effect. It is better to have a lot of small writes spaced out over time, than having a big backlog of expirations all at once.
What to look for?
With respect to usage and saturation, you can identify issues by monitoring the vault.expire.num_leases metric, which represents the number of all leases which are eligible for eventual expiry.
You can also monitor storage capacity for signs of lease saturation. Specifically you can examine the paths in storage which hold leases. Review the Inspecting Data in Consul Storage or Inspect Data in Integrated Storage tutorials to learn more about the paths where you can expect to find lease data.
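If your telemetry is exposed in Prometheus text format, the lease count can be pulled out of a metrics scrape. This is a minimal sketch under stated assumptions: the metric is assumed to appear as vault_expire_num_leases (the dotted name with underscores), the function name num_leases is hypothetical, and where the scrape comes from depends on your environment.

```shell
# num_leases
# Hypothetical helper: reads Prometheus text-format metrics on stdin and
# prints the sample value of the assumed vault_expire_num_leases metric.
num_leases() {
  # Match the bare metric name or the name followed by a label set,
  # and skip "# HELP" / "# TYPE" comment lines (their first field is "#").
  awk '$1 == "vault_expire_num_leases" ||
       index($1, "vault_expire_num_leases{") == 1 { print $NF }'
}
```

Sampling this value at regular intervals and comparing the results gives a simple view of whether leases are accumulating faster than they expire.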
Namespaces
Note
Namespaces are a Vault Enterprise Platform feature.
The hierarchy of namespaces is purely logical and internal routing is handled only at one level. As a result, there are not any performance considerations or general limitations for the use of namespaces themselves whether implemented as flat hierarchies or in a deeply nested configuration.
Performance Standbys
Note
Performance Standbys are a feature of Vault Enterprise with the Multi-Datacenter & Scale Module.
Vault Enterprise offers additional features that allow High Availability servers to service requests that do not modify Vault's storage (read-only requests) on the local standby node versus forwarding them to the active node. Such standby servers are known as Performance Standbys, and are enabled by default in Vault Enterprise. Read the Performance Standby Nodes tutorial to learn more.
While there are currently no tunable parameters available for performance standby functionality, some use cases can require that they be entirely disabled. If necessary, you can disable the use of performance standbys with the disable_performance_standby configuration parameter.
Replication
Vault Enterprise replication uses a component called the log shipper to track recently written updates to Vault storage and stream them to replication secondaries.
Vault version 1.7 introduced new performance related configuration for Enterprise Replication functionality.
If you are a Vault Enterprise user with version 1.7 or higher, use the information in this section to understand and adjust the replication performance configuration for your use case and workload.
Tuning the replication configuration is most useful when replicating large numbers (thousands to tens of thousands) of items such as enterprise namespaces, particularly if the namespaces are frequently created and deleted.
You can tune both the length and size of the log shipper buffer to make the most use of available system resources, while also preventing unbounded buffer growth.
The configuration is contained within a replication stanza that should be located in the global configuration scope. Here is an example configuration snippet containing all available options for the replication stanza.
replication {
resolver_discover_servers = true
logshipper_buffer_length = 1000
logshipper_buffer_size = "5gb"
}
Detailed information about each configuration option follows.
- resolver_discover_servers controls whether the log shipper's resolver should discover other Vault servers; the option accepts a boolean value, and the default value is true.
- logshipper_buffer_length sets the maximum number of entries that the log shipper buffer holds as an integer value; the default value is zero (0). In the example configuration, the value is set to 1000 entries.
- logshipper_buffer_size sets the maximum size that the log shipper buffer can grow to, expressed as an integer indicating the number of bytes or as a capacity string. Valid capacity strings are kb, kib, mb, mib, gb, gib, tb, tib; there is no default value. In the example configuration, the value is set to 5 gigabytes.
If you do not explicitly define values for logshipper_buffer_length or logshipper_buffer_size, then Vault calculates default values based on available memory.
On startup, Vault attempts to read the amount of host memory. If it is successful, it allocates 10% of the available memory to the log shipper. For example, if your Vault server has 16GB of memory, the log shipper will have access to 1.6GB.
If Vault fails to read the host memory, a default value of 1GB is used for logshipper_buffer_size.
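The 10% rule can be worked through for a given host. This is a sketch that assumes a Linux /proc/meminfo style file and assumes that the MemTotal figure is the basis of the calculation, which the description above does not specify; the function name is hypothetical.

```shell
# logshipper_default_bytes MEMINFO_FILE
# Hypothetical helper: estimates 10% of host memory in bytes from a
# meminfo-style file. Using MemTotal is an assumption, not confirmed
# Vault behavior.
logshipper_default_bytes() {
  meminfo_file="$1"
  # MemTotal is reported in kB, where 1 kB is 1024 bytes.
  total_kb=$(awk '/^MemTotal:/ {print $2}' "$meminfo_file")
  echo $(( total_kb * 1024 / 10 ))
}
```

For a host that reports MemTotal as 16777216 kB, logshipper_default_bytes /proc/meminfo prints 1717986918, which corresponds to the 1.6GB figure in the example above.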
Tip
Refer to Vault Limits and Maximums to learn more about specific limits and maximum sizes for Vault resources.
What to look for?
Observe memory utilization for the Vault processes; if you replicate many enterprise namespaces, and memory is not successfully released upon deletion of namespaces, you should investigate.
You can then decide whether to implement changes to the replication configuration that match your available server memory resources and namespace usage based on your investigation of current memory usage behavior.
How to improve performance?
You must first ensure that your Vault servers meet the requirements outlined in the Reference Architecture. Tuning these configuration values requires that the underlying memory resources are present on each server in the Vault cluster.
If you intend to increase memory resources in your Vault servers, you can then increase the logshipper_buffer_size value accordingly.
You can adjust the logshipper_buffer_length value to handle anticipated increases in namespace usage. For example, if your deployment currently uses several hundred namespaces, but your plans are to soon expand to 3000 namespaces, then you should increase logshipper_buffer_length to meet this increase.
Heads up
Please keep in mind that the practical limit for enterprise namespaces in a single cluster is dependent on the storage type in use. Current limits are explained in the Namespace limits section of the Vault Limits and Maximums documentation.
PKI certificates & Certificate Revocation Lists
Users of the PKI Secrets Engine should be aware of the performance considerations and best practices specific to this secrets engine.
One thing to consider if you are aiming for maximum performance with this secrets engine: you will be bound by available entropy on the Vault server and the high CPU requirements for computing key pairs if your use case has Vault issuing the certificate and private key instead of signing Certificate Signing Requests (CSR).
This can easily cause fairly linear scaling. There are some ways to avoid this, but the most general-purpose way is to have clients generate CSRs and submit them to Vault for signing instead of having Vault return a certificate/key pair.
Two of the most common performance pitfalls users encounter with the PKI secrets engine are interrelated, and can result in severe performance issues up to and including outage in the most extreme cases.
The first problem is in choosing unrealistically long certificate lifetimes.
Vault champions a philosophy of keeping all secret lifetimes as short as practically possible. While this is fantastic for security posture, it can add a bit of challenge to selecting the ideal certificate expiration values.
It is still critical that you reason about each use case thoroughly and work out the ideal shortest lifetimes for your Vault secrets, including PKI certificates generated by Vault. Review the PKI secrets engine documentation, especially the section Keep certificate lifetimes short, for CRL's sake to learn more.
Tip
If your certificate lifetimes are somewhat longer than required, it is critical that you ensure that applications reuse the certificates they get from Vault until they near expiry, rather than frequently requesting new ones. Long lived certificates that are generated frequently will cause rapid CRL growth.
The second issue is driven by the first, in that creation of numerous certificates with long lifetimes will cause rapid growth of the Certificate Revocation List (CRL). Internally this list is represented as one key in the key/value store. If your Vault servers use the Consul storage backend, it ships with a default maximum value size of 512KB, and the CRL can easily saturate this value in time with enough improper usage and frequent requesting of long lived certificates.
What are common errors?
When the PKI secrets engine CRL has grown to be larger than allowed by the default Consul key value maximum size, you can expect to encounter errors about lease revocation in the Vault operational log that resemble this example:
[ERROR] expiration: failed to revoke lease: lease_id=pki/issue/prod/7XXYS4FkmFq8PO05En6rvm6m error="failed to revoke entry: resp: (*logical.Response)(nil) err: error encountered during CRL building: error storing CRL: Failed request: Request body too large, max size: 524288 bytes"
If you are trying to gain increased performance with the PKI secrets engine and do not require a CRL, you should define your roles to use the no_store parameter.
Note
Certificates generated from roles that define the no_store parameter cannot be enumerated or revoked by Vault.
ACLs in policies
If your goal is to optimize Vault performance as much as possible, you should analyze your ACLs and policy paths with an aim to minimize the complexity of paths that use templating and special operators.
How to improve performance?
- Try to minimize use of templating in policy paths when possible.
- Try to minimize use of the + and * path segment designators in your policy path syntax.
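One way to start this analysis is to count the path declarations in a policy file that use these features. This is a rough sketch: the function name and the regular expression are assumptions, and it only matches paths declared on a single line in the form path "...".

```shell
# count_complex_paths POLICY_FILE
# Hypothetical helper: prints the number of path declarations whose
# quoted path contains "+", "*", or a "{{" template opener.
count_complex_paths() {
  # "|| true" keeps a zero count from being treated as a failure.
  grep -cE '^[[:space:]]*path[[:space:]]+"[^"]*([+*]|[{][{])' "$1" || true
}
```

Running it against each policy file gives a quick ranking of which policies rely most on templating and wildcard segments, and are therefore the first candidates for simplification.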
Policy Evaluation
Vault Enterprise users can have Access Control List (ACL) policies, Endpoint Governing Policies (EGP), and Role Governing Policies (RGP) in use.
For your reference, here is a diagram and description of the Vault policy evaluation process for ACL, EGP, and RGP.
If the request was an unauthenticated request (e.g. "vault login"), there is no token; therefore, Vault evaluates EGPs associated with the request endpoint.
If the request has a token, the ACL policies attached to the token get evaluated. If the token has an appropriate capability to operate on the path, RGPs will be evaluated next.
Finally, EGPs set on the request endpoint will be evaluated.
If at any point, the policy evaluation fails, then the request will be denied.
Sentinel policies
Enterprise users of Vault Sentinel policies should be aware that these policies are generally more computationally intensive by nature.
What are the performance implications of Sentinel policies?
- Generally, the more complex a policy and the more that it pertains to a specific request, the more expensive it will be.
- Templated policy paths also add cost to the policy.
- A larger number of Sentinel policies that apply to specific requests will have more performance impact than a similar number of policies which are not as specific about the request.
The new HTTP import introduced in Vault version 1.5 provides a flexible means of policy workflow to leverage external HTTP endpoints. If you use this module, you should be aware that in addition to the internal latency involved in processing the logic for the Sentinel policy, there is now an external latency and these two must be combined to properly reason about the overall performance.
Tokens
Tokens are required for all authenticated Vault requests, which comprise the majority of endpoints.
They typically have a finite lifetime in the form of a lease or time-to-live (TTL) value.
The common interactions for tokens involve login requests and revocation. Those interactions with Vault result in the following operations.
Interaction | Vault operations |
---|---|
Login request | Write new token to the Token Store; write new lease to the Lease Store |
Revoke token (or token expiration) | Delete token; delete token lease; delete all child tokens and leases |
Batch tokens are encrypted blobs that carry enough information for them to be used for Vault actions but, unlike service tokens, require no storage on disk.
There are some trade-offs to be aware of when using batch tokens and you should use them with care.
Less secure than service tokens
- Batch tokens cannot be revoked or renewed.
- The TTL value must be set in advance, and is often set higher than ideal as a result.
Better performing
- Batch tokens are amazingly inexpensive to use since they do not touch the disk.
- They are often an acceptable trade-off when the alternative is unmanageable login request rates.
Seal Wrap
Note
Seal Wrap is a feature of Vault Enterprise with Governance & Policy Module.
When integrating Vault Enterprise with an HSM, seal wrapping is always enabled with a supported seal. This includes the recovery key, any stored key shares, the root key (previously known as master key), the keyring, and more; essentially, any critical security parameter (CSP) within the Vault core.
Anything that is seal-wrapped is going to be considerably slower to read and write since the requests will leverage the HSM encryption and decryption. In general, communicating to the HSM adds latency that you will need to factor into overall performance.
This applies even to cached items since Vault caches the encrypted data; therefore, even if the read from storage is free, the request still needs to talk to the seal to use the data.
Storage backend tuning
Vault request latency is primarily limited by the configured storage backend, and storage writes are much more expensive than reads.
The majority of Vault write operations relate to these events:
- Logins and token creation
- Dynamic secret creation
- Renewals
- Revocations
There are a number of similar tunable parameters for the supported storage backends. This tutorial currently covers only the parameters for Consul and Integrated Storage (Raft) storage backends.
There are some operational characteristics and trade-offs around how the different storage engines handle memory, persistence, and networking that you should familiarize yourself with.
Consul storage backend characteristics:
Storage backend | Notes |
---|---|
Consul | The Consul storage backend currently has better disk write performance than the Integrated Storage backend. |
Pros | Working set is contained in memory, so it is highly performant. |
Cons | Operationally complex; harder to debug and troubleshoot; network hop involved, theoretically higher network latency; more frequent snapshotting needed, which results in a performance impact; memory bound, with a higher probability of out-of-memory conditions |
Integrated Storage backend (Raft) characteristics:
Storage backend | Notes |
---|---|
Raft | The Integrated Storage backend (Raft) currently has better network performance than the Consul storage backend. |
Pros | Operationally simpler; less frequent snapshotting since data is persisted to disk; no network hop (the trade-off is an additional fsync() writing to BoltDB in the finite state machine) |
Cons | Data persisted to disk, so theoretically somewhat less performant; write performance currently slightly lower than with Consul |
With this information in mind, review details on specific tunable parameters for the storage backend that you are most interested in.
Consul
When using Consul for the storage backend, most of the disk I/O work will be done by the Consul servers and Vault itself is expected to have lower disk I/O usage. Consul keeps its working set in memory, and as a general rule of thumb, the Consul server should have physical memory equal to approximately 3x the working data set size of the key/value store containing Vault data. Sustaining good Input/Output Operations Per Second (IOPS) performance for the Consul storage is of utmost importance. Review the Consul reference architecture and Consul deployment guide for more details.
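As a minimal sketch of the 3x rule of thumb above, the following function compares the physical memory reported by Linux against three times a working set size that you supply. The function name `check_consul_memory` is hypothetical, the working set size is something you must measure yourself, and the check assumes a Linux host that exposes `/proc/meminfo`.

```shell
#!/usr/bin/env bash
# Compare physical memory against the "3x the working set" rule of thumb.
# Usage: check_consul_memory WORKING_SET_MB [MEMINFO_FILE]
# MEMINFO_FILE defaults to /proc/meminfo (Linux only).
check_consul_memory() {
  local working_set_mb="$1" meminfo="${2:-/proc/meminfo}"
  local required_mb total_mb
  required_mb=$(( working_set_mb * 3 ))
  # MemTotal is reported in kB; convert it to MB.
  total_mb=$(awk '/^MemTotal:/ { print int($2 / 1024) }' "$meminfo")
  if [ "$total_mb" -ge "$required_mb" ]; then
    echo "ok: ${total_mb} MB available, ${required_mb} MB recommended"
  else
    echo "low: ${total_mb} MB available, ${required_mb} MB recommended"
    return 1
  fi
}
```

For example, running `check_consul_memory 2048` on a Consul server checks whether it has at least 6144 MB of physical memory for a 2048 MB working set.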
What are common errors?
If you observe extreme performance degradation in Vault while using Consul as a storage backend, a first look at Consul server memory usage and errors is helpful. For example, check the Consul server operating system kernel ring buffer or syslog for signs of out of memory (OOM) conditions.
$ grep 'Out of memory' /var/log/messages
If there are results, they will resemble this example.
kernel: [16909.873984] Out of memory: Kill process 10742 (consul) score 422 or sacrifice child
kernel: [16909.874486] Killed process 10742 (consul) total-vm:242812kB, anon-rss:142081kB, file-rss:68768kB
Another common cause of issues is reduced IOPS on the Consul servers. This condition can manifest itself in Vault as errors related to canceled context, such as the following examples.
[ERROR] core: failed to create token: error="failed to persist entry: context canceled"
[ERROR] core: failed to register token lease: request_path=auth/approle/login error="failed to persist lease entry: context canceled"
[ERROR] core: failed to create token: error="failed to persist accessor index entry: context canceled"
The key clue here is the "context canceled" message. This issue will cause intermittent Vault availability to all users, and you should attempt to remedy the issue by increasing the available IOPS for the Consul servers.
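To gauge how often this happens, one option is to count the matching errors in the Vault operational log. The sketch below assumes the log has been written to a file whose path you pass in; the function name `count_context_canceled` is hypothetical, and the pattern is based only on the example messages above.

```shell
#!/usr/bin/env bash
# Count storage persistence failures caused by canceled contexts.
# Usage: count_context_canceled VAULT_LOG_FILE
count_context_canceled() {
  # grep -c prints 0 but exits non-zero when nothing matches,
  # so its exit status is deliberately ignored.
  grep -c 'failed to persist.*context canceled' "$1" || true
}
```

A count that climbs during periods of heavy login or lease activity points toward insufficient IOPS on the Consul servers rather than a problem in Vault itself.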
The following are some important performance related configuration settings that you should become aware of when using Consul for the Vault storage backend.
kv_max_value_size
One common performance constraint that can be encountered when using Consul for the Vault storage backend is the size of data Vault can write as a value to one key in the Consul key/value store.
As of Consul version 1.7.2 you can explicitly specify this value in bytes with the configuration parameter kv_max_value_size.
Default value: 512KB
Here is an example Consul server configuration snippet that increases this value to 1,024,000 bytes (approximately 1 MB).
"limits": {
"kv_max_value_size": 1024000
}
What are common errors?
The following error will be returned to a client that attempts to exceed the maximum value size.
Error writing data to kv/data/foo: Error making API request.
URL: PUT http://127.0.0.1:8200/v1/kv/data/foo
Code: 413. Errors:
* failed to parse JSON input: http: request body too large
Note that tuning this improperly can cause Consul to fail in unexpected ways. It can affect leadership stability and prevent timely heartbeat signals by increasing RPC IO duration.
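Before raising the limit, it can help to check how large a candidate secret payload actually is. The following sketch compares a file's size with the 512KB default (512 * 1024 bytes). The function name `fits_kv_limit` is hypothetical, and because Vault encrypts data before writing it to Consul, the stored value is somewhat larger than the raw payload, so treat this as an approximate pre-flight check only.

```shell
#!/usr/bin/env bash
# Default Consul limit: 512KB, expressed in bytes.
DEFAULT_KV_MAX_VALUE_SIZE=524288

# Check a payload file against a value size limit.
# Usage: fits_kv_limit FILE [LIMIT_BYTES]
fits_kv_limit() {
  local file="$1" limit="${2:-$DEFAULT_KV_MAX_VALUE_SIZE}"
  local size
  # tr strips the padding that some wc implementations add.
  size=$(wc -c < "$file" | tr -d ' ')
  if [ "$size" -le "$limit" ]; then
    echo "fits: ${size} bytes <= ${limit} bytes"
  else
    echo "too large: ${size} bytes > ${limit} bytes"
    return 1
  fi
}
```

For example, `fits_kv_limit payload.json 1024000` checks a payload against the increased limit from the configuration snippet above.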
txn_max_req_len
This parameter configures the maximum number of bytes for a transaction request body to the Consul /v1/txn endpoint. In situations where both this parameter and kv_max_value_size are set, the higher value takes precedence for both settings.
Note that tuning this improperly can cause Consul to fail in unexpected ways. It can affect leadership stability and prevent timely heartbeat signals by increasing RPC IO duration.
max_parallel
Another parameter that can sometimes benefit from tuning depending on the specific environment and configuration is the max_parallel parameter, which specifies the maximum number of parallel requests Vault can make to Consul.
The default value is 128.
This value is not typically increased to increase performance, rather it is most often called upon to reduce the load on an overwhelmed Consul cluster by dialing down the default value.
consistency_mode
Vault supports using 2 of the 3 Consul Consistency Modes. By default it uses the default mode, which is described as follows in the Consul documentation:
If not specified, the default is strongly consistent in almost all cases. However, there is a small window in which a new leader may be elected during which the old leader may service stale values. The trade-off is fast reads but potentially stale values. The condition resulting in stale reads is hard to trigger, and most clients should not need to worry about this case. Also, note that this race condition only applies to reads, not writes.
The default mode is suitable for the majority of use cases. Setting the mode to strong in Vault maps to the consistent mode in Consul. That mode comes with additional performance implications, and most use cases should not need it unless they absolutely cannot tolerate a stale read. The Consul documentation states the following about consistent mode:
This mode is strongly consistent without caveats. It requires that a leader verify with a quorum of peers that it is still the leader. This introduces an additional round-trip to all servers. The trade-off is increased latency due to an extra round trip. Most clients should not use this unless they cannot tolerate a stale read.
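As a sketch of how the two Vault-side settings above, max_parallel and consistency_mode, might look together, the following writes an example Consul storage stanza to a scratch file for review. The file location, Consul address, and path are placeholder assumptions, and the max_parallel value of 64 is only an illustration of dialing the default down.

```shell
#!/usr/bin/env bash
# Write an example Vault storage stanza to a scratch file for review.
# The file location, address, and path are placeholder assumptions.
CONSUL_EXAMPLE_FILE="${CONSUL_EXAMPLE_FILE:-${TMPDIR:-/tmp}/vault-consul-storage-example.hcl}"

cat > "$CONSUL_EXAMPLE_FILE" <<'EOF'
storage "consul" {
  address = "127.0.0.1:8500"
  path    = "vault/"

  # Default is 128; lower it to reduce load on an overwhelmed Consul cluster.
  max_parallel = "64"

  # "default" tolerates a rare stale read; "strong" maps to Consul's
  # consistent mode and adds an extra round trip.
  consistency_mode = "default"
}
EOF

echo "wrote $CONSUL_EXAMPLE_FILE"
```

Merge any setting you adopt into your existing Vault server configuration rather than using the scratch file directly, and restart Vault for the change to take effect.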
Integrated Storage (Raft)
Vault version 1.4.0 introduced a new Integrated Storage capability that uses the Raft Storage Backend. This storage backend is quite similar to Consul key/value storage in its behavior and feature-set. It replicates Vault data to all servers using the Raft consensus algorithm.
If you have not already, review Preflight Checklist - Migrating to Integrated Storage for additional information about Integrated Storage.
The following are tunable configuration items for this storage backend.
mlock()
Disabling mlock() is strongly recommended if using Integrated Storage, because mlock() does not interact well with memory-mapped files such as those created by BoltDB, which is used by Raft to track state.
When using mlock(), memory-mapped files get loaded into resident memory, which causes the complete Vault dataset to be loaded into memory. This can result in out-of-memory conditions if Vault data becomes larger than the available physical memory.
Recommendation
Although the Vault data within BoltDB remains encrypted at rest, it is strongly recommended that you use the instructions for your OS and distribution to ensure that swap is disabled on your Vault servers which use Integrated Storage to prevent other sensitive Vault in-memory data from being written to disk.
What are common errors?
If you are operating a Vault cluster with an Integrated Storage backend and have not disabled mlock() for the vault binary (and potentially any external plugins), then you would expect to encounter errors like this example when the Vault data exceeds the available memory.
kernel: [12209.426991] Out of memory: Kill process 23847 (vault) score 444 or sacrifice child
kernel: [12209.427473] Killed process 23847 (vault) total-vm:1897491kB, anon-rss:948745kB, file-rss:474372kB
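Because the recommendation above depends on swap being disabled, a quick check on each Vault server can be useful. The following sketch reports whether any swap area is active by reading `/proc/swaps`, which is Linux specific. The function name `swap_enabled` is hypothetical.

```shell
#!/usr/bin/env bash
# Succeed (exit 0) when at least one swap area is active.
# Usage: swap_enabled [SWAPS_FILE]   (defaults to /proc/swaps on Linux)
swap_enabled() {
  local swaps="${1:-/proc/swaps}"
  # The first line is a header; any further line is an active swap area.
  awk 'NR > 1 { n++ } END { exit !(n > 0) }' "$swaps"
}
```

A typical use is `if swap_enabled; then echo "swap is active"; fi`, followed by the swap-disabling steps for your OS and distribution when the message appears.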
performance_multiplier
If you have experience configuring and tuning Consul, you could already be familiar with its performance_multiplier configuration parameter, and Vault uses it in the same way in the context of the Integrated Storage backend to scale key Raft algorithm timing parameters.
The default value is 0.
Tuning this affects the time it takes Vault to detect leader failures and to perform leader elections. Lower values give better performance at the expense of requiring more network and CPU resources.
By default, Vault uses a lower-performance timing that is suitable for Vault servers with modest resources, currently equivalent to setting this to a value of 5 (this default may change in future versions of Vault if the target minimum server profile changes). Setting this to a value of 1 configures Raft to its highest-performance mode and is recommended for production Vault servers. The maximum allowed value is 10.
snapshot_threshold
Tip
This is a low-level parameter that should rarely need tuning.
Again, the snapshot_threshold parameter is similar to one you may have experience with in Consul deployments. If you are not familiar with Consul, Raft commit data is snapshotted automatically, and the snapshot_threshold parameter controls the minimum number of Raft commit entries between snapshots that are saved to disk.
The documentation further states the following about adjusting this value:
Very busy clusters experiencing excessive disk IO may increase this value to reduce disk IO and minimize the chances of all servers taking snapshots at the same time. Increasing this trades off disk IO for disk space since the log will grow much larger and the space in the raft.db file can't be reclaimed till the next snapshot. Servers may take longer to recover from crashes or failover if this is increased significantly as more logs will need to be replayed.
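To tie the Integrated Storage settings together, the following sketch writes an example configuration to a scratch file for review. The file location, data path, and node ID are placeholder assumptions, and the commented-out snapshot_threshold value is an arbitrary illustration rather than a recommendation.

```shell
#!/usr/bin/env bash
# Write an example Integrated Storage configuration to a scratch file.
# The file location, path, and node_id are placeholder assumptions.
RAFT_EXAMPLE_FILE="${RAFT_EXAMPLE_FILE:-${TMPDIR:-/tmp}/vault-raft-storage-example.hcl}"

cat > "$RAFT_EXAMPLE_FILE" <<'EOF'
storage "raft" {
  path    = "/opt/vault/data"
  node_id = "vault-node-1"

  # 1 selects the highest-performance Raft timing; the default of 0
  # currently behaves like 5.
  performance_multiplier = 1

  # Low-level setting that rarely needs tuning. Leave it unset unless
  # snapshot disk IO is a measured problem; the value is illustrative only.
  # snapshot_threshold = 16384
}

# Recommended with Integrated Storage; make sure swap is disabled as well.
disable_mlock = true
EOF

echo "wrote $RAFT_EXAMPLE_FILE"
```

Keeping snapshot_threshold commented out reflects the tip above: change it only after observing excessive disk IO on a very busy cluster.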
Resource limits & maximums
This section serves as a reference to some of the most common resource limitations and maximum values that you can encounter when tuning Vault for performance.
Maximum number of secrets engines
There is no specific limit for the number of enabled secrets engines.
Depending on the storage backend, with many thousands (potentially tens of thousands) of enabled secrets engines, you could hit a maximum value size limit.
Maximum value size with Consul storage
The default maximum value size for a key in Consul key/value storage is the Raft suggested maximum size of 512KB. As of Consul version 1.7.2 this limit can be changed with kv_max_value_size.
Maximum value size with Integrated Storage
Unlike the Consul storage backend, Integrated Storage does not currently impose a maximum key value size. This means you should be cautious when deploying use cases on Integrated Storage that have the potential to create unbounded growth in a value.
Integrated Storage is less reliant on memory and less subject to memory pressure than Consul because data is persisted to disk. Even so, overly large values for keys can have an adverse impact on network coordination, voting, and leadership election. Keep in mind that Vault Integrated Storage is not designed to perform as a general purpose key/value database, so values many times larger than the Consul default of 512KB could be problematic depending on the use case and environment.
Help and reference
- Reference Architecture
- Deployment Guide
- Production Hardening tutorial
- Utilization Saturation and Errors
- telemetry
- systemd process properties
- Vault Enterprise Namespaces
- Least Recently Used (LRU) cache
- dstat documentation
- Implementing Dstat with Performance Co-Pilot
- perf: Linux profiling with performance counters
- The Go Memory Model
- Package runtime
- Goroutines
- vault.expire.num_leases metric
- snapshot_threshold
- mlock(2)
- Keep certificate lifetimes short, for CRL's sake
- Policies