Protecting Vault with Resource Quotas
Vault is an API-driven system: all communication between clients and Vault goes through the Vault API.

As the number of client applications increases, rogue applications can degrade Vault performance. The issue is often caused by applications that are:
- Generating an unbounded number of leases or loading too much data into Vault, which exhausts the available memory of the Consul storage backend
- Consuming an excessive amount of bandwidth, which leads to internal throttling of the Vault cluster as a whole
Solution
Set Vault resource quotas to protect the stability of your Vault environment, and its network and storage resources, from runaway application behavior and distributed denial of service (DDoS) attacks.
Vault operators can control how applications request resources from Vault, and how much of Vault's storage and network infrastructure they consume, by setting the following:
Feature | Description | Vault OSS | Vault Enterprise |
---|---|---|---|
Rate Limit Quotas | Limit the maximum number of requests per second (RPS) to a system or mount to protect network bandwidth | ✔️ | ✔️ |
Lease Count Quotas | Cap the number of leases generated in a system or mount to protect system stability and storage performance at scale | | ✔️ |
Note
Lease count quotas require a Vault Enterprise Standard license.
To set rate limit quotas and lease count quotas, use the sys/quotas/<type> endpoint. Each resource quota has a name that identifies the quota rule. To manage an individual quota rule, the endpoint becomes sys/quotas/<type>/<name>.
The <type> can be:
- rate-limit: rate limit quota
- lease-count: lease count quota
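As a rough illustration of the endpoint layout described above, the following Python sketch assembles the path for a quota rule. The helper name quota_endpoint is hypothetical and not part of any Vault client library; it only concatenates strings and rejects unknown quota types.

```python
# Hypothetical helper (not part of any Vault SDK): builds the sys/quotas path
# for a quota type and an optional rule name.
VALID_QUOTA_TYPES = ("rate-limit", "lease-count")


def quota_endpoint(quota_type: str, name: str = "") -> str:
    """Return sys/quotas/<type>, or sys/quotas/<type>/<name> when a name is given."""
    if quota_type not in VALID_QUOTA_TYPES:
        raise ValueError(f"unknown quota type: {quota_type!r}")
    base = f"sys/quotas/{quota_type}"
    return f"{base}/{name}" if name else base


print(quota_endpoint("rate-limit"))
print(quota_endpoint("lease-count", "db-creds"))
```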
Note
Batch tokens do not count towards the lease count quota. Creation of new batch tokens will, however, be blocked if the lease count quota is exceeded.
Note
Feature enhancements were introduced to both rate limit quota and lease count quota in Vault 1.12.0. Refer to each section for more details.
Prerequisites
To perform the tasks described in this tutorial, you need Vault v1.5 or later. Refer to the Getting Started tutorial to install Vault. Make sure that your Vault server has been initialized and unsealed.
Lab setup
Open a terminal and start a Vault dev server with root as the root token.
The Vault dev server defaults to running at 127.0.0.1:8200. The server is also initialized and unsealed.
Insecure operation
Do not run a Vault dev server in production. This approach is only used here to simplify the unsealing process for this demonstration.
Export the VAULT_ADDR environment variable for the vault CLI to address the Vault server.
Export the VAULT_TOKEN environment variable for the vault CLI to authenticate with the Vault server.
The Vault server is ready.
Resource quota configuration
To configure the resource quota, use the sys/quotas/config endpoint.
Parameter | Type | Description |
---|---|---|
enable_rate_limit_audit_logging | boolean | Enable or disable audit logging when requests get rejected due to rate limit quota violations. By default, audit logging is disabled (false). |
By default, requests rejected due to rate limit quota violations are not written to the audit log. Therefore, if you wish to log the rejected requests for traceability, you must set enable_rate_limit_audit_logging to true.
Requests rejected due to reaching the lease count quotas are always logged, so you do not need to set any parameter.
Note
Enabling the rate limit audit logging may have an impact on the Vault performance if the volume of rejected requests is large.
Enable a file audit device which outputs to /var/log/vault-audit.log (or your desired file location).
Then enable audit logging for rate limit quotas by setting enable_rate_limit_audit_logging to true on the sys/quotas/config endpoint.
Read the quota configuration to verify.
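The configuration step can also be expressed against the HTTP API. The sketch below builds, but does not send, the request that turns on rate limit audit logging. It assumes the dev server address and the root token from the lab setup; the function name quota_config_request is hypothetical.

```python
import json
import os
import urllib.request

# Assumptions: the address and token default to the lab setup values.
VAULT_ADDR = os.environ.get("VAULT_ADDR", "http://127.0.0.1:8200")
VAULT_TOKEN = os.environ.get("VAULT_TOKEN", "root")


def quota_config_request(enable_audit: bool) -> urllib.request.Request:
    """Build (but do not send) a POST to the sys/quotas/config endpoint."""
    body = json.dumps({"enable_rate_limit_audit_logging": enable_audit}).encode()
    return urllib.request.Request(
        url=f"{VAULT_ADDR}/v1/sys/quotas/config",
        data=body,
        method="POST",
        headers={"X-Vault-Token": VAULT_TOKEN, "Content-Type": "application/json"},
    )


req = quota_config_request(True)
print(req.get_method(), req.full_url)
print(req.data.decode())
# To send it to a running server you would call urllib.request.urlopen(req).
```

Keeping the request construction separate from sending it makes the payload easy to inspect before it reaches a live cluster.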
Rate limit quotas
Rate limit quotas are designed to protect Vault against external distributed denial of service (DDoS) attacks and are fundamental to Vault's security model. Therefore, they are part of Vault's core feature set, available in both OSS and Enterprise.
To set rate limit quotas, use the sys/quotas/rate-limit/<name> endpoint.
Parameters
Parameter | Type | Description |
---|---|---|
name | string | Name of the quota rule |
path | string | Target path or namespace to apply the quota rule. A blank path configures a global rate limit quota |
rate | float | Rate for the number of allowed requests per second (RPS) |
role | string | Vault 1.12 or later: Login role to apply this quota to. When this parameter is set, the path must be configured to a valid auth method with a concept of roles. |
interval | second | The duration to enforce rate limiting for (default is 1 second) |
block_interval | string | If set, when a client reaches a rate limit threshold, the client will be prohibited from any further requests until after the 'block_interval' has elapsed. |
Note
If you are running Vault 1.12 or later, the path can be a fully qualified path, and it can end with * (e.g., auth/token/create*).
Create a rate limit quota named "global-rate" which limits inbound workload to 500 requests per second.
Read the global-rate rule to verify its configuration.
Note
In the absence of path, this quota rule applies at the global level instead of to a specific mount or namespace.
Create a rate limit quota named "transit-limit" which limits access to the Transit secrets engine to 1000 requests per minute (60 seconds).
First, enable the Transit secrets engine at transit.
Now, create the rate limit quota.
Read the transit-limit rule to verify its configuration.
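To make the two rules above concrete, here is a small sketch that models them as data and derives their effective requests per second. The RateLimitQuota class is hypothetical, and writing the interval as a string such as 60s is an assumption about the accepted duration format; the field names come from the parameter table.

```python
from dataclasses import dataclass


@dataclass
class RateLimitQuota:
    """Hypothetical model of a rate limit quota rule (not a Vault SDK type)."""
    name: str
    rate: float                 # allowed requests per interval
    path: str = ""              # a blank path means a global quota
    interval_seconds: int = 1   # the default interval is 1 second

    def payload(self) -> dict:
        """Body for sys/quotas/rate-limit/<name>, using the table's field names."""
        body = {"rate": self.rate, "interval": f"{self.interval_seconds}s"}
        if self.path:
            body["path"] = self.path
        return body

    def requests_per_second(self) -> float:
        """Average allowed rate, normalized to one second."""
        return self.rate / self.interval_seconds


global_rate = RateLimitQuota("global-rate", rate=500)
transit_limit = RateLimitQuota(
    "transit-limit", rate=1000, path="transit/", interval_seconds=60
)

print(global_rate.payload())
print(transit_limit.payload())
print(round(transit_limit.requests_per_second(), 2))
```

Note that 1000 requests per minute averages to roughly 16.67 requests per second, far below the 500 RPS global rule.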
Vault 1.12.0 or later
Previously, the path had to be the mount of an auth method or secrets engine. If you are running Vault 1.12 or later, you can set the path to be deeper than the mount point (in this example, transit/).
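The following sketch illustrates one way to reason about which requests a quota path covers, including a trailing *. It is a simplified model under stated assumptions: a trailing * is treated as a plain prefix match, and a path without * covers itself and everything beneath it. Vault's actual matching logic may differ, and the function name is hypothetical.

```python
def quota_path_matches(quota_path: str, request_path: str) -> bool:
    """Simplified model of quota path matching (an assumption, not Vault's code)."""
    if quota_path == "":
        return True  # a blank path models a global quota
    if quota_path.endswith("*"):
        # Assumed semantics: trailing * matches any path with this prefix.
        return request_path.startswith(quota_path[:-1])
    prefix = quota_path.rstrip("/")
    # Match the path itself or anything nested below it.
    return request_path == prefix or request_path.startswith(prefix + "/")


print(quota_path_matches("auth/token/create*", "auth/token/create-orphan"))
print(quota_path_matches("transit/", "transit/encrypt/orders"))
print(quota_path_matches("transit/encrypt/orders", "transit/encrypt/other"))
```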
Create a rate limit quota named "transit-order" to limit data encryption requests using the orders key to 500 per second.
First, create an encryption key named "orders".
Now, create the "transit-order" rate limit quota.
Verify the rate limit quota configuration.
Test
Be sure to visit the "Test to understand the resource quotas" section to see how resource quota enforcement works.
Vault Enterprise Example
Consider that you have the K/V v2 secrets engine enabled at kv-v2 under the us-west namespace. Create a rate limit quota named "orders" which limits the incoming requests against kv-v2 to a maximum of 20 requests per second.
Create a namespace, "us-west".
Enable the kv-v2 secrets engine in the us-west namespace.
Now, create the orders quota rule.
Verify the configuration.
Lease count quotas
Note
Lease count quotas are part of the Vault Enterprise Platform features.
Lease count quotas are designed to protect Vault from a large volume of leases and tokens persisted in Vault, which can put pressure on its storage backend. They act as a guard rail for system stability in large-scale Vault Enterprise deployments where Vault is used as a service.
Parameters
Parameter | Type | Description |
---|---|---|
name | string | Name of the quota rule |
path | string | Target path or namespace to apply the quota rule. A blank path configures a global lease count quota |
max_leases | int | Maximum number of leases allowed by the quota rule |
role | string | Vault 1.12 or later: Login role to apply this quota to. When this parameter is set, the path must be configured to a valid auth method with a concept of roles. |
Note
If you are running Vault 1.12 or later, the path can be a fully qualified path, and it can end with * (e.g., auth/token/create*).
To demonstrate this feature, enable the database secrets engine at "postgres" in the us-west namespace.
Create a lease count quota named "db-creds" which limits requests for new DB credentials to a maximum of 100 concurrent, valid leases.
Note
Similar to rate limit quotas, the quota rules apply globally in the absence of path. In the example, you created a quota rule against the us-west/postgres path. If a global quota rule exists on the root namespace, the quota rule defined on a specific path takes precedence.
Verify the configuration.
Vault 1.12.0 or later
Previously, you could only apply a quota to the path where the auth method was enabled. When you are running Vault 1.12 or later, you can specify the role if the target auth method has the concept of roles.
Create a role named "webapp" for the approle auth method.
First, enable the approle auth method.
Create the "webapp" role.
Create a lease count quota named "webapp-tokens" which limits the creation of tokens for the webapp role to a maximum of 100.
If you want to set it for the approle auth method enabled in the us-west namespace, the path should be us-west/auth/approle.
Verify the configuration.
Test
Be sure to visit the "Test to understand the resource quotas" section to see how resource quota enforcement works.
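As an illustration of how the db-creds and webapp-tokens rules in this section are parameterized, the sketch below assembles their request bodies for sys/quotas/lease-count/<name>. The helper lease_count_payload is hypothetical; it only shows how a namespace prefixes the mount path and where the role parameter fits.

```python
from typing import Optional


def lease_count_payload(max_leases: int, mount: str = "",
                        namespace: str = "", role: Optional[str] = None) -> dict:
    """Hypothetical helper: build the body for a lease count quota rule."""
    # A namespace prefixes the mount, e.g. us-west + postgres -> us-west/postgres.
    parts = [p.strip("/") for p in (namespace, mount) if p]
    body = {"max_leases": max_leases, "path": "/".join(parts)}
    if role is not None:
        # The role parameter only makes sense on an auth method mount.
        if not mount:
            raise ValueError("role requires a path to an auth method")
        body["role"] = role
    return body


db_creds = lease_count_payload(100, mount="postgres", namespace="us-west")
webapp_tokens = lease_count_payload(100, mount="auth/approle", role="webapp")
namespaced = lease_count_payload(
    100, mount="auth/approle", namespace="us-west", role="webapp"
)

print(db_creds)
print(webapp_tokens)
print(namespaced["path"])
```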
Test to understand the resource quotas
Now that you have learned the basic commands, test them to see how the quotas work.
Rate limit quota test
Enable transit secrets engine if it is not enabled.
Create a "test" encryption key.
For the purpose of demonstration, create a "test-transit" rate limit quota such that you can only make 1 request every 10 seconds.
Create a shortcut script, test-encryption.sh, which makes requests to encrypt data using the "test" key.
Ensure that the script is executable.
Run the script to see how the quota rule behaves.
The second and third requests failed because of the limit.
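The behavior above can be reproduced with a small simulation. This is a simplified fixed-window model with explicit timestamps, offered only to illustrate why the second and third requests are rejected; it is an assumption about the general technique, not a description of Vault's internal limiter. The class name is hypothetical.

```python
from typing import Optional


class FixedWindowLimiter:
    """Simplified fixed-window rate limiter (illustrative model only)."""

    def __init__(self, rate: int, interval: float) -> None:
        self.rate = rate                # requests allowed per window
        self.interval = interval        # window length in seconds
        self.window_start: Optional[float] = None
        self.count = 0

    def allow(self, now: float) -> bool:
        """Return True if a request arriving at time `now` is admitted."""
        if self.window_start is None or now - self.window_start >= self.interval:
            # Start a new window and reset the counter.
            self.window_start = now
            self.count = 0
        if self.count < self.rate:
            self.count += 1
            return True
        return False


# 1 request every 10 seconds, three requests one second apart.
limiter = FixedWindowLimiter(rate=1, interval=10)
results = [limiter.allow(t) for t in (0.0, 1.0, 2.0)]
print(results)  # [True, False, False]
after_window = limiter.allow(10.0)
print(after_window)  # True: a new window has started
```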
Earlier, you configured the resource quotas to enable audit logging of requests that were rejected due to a rate limit quota violation. Inspect your audit log for the corresponding entry.
If your audit log path is not /var/log/vault-audit.log, be sure to use the correct path.
You should find an error message indicating that the rate limit quota was exceeded. You can trace the audit log to see how many requests were rejected due to the rate limit quota. The quota may be working as expected, or you may find suspicious activity against a specific path.
If you are running Vault 1.12.0 or later, create another rate limit quota with the path set to transit/encrypt/test, where the allowed rate is 2 with an interval of 10 seconds.
Run the script again to see how the quota rule behaves.
Note
This time, only the last request failed. When a quota rule is set on a more granular path, that rule takes precedence for requests matching the path.
If you use another key or try to decrypt (transit/decrypt/test), the test-transit rate limit quota will be applied.
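The precedence rule can be sketched as "the most specific matching path wins". The model below picks the rule with the longest matching path. The rule name encrypt-test is hypothetical (the text does not name the second quota), and the longest-match selection is an assumption that is consistent with the behavior described, not Vault's actual implementation.

```python
from typing import Dict, Optional, Tuple

# name -> (path, rate, interval_seconds); "encrypt-test" is a hypothetical name.
RULES: Dict[str, Tuple[str, int, int]] = {
    "test-transit": ("transit", 1, 10),
    "encrypt-test": ("transit/encrypt/test", 2, 10),
}


def select_rule(request_path: str,
                rules: Dict[str, Tuple[str, int, int]] = RULES) -> Optional[str]:
    """Return the name of the rule with the longest path covering the request."""
    best_name, best_len = None, -1
    for name, (path, _rate, _interval) in rules.items():
        prefix = path.strip("/")
        matches = request_path == prefix or request_path.startswith(prefix + "/")
        if matches and len(prefix) > best_len:
            best_name, best_len = name, len(prefix)
    return best_name


print(select_rule("transit/encrypt/test"))   # encrypt-test
print(select_rule("transit/decrypt/test"))   # test-transit
print(select_rule("transit/encrypt/other"))  # test-transit
```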
Lease count quota test
Similarly, create a very limited lease count quota named "lease-test" which applies at the root level. It only allows 3 tokens and leases to be stored.
Create a shortcut script, lease-count-test.sh, which invokes the test path.
Make sure that the script is executable.
Run the script to see how the quota rule behaves.
Your output should show three tokens created successfully and a fourth request that failed due to the lease count quota.
Note
If your Vault server already has client tokens, the lease count quota may be exceeded sooner.
Also, you can find the lease count quota exceeded error in the audit log. (Be sure to use the correct audit log path for your environment.)
If you revoke one of the tokens, you should be able to request a new one. Revoke a token first.
Now, request a new one.
The best practice is to keep the time-to-live (TTL) of tokens and leases short so that they do not persist longer than necessary. Lease count quotas let you set an upper limit that protects your Vault environment from issues caused by a lack of token and lease governance.
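The lease count test can be modeled in a few lines. The sketch below is an illustrative in-memory counter, not Vault's implementation: it admits leases until the cap is reached, rejects the next request with the lease count quota exceeded error, and admits a new lease once one is revoked. The class and lease IDs are hypothetical.

```python
class LeaseCountQuota:
    """Illustrative model of a lease count quota (not Vault's implementation)."""

    def __init__(self, max_leases: int) -> None:
        self.max_leases = max_leases
        self.active = set()
        self._next_id = 0

    def create(self) -> str:
        """Issue a new lease, or raise if the quota is already full."""
        if len(self.active) >= self.max_leases:
            raise RuntimeError("lease count quota exceeded")
        self._next_id += 1
        lease_id = f"lease-{self._next_id}"
        self.active.add(lease_id)
        return lease_id

    def revoke(self, lease_id: str) -> None:
        """Revoking a lease frees one slot under the quota."""
        self.active.discard(lease_id)


quota = LeaseCountQuota(max_leases=3)
issued = [quota.create() for _ in range(3)]
try:
    quota.create()
    fourth_failed = False
except RuntimeError as err:
    fourth_failed = True
    print(err)

quota.revoke(issued[0])
replacement = quota.create()
print(issued, replacement)
```

Short TTLs have the same effect as the explicit revoke in this model: expired leases free slots without operator action.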
Delete the lease-test quota rule.
Vault 1.12.0 or later
If you are running Vault 1.12.0 or later, you can set a more granular lease count quota.
To demonstrate, enable the approle auth method.
Create a role named "test-role".
Create a lease count quota to limit the maximum number of leases for the test-role to 2.
Retrieve the role ID of the test-role and store it in role_id.txt.
Generate a secret ID of the test-role and store it in secret_id.txt.
Create a test script.
Ensure that the script is executable.
Run the script to see how the quota rule behaves.
The last attempt failed. The test-role-limit lease count quota should not affect other AppRole roles.
Clean up
If you wish to clean up your environment after completing the tutorial, follow the steps in this section.
Unset the VAULT_TOKEN environment variable.
Unset the VAULT_ADDR environment variable.
Delete files created during the test.
If you are running Vault locally in -dev mode, you can stop the Vault dev server by pressing Ctrl+C where the server is running. Alternatively, terminate the Vault server process from another terminal.
Next steps
In this tutorial, you learned the basic commands to set resource quotas to protect your Vault environment. To leverage this feature, you need Vault 1.5 or later.
Rate limit quotas allow Vault operators to set inbound request rate limits, which can be set on the root level or a specific path. This is available in both Vault OSS and Vault Enterprise.
Lease count quotas require Vault Enterprise Platform and allow operators to set the maximum number of tokens and leases persisted at any given time. This can prevent Vault from exhausting resources on the storage backend.
You also learned that audit logging can be enabled to trace the number of requests that were rejected due to the rate limit quota.