Every single interaction with Vault, whether to put secrets into the key/value store or to generate new pairs of database credentials for the MySQL database, needs to go through Vault’s API.
One side effect of the API driven model is that applications and users can misbehave by overwhelming system resources through consistent and high volume API requests resulting in denial-of-service issues in some Vault nodes or even the entire Vault cluster. This risk is significantly increased when Vault API endpoints are exposed to thousands or millions of services deployed across a global infrastructure, especially in use cases of Vault where Vault is deployed as a service for internal developers.
Vault provides a feature, resource quotas, that allows Vault operators to specify limits on resources used in Vault. Specifically, Vault allows operators to create and configure API rate limits. Users of Vault Enterprise have a second option, alongside rate limits, lease-count quotas, which can limit the number of leases that can be in use at one time.
Vault allows operators to create rate limit quotas which enforce API rate
limiting using a token bucket algorithm.
A rate limit quota can be created at the root level or defined on a namespace,
mount, or full API path by specifying a
path when creating the quota. The rate
limiter is applied to each unique client IP address on a per-node basis (i.e. rate limit
quotas are not replicated). A client may invoke
rate requests at any given second,
after which they may invoke additional requests at
A rate limit quota defined at the root level (i.e. empty
path) is inherited by
all namespaces and mounts. It acts as a single rate limiter for the entire Vault
API. A rate limit quota defined on a namespace takes precedence over the global
rate limit quota, a rate limit quota defined for a mount takes precedence over
the global and namespace rate limit quotas, and a rate limit quota defined for
a specific path will take precedence over global, namespace, and mount quotas.
Additionally, quotas defined with a
role to limit login requests made using
that role on the specified auth mount will take precedence over all other quotas.
In other words, the most specific quota rule will be applied.
A rate limit can be created with an optional
block_interval, such that when set
to a non-zero value, any client that hits a rate limit threshold will be blocked
from all subsequent requests for a duration of
Vault also allows the inspection of the state of rate limiting in a Vault node through various metrics exposed and through enabling optional audit logging.
By default, the following paths are exempt from rate limiting. However, Vault
operators can override the set of paths that are exempt from all rate limit
resource quotas by updating the
rate_limit_exempt_paths configuration field.
Refer to Protecting Vault with Resource Quotas for a step-by-step tutorial.
Rate limit quotas can be managed over the HTTP API. Please see Rate Limit Quotas API for more details.