Vault as an identity broker for zero trust security
Authors: Dan Schneider and Ray Abaid
As a platform operator, Vault administrator, or Vault consumer, you use Vault to broker workload identities across disparate systems, allowing secure access without relying on static secrets. This pattern demonstrates Vault's role as an identity broker, verifying identities from one platform and granting dynamic, policy-enforced access to resources in another. By focusing on identity-first access rather than credential possession, you reduce secret sprawl, enhance zero trust security, and simplify hybrid/multi-cloud operations.
Target audience
This guide references the following roles:
- Platform operator: Someone who manages your organization's infrastructure and security platforms. This may include security, platform, and DevOps teams. This role focuses on configuring external systems and integrations, such as Kubernetes clusters, AWS IAM roles, and PostgreSQL databases, to enable Vault's identity brokering.
- Vault administrator: Someone who has access to configure Vault-specific components, such as authentication methods, secrets engines, and policies. This role is distinct from the platform operator and centers on Vault administration, potentially delegating tasks through namespaces in decentralized policy management scenarios.
- Vault consumer: Someone who deploys and manages workloads, such as application and DevOps engineers, whose applications require brokered access to cross-platform resources.
Prerequisites
Platform operator:
- Working knowledge of Kubernetes service accounts, AWS IAM roles and STS, PostgreSQL user management, and OpenID Connect (OIDC) or JSON Web Token (JWT) claims.
Vault administrator:
- Working knowledge of Vault authentication methods, secrets engines, and policy syntax.
Vault consumer:
- Working knowledge of Vault client integration in applications, AWS, and Kubernetes workloads.
Background and best practices
From secrets manager to identity broker
Organizations often view Vault primarily as a secrets management tool, yet this limited scope undersells Vault's most transformative capability. Vault excels as an identity broker, verifying identities from one platform and brokering access to another. This identity-first approach enables zero trust security by shifting away from legacy credential-focused workflows, where system access relies on possession of static secrets (for example, long-lived API keys or passwords stored in code or configuration files), toward continuous verification of the entity's true identity before granting short-lived, policy-bound access.
In legacy workflows, security depended on distributing long-lived secrets, which often led to sprawl, theft, or mishandling, and to inconsistent policy enforcement across silos. Vault inverts this model: it authenticates the identity first, authorizes it against central policies, and then issues leased dynamic secrets in a secure handoff. This never trust, always verify approach aligns with zero trust: assume breach, minimize blast radius with ephemeral credentials, and enforce least privilege across locations.
The identity brokering transaction model
Identity brokering in Vault is a structured, three-phase transaction that transforms a verified external identity, typically a non-human workload such as a Kubernetes pod or AWS EC2 instance, into short-lived, policy-bound access to resources in another system. This model emphasizes Vault's role in reducing secret sprawl by treating credentials as ephemeral byproducts of continuous identity verification, rather than static assets. It aligns with zero trust by ensuring access is granular, revocable, auditable, and context-aware.
The three phases are as follows.
Phase 1: Identity verification and token issuance (authentication)
Vault begins by authenticating the external identity using a dedicated authentication method, such as Kubernetes auth (with service account JSON Web Tokens (JWTs)) or AWS auth (with IAM role metadata). This may involve Vault making outbound calls to the external identity provider (for example, calling the Kubernetes API for TokenReview to validate the service account in real time, or calling AWS STS) to validate the presented claims. Upon successful verification, Vault issues a short-lived token tied directly to the identity. This token inherits Vault policies based on the authentication method's role configuration, which includes bound constraints (for example, specific Kubernetes service account names or AWS IAM Amazon Resource Names (ARNs)) to validate the identity's attributes. The token may also inherit additional policies from internal or external identity group membership.
This phase focuses on confirming who you are through Vault's authentication process, without granting any access to resources yet. The token acts as a secure, time-bound, and revocable proxy for the identity in subsequent phases.
Example: In brokering Kubernetes identities to AWS, a pod authenticates with its Kubernetes service account. Vault verifies the Kubernetes identity by calling the Kubernetes API's TokenReview endpoint to check real-time service account validity against bound constraints and issues a token, ensuring no static secrets are necessary.
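The following is a minimal sketch of how a Vault administrator might configure this phase with the Kubernetes auth method. The Kubernetes host, role name `webapp`, service account, namespace, policy name `aws-s3-read`, and TTL are illustrative values, not requirements of the pattern.

```shell
# Enable the Kubernetes auth method and point it at the cluster API
# that Vault calls for TokenReview validation (host value is illustrative).
vault auth enable kubernetes

vault write auth/kubernetes/config \
    kubernetes_host="https://kubernetes.default.svc:443"

# Bind a specific service account and namespace to a Vault policy and a
# short token TTL. All names and values here are examples.
vault write auth/kubernetes/role/webapp \
    bound_service_account_names=webapp \
    bound_service_account_namespaces=app \
    token_policies=aws-s3-read \
    token_ttl=15m
```

With this role in place, a token issued to the pod carries only the `aws-s3-read` policy and expires on its own if the workload never uses it.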
Phase 2: Identity-to-permission mapping (authorization)
When the workload uses the authenticated Vault token to make a request, Vault evaluates it against its attached access control list (ACL) policies, which are deny-by-default rules written in HashiCorp Configuration Language (HCL), to determine permissible actions within Vault. These path-based policies control access to specific endpoints, such as those for requesting dynamic credentials from secrets engines, and enforce least privilege.
Vault's identity subsystem governs this mapping, which binds the external identity to these policies during token creation. It bridges verification to authorized actions but remains Vault-internal, without yet interacting with the destination identity domain.
Example: A Kubernetes pod, leveraging its obtained Vault token, requests credentials from aws/creds/s3-read. Vault evaluates the token's attached ACL policies, and if they grant access to that path, Vault allows the request.
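A minimal sketch of the ACL policy behind this example, assuming the illustrative policy name `aws-s3-read` used earlier; the path matches the AWS secrets engine role in the example.

```shell
# Hypothetical deny-by-default policy: the only permitted action is
# reading dynamic credentials from the AWS secrets engine role "s3-read".
vault policy write aws-s3-read - <<EOF
path "aws/creds/s3-read" {
  capabilities = ["read"]
}
EOF
```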
Phase 2b: Advanced governance with Sentinel policies (optional, Vault Enterprise/HCP Vault Dedicated)
For enhanced, context-aware controls, Sentinel policies, available in Vault Enterprise and the HCP Vault Dedicated Plus tier, apply after ACL evaluation but before the request reaches the secrets engine backend. Sentinel uses policy-as-code (in the Sentinel language) to enforce dynamic rules based on request context, such as time, network origin, or parameters. Role Governing Policies (RGPs) tie to the identity for entity-specific checks, while Endpoint Governing Policies (EGPs) apply to paths for parameter validation. Sentinel is useful when you need additional logical context or conditions beyond static configurations in secrets engine roles.
This sub-phase adds logic-based restrictions without altering core ACLs, which is ideal for regulated environments.
Example: An endpoint governing policy (EGP) could deny requests to a restricted AWS secrets engine role for elevated permissions if the client's IP is not from a trusted CIDR range (for example, your on-premises Kubernetes cluster's network), providing dynamic network-based control to prevent unauthorized brokering from untrusted locations.
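A minimal sketch of such an EGP, assuming `10.0.0.0/8` stands in for the trusted on-premises network and `aws/creds/s3-admin` is a hypothetical elevated-permission role; the policy name and both values are illustrative.

```shell
# Illustrative Sentinel EGP: only allow requests to the protected path
# when the client address falls inside a trusted CIDR range.
cat > cidr-check.sentinel <<EOF
import "sockaddr"

main = rule {
    sockaddr.is_contained(request.connection.remote_addr, "10.0.0.0/8")
}
EOF

# Register the EGP against the hypothetical elevated AWS role path.
vault write sys/policies/egp/cidr-check \
    policy=@cidr-check.sentinel \
    paths="aws/creds/s3-admin" \
    enforcement_level="hard-mandatory"
```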
Phase 2c: Multi-party approval with control groups (optional, Vault Enterprise/HCP Vault Dedicated)
In scenarios requiring oversight for sensitive brokering requests (for example, high-privilege credentials), control groups, another Vault Enterprise feature, can enforce a quorum-based approval workflow. Control groups wrap the request, requiring approvals from designated identities (for example, security team members) before proceeding. While less common in fully automated non-human workflows, control groups may be valuable for hybrid setups with elevated risks.
This sub-phase prevents unilateral access, mitigating insider threats during authorization.
Example: For a request to generate administrator-level PostgreSQL credentials from an AWS identity, control groups could mandate two approvals, adding human governance to the automated brokering flow.
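A minimal sketch of how that requirement might be expressed in policy form, assuming a hypothetical `database/creds/postgres-admin` role and an identity group named `security-approvers`; the policy, factor, path, and group names are illustrative.

```shell
# Hypothetical policy: reading the admin-level database role is wrapped
# in a control group that requires two approvals from a security group.
vault policy write postgres-admin-broker - <<EOF
path "database/creds/postgres-admin" {
  capabilities = ["read"]

  control_group = {
    factor "security_review" {
      identity {
        group_names = ["security-approvers"]
        approvals   = 2
      }
    }
  }
}
EOF
```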
Phase 3: Dynamic credential generation and external permission mapping (issuance)
Once authorized (including any optional governance from phases 2b and 2c), the token requests credentials from a configured secrets engine role (for example, in the AWS or database secrets engine). The secrets engine generates dynamic, short-lived credentials tailored to the target system, such as AWS or PostgreSQL credentials.
This phase delegates to the secrets engine, which uses its own administrator credentials or established trust relationship (for example, a privileged IAM role for AWS or administrator credentials for PostgreSQL) to interact with the external platform. The secrets engine role's configuration, not Vault's ACL policies, entirely defines the permission mapping in the target system (for example, IAM policy ARNs or policy documents, or SQL grant statements). Leases assign time-to-live (TTL) values to these credentials, enabling automatic expiration and central revocation.
Example: For an AWS-authenticated EC2 instance identity that Vault brokers to PostgreSQL, the database secrets engine role uses its administrator credentials to create a temporary user with SELECT-only grants on specific tables. This directly translates the instance's IAM identity to scoped database access.
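A minimal sketch of the database secrets engine role behind that example, assuming a PostgreSQL connection already configured under the name `postgres`; the role name, grant scope, and TTLs are illustrative.

```shell
# Hypothetical role: each credential request creates a temporary
# PostgreSQL user with SELECT-only grants and a bounded lifetime.
vault write database/roles/readonly \
    db_name=postgres \
    creation_statements="CREATE ROLE \"{{name}}\" WITH LOGIN PASSWORD '{{password}}' VALID UNTIL '{{expiration}}'; GRANT SELECT ON ALL TABLES IN SCHEMA public TO \"{{name}}\";" \
    default_ttl=1h \
    max_ttl=4h
```

Note that the SELECT-only scope lives entirely in the role's creation statements, which is the external permission mapping this phase describes.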
Align with NIST zero trust principles
The identity brokering model described in this pattern is not just a theoretical approach to security. It is a direct implementation of the principles outlined in National Institute of Standards and Technology (NIST) Special Publication 800-207, "Zero Trust Architecture." NIST defines a zero trust architecture (ZTA) as a system that moves defenses from traditional, static network perimeters to a focus on users, assets, and resources. This aligns with Vault's function as an identity broker.
According to NIST, the primary goal of zero trust is to "prevent unauthorized access to data and services coupled with making the access control enforcement as granular as possible." The Vault identity brokering pattern achieves this goal by adhering to several of NIST's core tenets:
- No implicit trust: NIST states that zero trust "assumes there is no implicit trust granted to assets or user accounts based solely on their physical or network location." Vault embodies this principle by requiring every workload, regardless of whether it is on-premises or in the cloud, to authenticate its identity before granting access.
- Per-session access: The guidance mandates that "Access to individual enterprise resources is granted on a per-session basis." Vault's use of leases and dynamic, short-lived credentials is a direct implementation of this, ensuring that access is ephemeral and must be re-evaluated for each session.
- Dynamic policy enforcement: NIST requires that "Access to resources is determined by dynamic policy, including the observable state of client identity, application/service, and the requesting asset." Vault's policy engine, combined with its ability to verify identity-based metadata (like Kubernetes service account names or AWS IAM ARNs), provides this dynamic, context-aware enforcement.
- Continuous authentication and authorization: A core tenet is that "All resource authentication and authorization are dynamic and strictly enforced before access is allowed." Vault's model, where identities are continuously verified to renew or obtain secrets, shifts security from a one-time check at the perimeter to a constant cycle of validation, which NIST describes as the foundation of a ZTA.
By using Vault as an identity broker, your organization is building an architecture that directly aligns with official federal guidance for zero trust, going far beyond mere secrets management.
Validated architecture

Implementation best practices
- Prefer platform-native identity mechanisms: Use authentication methods tied to native platform identities (for example, Kubernetes auth for Kubernetes pods, AWS auth for EC2 instances, Azure auth for Azure resources) whenever possible. These methods eliminate the need for pre-distributed credentials, reducing security risks and operational overhead.
  - AppRole and TLS auth, when used correctly, are alternatives when platform-native identity mechanisms are not feasible. However, each relies on an external trusted orchestrator for correct implementation. Find more details in the Vault Operating Guide for Adoption.
- Enforce least privilege: Use Vault's policy engine and secrets engine roles to ensure that brokered identities have the minimum permissions required in their target domain.
- Audit everything: Use Vault's detailed audit logs to monitor all identity verification (auth mounts) and credential generation (secrets engines) for compliance and security visibility (see the sketch after this list).
  - It is a common misconception that dynamic secret patterns give you less insight into machine access across your environment. On the contrary, brokered identity transactions in Vault are deeply auditable, far more so than static credentials and their inevitable sprawl.
- Implement replication for resiliency and scaling: Use disaster recovery (DR) replication for availability and resiliency, and performance replication for horizontal scaling.
  - Identity brokering makes Vault a critical runtime dependency for applications and workloads; implementing both forms of replication allows you to achieve the required levels of operational robustness.
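For the audit point above, a minimal sketch of enabling a file audit device; the log path is illustrative.

```shell
# Enable a file audit device (path is illustrative). Every brokering
# step (authentication, policy check, and credential issuance) is then
# recorded as a JSON audit entry.
vault audit enable file file_path=/var/log/vault_audit.log
```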
True identity brokering versus common anti-patterns
Identity brokering occurs when Vault translates a workload's native platform identity into temporary access for another system without creating new static secrets. Vault uses its primitives to bind the source identity to the target access, ensuring the transaction is identity-centric rather than secrets-centric. However, not all Vault usage qualifies as brokering. Understanding the distinction helps avoid common pitfalls and ensures you use Vault strategically in this context.
True brokering examples
- Kubernetes to AWS: A Kubernetes pod authenticates to Vault using its service account. Vault verifies its identity and generates temporary AWS credentials for S3 access.
- AWS to PostgreSQL: An EC2 instance authenticates to Vault using its IAM role. Vault verifies its identity and creates a short-lived PostgreSQL user and password (see the sketch after this list).
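From the workload's side, the AWS-to-PostgreSQL flow might look like the following sketch, assuming the Vault CLI is available on the instance; the role names `ec2-app` (AWS auth) and `readonly` (database secrets engine) are illustrative.

```shell
# Authenticate with the instance's IAM identity; no secret is
# pre-distributed to the workload. Role names are illustrative.
vault login -method=aws role=ec2-app

# Request a short-lived PostgreSQL user. Vault returns a username,
# password, and lease with a limited TTL that it can later revoke.
vault read database/creds/readonly
```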
Non-brokering examples
- CI/CD secret injection: A CI/CD pipeline authenticates to Vault, fetches application secrets, and injects them into an application deployment as environment variables. This is not identity brokering because the pipeline uses its own identity, not the application's runtime identity, to obtain the secrets. While this provides automation and arguably qualifies as an evolutionary improvement over the legacy secrets-focused approach, it does not achieve zero trust.
- Using long-lived AppRoles: Distributing long-lived AppRole credentials to an application is not brokering. The AppRole is a new set of credentials, not the workload's native platform identity. This replaces one static secret with another.
Replication for resiliency and scaling
To support robust identity brokering, use Vault Enterprise's two replication modes together: disaster recovery (DR) replication for failover resiliency and performance replication for horizontal scaling and low-latency access. This combination ensures Vault is a reliable runtime dependency, handling frequent authentications and dynamic credential generation without introducing downtime, latency, or orphaned leases in hybrid or multi-cloud environments.
Both modes replicate data asynchronously across clusters over encrypted TLS connections, but they differ in scope and function. DR provides full state mirroring for seamless failover, while performance replication enables active secondaries for distributed load. For a complete solution, pair each performance secondary with a dedicated DR secondary to combine global scaling with localized protection. This topology minimizes disruptions, preserves lease management, and upholds policy consistency and revocation controls.
The sections below detail each mode, followed by guidance on combining them effectively.
Disaster recovery replication
DR replication protects against major failures, like a cluster outage or regional disruption, by creating a fully mirrored, hot-standby secondary that takes over seamlessly. In identity brokering, this keeps cross-platform access continuous. If the primary fails with active leases brokering an AWS identity to a PostgreSQL credential, the secondary inherits the state and manages ongoing leases without data loss.
- What gets replicated: Full synchronization of configurations, policies, secrets engines, auth methods, audit devices, KV secrets, encryption keys, tokens, and leases. Even local mounts replicate, creating a complete duplicate of the brokering setup.
- Implications for identity brokering:
  - Lease and token continuity: Tokens and leases replicate fully, so workloads face no disruption during a failover. For example, if a primary fails after generating a credential lease for a Kubernetes pod, the promoted secondary has knowledge of that lease and can enforce its time-to-live (TTL), renewal, or revocation. This prevents orphaned credentials.
  - Failover in hybrid environments: In a multi-cloud setup, DR allows recovery without requiring workloads to re-authenticate or regenerate credentials. This fits a zero trust model by upholding "always verify" without gaps from downtime.
  - Risk mitigation: Without DR, a primary failure orphans active leases, leaving certain dynamic credentials exposed indefinitely. DR prevents this by replicating the lease state, keeping the broker in control. Since secondaries are passive until promoted, DR focuses strictly on resiliency, not load balancing or scaling.
DR replication makes Vault a resilient broker, ensuring identity translations and credential management endure failures without weakening your security.
Performance replication
Performance replication scales horizontally and reduces latency by allowing active secondaries to handle local requests in distributed environments. This lets workloads in remote regions verify identities and generate credentials nearby, avoiding cross-region delays while centralizing policy writes for consistency.
- What gets replicated: Selective shared state, including configurations, policies, secrets engines, auth methods, KV secrets, and encryption keys. Tokens and leases are local to each cluster and do not replicate.
- Implications for identity brokering:
  - Local tokens and leases: Authentication and lease management happen per cluster, so a workload is tied to one Vault cluster for its session. If that cluster fails, the workload must re-authenticate to another cluster, as its token is not shared. The originating cluster is the only one that can handle renewal or revocation.
  - Scaling distributed brokering: In a multi-cloud scenario, performance replication spreads the load. An AWS workload in Asia can broker access to PostgreSQL through its local performance secondary, speeding up identity verification and credential issuance while the secondary forwards policy writes to the primary for global consistency.
  - Risk mitigation: Performance replication alone does not prevent orphaned leases. If a performance replication cluster fails, its local leases disappear unless the cluster recovers. While some dynamic secrets have target-system-enforced TTLs (like AWS STS tokens), others (like database credentials) require Vault to revoke them. Without the failed cluster, these credentials remain active, creating a security risk.
Combining DR and performance replication
For the strongest identity brokering setup, combine performance replication's scaling with DR replication's resiliency. Pair each performance secondary with its own dedicated DR secondary.
This nested structure uses performance replication to spread the brokering load across regions while DR protects against lease orphaning and data loss within those regions. Treat each performance secondary as a local primary for its workloads. If it fails, you promote its DR counterpart, which inherits all local tokens and leases, ensuring you can manage and revoke credentials.
This approach delivers both scale and resiliency, ensuring continuous verification, ephemerality, and revocation for dynamic credentials.
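The following sketch outlines how this pairing might be enabled with Vault's replication API, assuming Vault Enterprise; the secondary IDs and token placeholders are illustrative.

```shell
# On the global primary: enable performance replication and create an
# activation token for a regional secondary (the id value is illustrative).
vault write -f sys/replication/performance/primary/enable
vault write sys/replication/performance/primary/secondary-token id="apac-perf"

# On the regional performance secondary: activate it with that token, then
# enable DR replication so it acts as the DR primary for its own paired
# cluster, which will mirror its local tokens and leases.
vault write sys/replication/performance/secondary/enable token="<performance-activation-token>"
vault write -f sys/replication/dr/primary/enable
vault write sys/replication/dr/primary/secondary-token id="apac-dr"

# On the paired DR secondary for that region: activate DR replication.
vault write sys/replication/dr/secondary/enable token="<dr-activation-token>"
```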
