Troubleshoot Vault Kubernetes Key Management

Beta feature

Beta functionality is stable but possibly incomplete and subject to change. We strongly discourage using beta features in production deployments of Vault.

Process cannot access Vault

If the vault-kube-kms process cannot connect to Vault, check its logs for error messages. If you deployed vault-kube-kms as a static Pod, you can view the logs with the Kubernetes CLI by providing the name of your static Pod and the namespace where you deployed it:

$ kubectl logs -n <deploy_namespace> <pod_name>
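
When the logs are long, it can help to narrow them to lines that look like failures. The following is a minimal sketch and not part of vault-kube-kms: filter_kms_errors is a hypothetical helper, and its keyword list is an assumption about what failure lines usually contain, not a list of actual vault-kube-kms messages.

```shell
# Hypothetical helper: keep only log lines on stdin that look like failures.
# The keyword list is a guess; adjust it to the messages you actually see.
filter_kms_errors() {
  # "|| true" keeps the exit status at 0 when nothing matches.
  grep -iE 'error|denied|refused|timeout|x509' || true
}

# Usage, with the same placeholders as the command above:
#   kubectl logs -n <deploy_namespace> <pod_name> --tail=200 | filter_kms_errors
```

An empty result only means that none of the keywords matched, so read the full log before ruling out a problem.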

Incorrect policy configuration

If you see permission denied errors in the vault-kube-kms logs, verify that the Vault policy attached to the AppRole includes the required permissions:

path "<mount_path>/keys/<key_name>" {
    capabilities = ["read"]
}

path "<mount_path>/encrypt/<key_name>" {
    capabilities = ["update"]
}

path "<mount_path>/decrypt/<key_name>" {
    capabilities = ["update"]
}

path "sys/license/status" {
    capabilities = ["read"]
}

The sys/license/status path is required for Vault Enterprise validation. You must include read permission for the sys/license/status path so that Vault Kubernetes Key Management can verify your Vault Enterprise license. If you omit this permission, the vault-kube-kms process cannot start.
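
One way to check the policy from the AppRole's point of view is to log in with the AppRole and ask Vault which capabilities the resulting token has on each path. The sketch below assumes that vault token capabilities prints a comma-separated list such as "read, update". has_capability is a hypothetical helper, not part of Vault.

```shell
# Hypothetical helper: succeed when a capability list contains a capability.
#   $1 = list as printed by "vault token capabilities" (assumed "read, update")
#   $2 = capability to look for, for example "update"
has_capability() {
  caps=$(printf '%s' "$1" | tr -d ' ')
  case ",$caps," in
    # A "root" capability implies every other capability.
    *",$2,"*|*",root,"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage, after exporting a token obtained from an AppRole login:
#   export VAULT_TOKEN=<token_from_approle_login>
#   has_capability "$(vault token capabilities <mount_path>/encrypt/<key_name>)" update \
#     && echo "encrypt allowed"
```

Repeat the check for each path in the policy, with read for the keys and license paths and update for the encrypt and decrypt paths.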

Incorrect auth mount or AppRole name

If you see authentication errors in the vault-kube-kms process logs, verify the following:

The --approle-role-id flag matches the Role ID for your AppRole in Vault.
The --approle-secret-id-path flag references a valid file system path to a valid Secret ID for the AppRole.
The static Pod running the vault-kube-kms process has access to the file system path referenced by --approle-secret-id-path.
You enabled the AppRole auth method in Vault at the same mount path that you provided to the vault-kube-kms process.
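
The file checks in this list can be scripted. The sketch below defines a hypothetical helper, check_secret_id_file, that confirms a path is readable and non-empty. It assumes you run it in the same environment as the vault-kube-kms process, because a file that exists on the node is not necessarily visible inside the Pod. The <secret_id_path> and <approle_mount> placeholders are illustrative.

```shell
# Hypothetical helper: confirm the Secret ID file is readable and non-empty.
check_secret_id_file() {
  path=$1
  if [ ! -r "$path" ]; then
    echo "not readable: $path"
    return 1
  fi
  if [ ! -s "$path" ]; then
    echo "empty: $path"
    return 1
  fi
  echo "ok: $path"
}

# Usage:
#   check_secret_id_file <secret_id_path>
#
# To test the credentials themselves, log in by hand against the same mount:
#   vault write auth/<approle_mount>/login role_id=<role_id> secret_id="$(cat <secret_id_path>)"
```

If the manual login succeeds but the process still fails to authenticate, compare the mount path and Role ID you used with the values passed to the process.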

Vault is unreachable

If you see connection errors in the vault-kube-kms process logs, verify the following:

The --vault-address flag points to a reachable Vault server.
Network policies or firewalls allow traffic from the control plane node to Vault.
If you use TLS, make sure one of the following is true:
- The --tls-ca-file flag points to a valid CA certificate.
- The system trust store includes a valid CA for the Vault server.
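
A quick reachability probe is to request the Vault sys/health endpoint from the control plane node and look at the HTTP status code. The sketch below maps the default status codes of that endpoint to a short description. describe_health_code is a hypothetical helper, and the curl command in the comment assumes curl is installed on the node.

```shell
# Hypothetical helper: describe a status code returned by /v1/sys/health.
# The mapping follows the endpoint's default status codes.
describe_health_code() {
  case "$1" in
    200) echo "active" ;;
    429) echo "standby" ;;
    472) echo "DR secondary" ;;
    473) echo "performance standby" ;;
    501) echo "not initialized" ;;
    503) echo "sealed" ;;
    000) echo "no HTTP response (network or TLS failure)" ;;
    *)   echo "unexpected status $1" ;;
  esac
}

# Usage from a control plane node (curl reports 000 when no response arrives):
#   code=$(curl -s -o /dev/null -w '%{http_code}' --cacert <ca_file> <vault_address>/v1/sys/health)
#   describe_health_code "$code"
```

Omit --cacert if the system trust store already includes the CA for the Vault server.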

Expired AppRole Secret ID

If Vault Kubernetes Key Management worked previously but authentication starts failing, the AppRole Secret ID may have expired.

Generate a new Secret ID in Vault and replace the contents of the file at --approle-secret-id-path:
```
$ vault write -f auth/approle/role/<role_name>/secret-id
```
Copy the new Secret ID to the file on each control plane node.
Restart the vault-kube-kms process after updating the Secret ID to ensure it reads the updated file.
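
The following is a minimal sketch of replacing the file safely. install_secret_id is a hypothetical helper that reads the new Secret ID from standard input and moves it into place with a rename, so the process never reads a partially written file. It refuses to overwrite the existing file when it receives no input. The new file keeps the owner-only permissions that mktemp assigns, which is an assumption; adjust ownership and mode to whatever your Pod requires.

```shell
# Hypothetical helper: write stdin to the Secret ID file via a temporary file.
install_secret_id() {
  dest=$1
  # Create the temporary file next to the destination so the final mv is a rename.
  tmp=$(mktemp "${dest}.XXXXXX") || return 1
  cat > "$tmp"
  if [ -s "$tmp" ]; then
    mv "$tmp" "$dest"
  else
    rm -f "$tmp"
    echo "no Secret ID received; $dest left unchanged" >&2
    return 1
  fi
}

# Usage (-field=secret_id prints only that field of the response):
#   vault write -f -field=secret_id auth/approle/role/<role_name>/secret-id \
#     | install_secret_id <secret_id_path>
```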

Vault Enterprise validation failing

Vault Kubernetes Key Management requires Vault Enterprise. The following error message indicates that the vault-kube-kms process connected to a Vault Community instance instead:

vault Community Edition detected - vault-kube-kms requires Vault Enterprise
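
To confirm which edition a server runs, you can inspect the version string it reports. Vault Enterprise builds carry a +ent suffix in their version, for example 1.16.1+ent. The sketch below defines a hypothetical helper, is_enterprise_version, and the usage line assumes jq is installed.

```shell
# Hypothetical helper: succeed when a Vault version string has the
# Enterprise "+ent" suffix.
is_enterprise_version() {
  case "$1" in
    *+ent*) return 0 ;;
    *)      return 1 ;;
  esac
}

# Usage against the address that vault-kube-kms connects to:
#   version=$(VAULT_ADDR=<vault_address> vault status -format=json | jq -r '.version')
#   is_enterprise_version "$version" || echo "Community Edition: $version"
```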

API server cannot access Vault Kubernetes Key Management

If the Kubernetes API server cannot communicate with the vault-kube-kms process, Kubernetes reports errors when encrypting or decrypting data. Check the kube-apiserver logs for KMS-related error messages.

The Kubernetes API server communicates with the vault-kube-kms process over a Unix domain socket with 0770 permissions (owner and group). To communicate successfully, the API server process must run as root or with the same Unix group assignment as the vault-kube-kms process.
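
To inspect the socket on a control plane node, you can print its mode and ownership and confirm that the group has read and write access. The sketch below assumes GNU stat, which supports the -c format option. group_can_rw is a hypothetical helper and <socket_path> is a placeholder for your socket location.

```shell
# Hypothetical helper: succeed when an octal mode such as 770 grants the
# group both read and write access.
group_can_rw() {
  # Prefix with 0 so the shell parses the mode as octal, then test the
  # group read (4) and write (2) bits.
  [ $(( (0$1 >> 3) & 6 )) -eq 6 ]
}

# Usage on the node (GNU stat assumed):
#   stat -c '%a %U:%G' <socket_path>
#   group_can_rw "$(stat -c '%a' <socket_path>)" && echo "group has read/write"
```

If the group bits are correct, compare the owning group with the groups of the user that runs the API server.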

Socket address mismatch

The endpoint field in the Kubernetes EncryptionConfiguration must match the --listen-address flag passed to the vault-kube-kms process.

Verify that both values point to the same Unix socket path. For example, if your EncryptionConfiguration contains the following:

providers:
  - kms:
      apiVersion: v2
      name: vault-kms
      endpoint: unix:///var/run/kmsplugin/kms.sock

You must start the vault-kube-kms process with the same address:

--listen-address=unix:///var/run/kmsplugin/kms.sock
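
To compare the two values without reading the file by eye, you can extract the endpoint from the EncryptionConfiguration file. The sketch below uses a simple text match, not a YAML parser: it returns the first endpoint: line and does not strip quotes or trailing comments. endpoint_from_config is a hypothetical helper and <encryption_config_file> is a placeholder.

```shell
# Hypothetical helper: print the first "endpoint:" value found in a file.
endpoint_from_config() {
  sed -n 's/^[[:space:]]*endpoint:[[:space:]]*//p' "$1" | head -n 1
}

# Usage: compare against the value passed to --listen-address.
#   [ "$(endpoint_from_config <encryption_config_file>)" = "unix:///var/run/kmsplugin/kms.sock" ] \
#     && echo "endpoint matches"
```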

If the endpoint and listen address values match and you still see socket errors, verify that both the kube-apiserver Pod and the vault-kube-kms Pod have mounted the directory containing the socket. If either Pod fails to mount the directory, that Pod cannot access the Unix socket.

Missing or invalid configuration flags

Use the Kubernetes CLI to review the vault-kube-kms process logs in the static Pod:

$ kubectl logs -n <deploy_namespace> <pod_name>

Look for validation errors at the start of the logs. The vault-kube-kms process fails to start when configuration flags or environment variables are missing, malformed, or passed incorrectly.

Encryption or decryption errors

An incompatible Transit key configuration can cause encryption and decryption operations to fail after the vault-kube-kms process connects to Vault.

If decryption fails for data encrypted with an older key version, verify the min_decryption_version setting on the Transit key in Vault:

$ vault read transit/keys/<key_name>

If min_decryption_version is higher than the key version used to encrypt the data, Vault rejects decryption requests. Lower min_decryption_version to a value at or below that key version. After Vault can decrypt the data again, you can re-encrypt it with a newer key version.

Raising min_decryption_version can prevent Kubernetes from reading data encrypted with older key versions. Before you raise the minimum, verify that all encrypted data uses a key version at or above the new minimum.

Similarly, if min_encryption_version is set, verify that the vault-kube-kms process uses a key version at or above the minimum.
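
The version comparison can be sketched as a small check. version_is_decryptable is a hypothetical helper that takes the key version used to encrypt the data and the current min_decryption_version. How you determine the key version used for a given piece of data is outside this sketch.

```shell
# Hypothetical helper: succeed when data encrypted with key version $1 is
# still decryptable under min_decryption_version $2.
version_is_decryptable() {
  [ "$1" -ge "$2" ]
}

# Usage:
#   min=$(vault read -field=min_decryption_version <mount_path>/keys/<key_name>)
#   version_is_decryptable <key_version_used> "$min" || echo "below minimum $min"
#
# To lower the minimum on the key:
#   vault write <mount_path>/keys/<key_name>/config min_decryption_version=<version>
```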

Re-encrypting existing data

After rotating the Transit key in Vault, the existing data in etcd remains encrypted with the old key version. Kubernetes does not automatically re-encrypt existing data.

To re-encrypt existing Secrets with the new key version, perform a no-op write to force Kubernetes to re-encrypt them:

$ kubectl get secrets --all-namespaces -o json | kubectl replace -f -

You can repeat the no-op for all other resource types configured in your EncryptionConfiguration.
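
When several resource types are involved, it can be safer to generate the commands first and review them before running anything. The sketch below prints one no-op rewrite command per resource type and executes nothing. reencrypt_cmds is a hypothetical helper, and the resource names in the usage line are examples only; use the types listed in your EncryptionConfiguration.

```shell
# Hypothetical helper: print (not run) a no-op rewrite command for each
# resource type given as an argument.
reencrypt_cmds() {
  for resource in "$@"; do
    echo "kubectl get $resource --all-namespaces -o json | kubectl replace -f -"
  done
}

# Usage: review the output, then run each line by hand.
#   reencrypt_cmds secrets configmaps
```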

For large clusters or complex migrations, refer to the Kubernetes documentation on encrypting confidential data at rest for guidance on storage migration strategies.