Tokenize Data with Transform Secrets Engine
Note
The Transform secrets engine requires a Vault Enterprise Advanced Data Protection (ADP) license.
Challenge
When encrypting sensitive data, preservation of the original data format or length may be required to meet certain industry standards such as HIPAA or PCI. To fulfill this requirement, the transform secrets engine performs format preserving encryption (FPE).
However, there are organizations that care more about the irreversibility of the tokenized data and not so much about preserving the original data format. Therefore, the transform secrets engine's FPE transformation may not meet the governance, risk and compliance (GRC) strategy they are looking for due to the use of reversible cryptography to perform FPE.
Solution
The Transform secrets engine has a data transformation method to tokenize sensitive data stored outside of Vault. Tokenization replaces sensitive data with unique values (tokens) that are unrelated to the original value in any algorithmic sense. Because a token carries no information about the plaintext, it cannot expose the plaintext, which satisfies the PCI-DSS guidance.

Characteristics of the tokenization transformation:
Non-reversible identification: Protect data pursuant to requirements for data irreversibility (PCI-DSS, GDPR, etc.)
Integrated Metadata: Supports metadata for identifying data type and purpose
Extreme scale and performance: Supports managing billions of tokens efficiently across clouds as well as on-premises
Prerequisites
To perform the tasks described in this tutorial, you need:
- Vault Enterprise 1.6.0 or later with the Advanced Data Protection module
- jq installed to process the JSON output for readability
Note
To explore Vault Enterprise features, you can sign up for a free 30-day trial.
Policy requirements
Note
For the purpose of this tutorial, you can use the root token to work with Vault. However, it is recommended that root tokens are only used for just enough initial setup or in emergencies. As a best practice, use tokens with an appropriate set of policies based on your role in the organization.
To perform all tasks demonstrated in this tutorial, your policy must include the following permissions:
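The policy text itself is not reproduced above. As a hedged sketch, assuming the engine is mounted at transform/ as in the rest of this tutorial, a deliberately broad lab policy could be written as follows. The policy name transform-tutorial and the helper name are made-up placeholders, and the capability lists are an assumption rather than the official policy.

```shell
# Hypothetical helper: writes a broad lab policy named "transform-tutorial".
# "vault policy write NAME -" reads the policy body from standard input.
write_tutorial_policy() {
  vault policy write transform-tutorial - <<'EOF'
# Manage roles, transformations, stores, and keys under the transform/ mount
path "transform/*" {
  capabilities = ["create", "read", "update", "delete", "list"]
}

# Enable and manage secrets engines
path "sys/mounts/*" {
  capabilities = ["create", "read", "update", "delete", "list"]
}

# List enabled secrets engines
path "sys/mounts" {
  capabilities = ["read", "list"]
}
EOF
}
```

Call write_tutorial_policy once the vault CLI can reach and authenticate with the server, then attach the policy to the token you use for the remaining steps.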
If you are not familiar with policies, refer to the policies tutorial.
Lab setup
Open a terminal and start a Vault dev server with root as the root token.
The Vault dev server defaults to running at 127.0.0.1:8200. The server is also initialized and unsealed.
Insecure operation
Do not run a Vault dev server in production. This approach is only used here to simplify the unsealing process for this demonstration.
Export the VAULT_ADDR environment variable for the vault CLI to address the Vault server.
Export the VAULT_TOKEN environment variable for the vault CLI to authenticate with the Vault server.
The Vault server is ready.
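The lab setup can be sketched as follows, assuming the vault binary is on your PATH. The server command is shown as a comment because it blocks its terminal until stopped.

```shell
# Terminal 1: start the dev server with "root" as the root token
# (this command blocks until you press Ctrl+C):
#   vault server -dev -dev-root-token-id root

# Terminal 2: point the CLI at the dev server and authenticate with the root token.
export VAULT_ADDR='http://127.0.0.1:8200'
export VAULT_TOKEN='root'
```

Running vault status in the second terminal is a quick way to confirm that the CLI can reach the server.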
Set up the Transform secrets engine
Create a role named mobile-pay, which is attached to the credit-card transformation. The tokenized value has a fixed maximum time-to-live (TTL) of 24 hours.

Enable the transform secrets engine at transform/.
Create a role named mobile-pay with a transformation named credit-card. The role is created, but the credit-card transformation does not exist yet.
Create a transformation named credit-card which sets the generated token's maximum time-to-live (TTL) to 24 hours. The max_ttl is an optional parameter which allows you to control how long the token should stay valid.
NOTE: Set the allowed_roles parameter to a wildcard (*) to allow all roles, or end it with a glob for pattern matching (e.g. mobile-*).
Display details about the credit-card transformation. Notice that the type is set to tokenization.
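The four setup steps above can be sketched with the vault CLI as follows. The function name is a placeholder for illustration; the paths assume the engine is mounted at transform/.

```shell
# Sketch of the setup steps; run after VAULT_ADDR and VAULT_TOKEN are exported.
setup_credit_card_tokenization() {
  # Enable the transform secrets engine at the default path, transform/.
  vault secrets enable transform

  # Create the mobile-pay role and attach the (not yet existing) transformation.
  vault write transform/role/mobile-pay transformations=credit-card

  # Create the tokenization transformation; tokens live at most 24 hours.
  vault write transform/transformations/tokenization/credit-card \
      allowed_roles=mobile-pay \
      max_ttl=24h

  # Read the transformation back; the type field should be "tokenization".
  vault read transform/transformations/tokenization/credit-card
}
```

Call setup_credit_card_tokenization once. Enabling the engine a second time at the same path returns an error, so rerun only the individual commands you need.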
Tokenize secrets
The Vault client applications must have the following in their policy to perform tokenization transformation using the Transform secrets engine enabled at transform/.
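The client policy is not reproduced above. A minimal sketch, assuming clients only need the five request paths used in this section, might look like the following. The policy name transform-client is a made-up placeholder, and the path list is an assumption based on the operations performed below.

```shell
# Hypothetical helper: grants "update" on the client-facing transform paths.
write_client_policy() {
  vault policy write transform-client - <<'EOF'
# Tokenize (encode) data with any role
path "transform/encode/*" {
  capabilities = ["update"]
}

# Recover the original value from a token
path "transform/decode/*" {
  capabilities = ["update"]
}

# Check that a token is valid and unexpired
path "transform/validate/*" {
  capabilities = ["update"]
}

# Read the metadata stored with a token
path "transform/metadata/*" {
  capabilities = ["update"]
}

# Check whether a plaintext value has already been tokenized
path "transform/tokenized/*" {
  capabilities = ["update"]
}
EOF
}
```

These operations are all write requests, which is why the sketch grants update rather than read.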
Encode a value with the mobile-pay role with some metadata. The ttl value is an optional parameter. Remember that the max_ttl was set to 24 hours when you created the credit-card transformation. You can override that value to give the token a shorter TTL. In addition, you can set optional metadata about the data. The output displays the encoded value.
Set the generated token value in a MY_TOKEN environment variable for testing.
Retrieve the metadata of the token. Notice that expiration_time is displayed. Because you requested a ttl shorter than the max_ttl, the token expires after 8 hours.
Validate the token value.
Check whether the credit card number has already been tokenized.
Retrieve the original plaintext credit card value.
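The steps above can be sketched as two small helpers. The card number and metadata values are sample data chosen for illustration, and the function names are placeholders.

```shell
# Encode a sample card number with an 8-hour TTL and some metadata.
# -field=encoded_value prints only the token, so the output can be captured.
encode_card() {
  vault write -field=encoded_value transform/encode/mobile-pay \
      value=1111-2222-3333-4444 \
      transformation=credit-card \
      ttl=8h \
      metadata="Organization=HashiCorp" \
      metadata="Purpose=Travel" \
      metadata="Type=AMEX"
}

# Inspect a token passed as the first argument.
inspect_token() {
  # Metadata stored with the token, including expiration_time.
  vault write transform/metadata/mobile-pay value="$1" transformation=credit-card
  # Whether the token is valid and unexpired.
  vault write transform/validate/mobile-pay value="$1" transformation=credit-card
  # Whether the plaintext card number already has a token.
  vault write transform/tokenized/mobile-pay value=1111-2222-3333-4444 transformation=credit-card
  # The original plaintext value.
  vault write transform/decode/mobile-pay value="$1" transformation=credit-card
}
```

A typical session captures the token with export MY_TOKEN="$(encode_card)" and then runs inspect_token "$MY_TOKEN".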
Convergent tokenization
Requirement
This feature requires Vault 1.11.0 or later.
If you run the encode command multiple times with the same input, you will notice that it returns a different encoded value every time.
In some use cases, you may want to have the same encoded value for a given input so that you can query your database to count the number of entries for a given secret.
Key derivation is supported to allow the same key to be used for multiple purposes by deriving a new key based on a user-supplied context value. In this mode, convergent encryption can optionally be supported, which allows the same input values to produce the same ciphertext.
Create a transformation named credit-card-convergent which enables convergent encryption. When you define a transformation, set convergent=true.
Add the credit-card-convergent transformation to the mobile-pay role.
Encode a value using the credit-card-convergent transformation.
Run the command again. The same encoded value is returned.
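A sketch of the convergent setup follows. The function names are placeholders, and allowed_roles=mobile-pay is one reasonable choice here (a wildcard would also work).

```shell
# Create the convergent transformation and attach it to the existing role.
setup_convergent_transformation() {
  vault write transform/transformations/tokenization/credit-card-convergent \
      allowed_roles=mobile-pay \
      convergent=true

  # List both transformations so that the role keeps credit-card as well.
  vault write transform/role/mobile-pay \
      transformations="credit-card,credit-card-convergent"
}

# Encode the same sample value; repeated calls should return the same token.
encode_convergent() {
  vault write -field=encoded_value transform/encode/mobile-pay \
      value=5555-6666-7777-8888 \
      transformation=credit-card-convergent
}
```

After the setup, comparing two calls with [ "$(encode_convergent)" = "$(encode_convergent)" ] should succeed, whereas the same comparison against the credit-card transformation should not.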
Lookup token
When the transformation is configured with convergent encryption, you can look up the tokenized value (token).
Encode the value using the credit-card-convergent transformation with a time-to-live (TTL) of 8 hours. Notice that the encoded value (token) is longer than the one without a TTL.
Look up the token for a card number, "5555-6666-7777-8888".
Now, look up with an expiration of "any". This returns two tokens. In the absence of the expiration parameter, the command returns only the token with no expiration. When expiration is set to "any", it returns tokens with any expiration.
Look up tokens that expire within a given range using min_expiration and max_expiration, which are RFC3339-formatted dates and times.
Key rotation
Note
Automatic key rotation requires Vault Enterprise 1.12.0 or later.
Rotating keys regularly limits the amount of information produced by a key if that key ever becomes compromised. In this section, you are going to enable automatic key rotation for your tokenization keys.
Read the key information for the credit-card transformation. Notice that the latest_version is 1.
Rotate the key for the credit-card transformation.
Read the key information again. The latest_version is now 2.
Now, instead of manually rotating the key, configure the key to be automatically rotated every 90 days to reduce operational overhead.
Note
The minimum permitted value for the auto_rotate_period is 1 hour.
Verify the configuration.
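The rotation steps above can be sketched as follows. The function names are placeholders, and writing auto_rotate_period to the key's config path is an assumption about where that setting lives; 2160h is used as the hour equivalent of 90 days.

```shell
KEY_PATH=transform/tokenization/keys/credit-card

# Rotate the key manually and show its version before and after.
rotate_credit_card_key() {
  vault read "$KEY_PATH"                   # latest_version before rotation
  vault write -force "$KEY_PATH/rotate"    # -force allows a write with no data
  vault read "$KEY_PATH"                   # latest_version after rotation
}

# Rotate automatically every 90 days (2160 hours), then verify the setting.
enable_auto_rotation() {
  vault write "$KEY_PATH/config" auto_rotate_period=2160h
  vault read "$KEY_PATH"
}
```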
If the key gets compromised, you can rotate the key using transform/tokenization/keys/<transformation_name>/rotate, and then set the min_decryption_version to the latest key version so that the older (possibly compromised) key can no longer be used to decrypt data.
Because the minimum rotation period you can set is 1 hour, you will need to come back later to see that the key is rotated.
Set the rotation period to 1 hour.
Encode another value for testing. The value, 1234-5678-9012-3456, is encoded with version 2 of the credit-card key, and the returned encoded value is stored in the MY_TOKEN_2 environment variable. You can make sure that the environment variable holds the encoded value.
Wait for at least 1 hour to see that the key has been rotated.
You can test that the data encoded by version 1 of the key can still be decoded because min_decryption_version is 1. Also, you should be able to decode MY_TOKEN_2.
If you change the min_decryption_version to 2, you will be able to decode MY_TOKEN_2 but not MY_TOKEN.
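The test above can be sketched with four helpers. The function names are placeholders, and the config path is assumed to accept both auto_rotate_period and min_decryption_version.

```shell
KEY_CONFIG=transform/tokenization/keys/credit-card/config

# Use the shortest permitted rotation period for the test.
set_hourly_rotation() {
  vault write "$KEY_CONFIG" auto_rotate_period=1h
}

# Encode a second sample value and print only the token.
encode_second_value() {
  vault write -field=encoded_value transform/encode/mobile-pay \
      value=1234-5678-9012-3456 \
      transformation=credit-card
}

# Decode the token passed as the first argument.
decode_token() {
  vault write transform/decode/mobile-pay value="$1" transformation=credit-card
}

# Stop accepting tokens encoded with key version 1.
require_key_version_2() {
  vault write "$KEY_CONFIG" min_decryption_version=2
}
```

A session might run set_hourly_rotation, store the result of encode_second_value in MY_TOKEN_2, and decode both tokens after the rotation. Once require_key_version_2 has run, decode_token "$MY_TOKEN" is expected to fail while decode_token "$MY_TOKEN_2" still succeeds.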
Set up external token storage
Unlike the format-preserving encryption (FPE) transformation, tokenization is stateful: Vault must persist a mapping between each token and various cryptographic values (a one-way HMAC of the token, encrypted metadata, etc.), including the encrypted plaintext itself.
At scale, this could put a lot of additional load on Vault's storage backend. To avoid this, you have the option to use external storage to persist data for the tokenization transformation.
Note
Currently, PostgreSQL, MySQL, and MSSQL are supported as external storage for tokenization.
To demonstrate, run a PostgreSQL database in a Docker container. Create a new transformation named "passport" which uses this PostgreSQL database as its storage rather than Vault's storage backend.

Run the PostgreSQL Docker image in a container.
Start a postgres instance which listens on port 5432, with the superuser (root) password set to rootpassword.
You can verify that the postgres container is running.
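A sketch of the container setup follows, assuming Docker is installed and the official postgres image is acceptable. The container name postgres and the --rm flag are choices made for this sketch.

```shell
# Start PostgreSQL in the background with superuser root / rootpassword.
start_postgres() {
  docker run --name postgres \
      --rm \
      -d \
      -p 5432:5432 \
      -e POSTGRES_USER=root \
      -e POSTGRES_PASSWORD=rootpassword \
      postgres

  # Confirm that the container is running.
  docker ps --filter name=postgres
}
```

Because of --rm, Docker deletes the container as soon as it stops, which keeps cleanup simple.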
Create a new role, "global-id".
Create a store which points to the postgres.
Create a schema in postgres to store tokenization artifacts.
Create a new transformation named, "passport" which points to the postgres store.
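The four steps above can be sketched as follows. The store name postgres, the database name root (the default database for the root superuser), and sslmode=disable are assumptions suited to a local lab only.

```shell
configure_postgres_store() {
  # Role that will use the passport transformation.
  vault write transform/role/global-id transformations=passport

  # Store definition; Vault fills in the {{username}} and {{password}} placeholders.
  vault write transform/stores/postgres \
      type=sql \
      driver=postgres \
      supported_transformations=tokenization \
      connection_string='postgresql://{{username}}:{{password}}@localhost/root?sslmode=disable' \
      username=root \
      password=rootpassword

  # Create the tables that hold tokenization artifacts.
  vault write transform/stores/postgres/schema \
      transformation_type=tokenization \
      username=root \
      password=rootpassword

  # Transformation that persists its state in the external store.
  vault write transform/transformations/tokenization/passport \
      allowed_roles=global-id \
      stores=postgres
}
```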
Open another terminal and connect to the postgres container.
Start psql.
Check to verify that there is no entry.
Return to the terminal where you were running the Vault CLI, and encode some test data.
Return to the postgres container, and check the data entry. As you encode more data, the table grows.
Enter \q to quit the psql session.
Enter exit to exit out of the Docker container.
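As a non-interactive alternative to the psql session, the check can be sketched as follows. The table name tokens is an assumption about the schema Vault creates, and the passport number is sample data.

```shell
# Encode a sample passport number through the externally stored transformation.
encode_passport() {
  vault write transform/encode/global-id \
      transformation=passport \
      value=123456789
}

# List the rows Vault has written to the external store.
show_token_rows() {
  docker exec postgres psql -U root -c 'SELECT * FROM tokens;'
}
```

Running show_token_rows before and after encode_passport should show the row count increase by one.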
Bring your own key (BYOK)
When your use case requires an external key, users of Vault version 1.12.0 or greater can use BYOK functionality to import an existing encryption key that was generated outside Vault.
The target key for import can originate from an HSM or other external source, and must be prepared according to its origin before you can import it.
Note
Tokenization transformations with imported keys do not currently support convergent tokenization.
The example shown here will use a 256-bit AES key, referred to as the target key. To successfully import the target key, you must perform the following operations to prepare it.
Generate an ephemeral 256-bit AES key.
Wrap the target key using the ephemeral AES key with AES-KWP.
Wrap the AES key under the Vault wrapping key using RSAES-OAEP with MGF1 and either SHA-1, SHA-224, SHA-256, SHA-384, or SHA-512.
Delete the ephemeral AES key.
Append the wrapped target key to the wrapped AES key.
Base64 encode the result.
A specific code example for preparing and wrapping the key for import is beyond the scope of this tutorial. For more details about wrapping the key for import, including instructions for wrapping a key from an HSM, refer to the key wrapping guide.
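While the wrapping itself is out of scope here, the last two preparation steps (append and Base64 encode) need no cryptographic tooling and can be sketched as follows. The helper name and the file names in the usage note are hypothetical; the sketch assumes the two wrapped keys already exist as binary files.

```shell
# $1: file holding the wrapped (ephemeral) AES key
# $2: file holding the wrapped target key
# Prints the wrapped AES key followed by the wrapped target key, Base64 encoded
# on a single line.
build_import_ciphertext() {
  cat "$1" "$2" | base64 | tr -d '\n'
}
```

For example, export IMPORT_CIPHERTEXT="$(build_import_ciphertext wrapped-aes-key.bin wrapped-target-key.bin)" produces the value used by the import step. The order matters: the wrapped AES key comes first.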
Before you can wrap the key for import, you must read the wrapping key from Vault so that it can be used to prepare your key.
The output is the (4096-bit RSA) wrapping key.
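Reading the wrapping key can be sketched as follows. The field name public_key and the default file name wrapping-key.pem are assumptions of this sketch.

```shell
# Save the PEM-encoded wrapping key to a file (default: wrapping-key.pem)
# so that external tooling can use it for the RSAES-OAEP step.
fetch_wrapping_key() {
  vault read -field=public_key transform/wrapping_key > "${1:-wrapping-key.pem}"
}
```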
Use the wrapping key value in step 3 of the preparation steps listed above. Once you have prepared and base64 encoded the ciphertext, export the value to the environment variable IMPORT_CIPHERTEXT.
Create a new transformation role named legacy-system to use for the transformation that you will import the key into.
Import the key into the application-form transformation. Imported keys do not support rotation by default, so include the allow_rotation parameter and set its value to true so that you can also try rotating the imported key. Add the allowed_roles parameter and specify the legacy-system role.
Try using the newly imported key to encode a value and some metadata.
The imported key is working, and the encoded value returned by the application-form transformation is using the imported key.
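The role creation, import, and test encode can be sketched as follows. The import path is assumed to mirror the import_version endpoint named in this section, and the encoded value and metadata are sample data.

```shell
# Requires IMPORT_CIPHERTEXT to hold the Base64-encoded wrapped key.
import_application_form_key() {
  vault write transform/role/legacy-system transformations=application-form

  # Import the key and allow Vault to rotate it later.
  vault write transform/transformations/tokenization/application-form/import \
      ciphertext="$IMPORT_CIPHERTEXT" \
      allow_rotation=true \
      allowed_roles=legacy-system
}

# Encode a sample value with the imported key.
encode_with_imported_key() {
  vault write transform/encode/legacy-system \
      value=123-45-6789 \
      transformation=application-form \
      metadata="Purpose=Test"
}
```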
Note
To import subsequent versions of the key, you must use the import_version API endpoint.
Take a look at the key information.
The key's latest_version is currently 1.
Rotate the key.
Note
Once an imported key is rotated within Vault, it will no longer support importing key material with the import_version endpoint.
Check the key information once more.
The key's latest_version is now 2, and you can no longer import external versions of the key as it is now internally maintained by Vault.
Clean up
Unset the VAULT_TOKEN environment variable.
Unset the VAULT_ADDR environment variable.
Unset the IMPORT_CIPHERTEXT environment variable.
Stop and remove the Postgres container.
You can stop the Vault dev server by pressing Ctrl+C where the server is running. Or, execute the following command.
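The cleanup can be sketched as follows. The function names are placeholders, and the pkill pattern assumes the dev server was started with a command line containing "vault server -dev".

```shell
# Remove the lab's environment variables and the PostgreSQL container.
cleanup_lab() {
  unset VAULT_TOKEN VAULT_ADDR IMPORT_CIPHERTEXT
  # rm -f stops the container if it is running, then removes it.
  docker rm -f postgres
}

# Stop the dev server from another terminal instead of pressing Ctrl+C.
stop_dev_server() {
  pkill -f 'vault server -dev'
}
```

Call cleanup_lab directly in your shell rather than in a subshell, otherwise the variables stay set in the calling session.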
Summary
The Transform secrets engine's tokenization transformation replaces sensitive data with unique values (tokens) that are unrelated to the original value in any algorithmic sense. This can help organizations meet certain industry standards.
If retaining the original data format is important, refer to the Transform Secrets Engine tutorial to learn about the format-preserving encryption (FPE) transformation.