Vulnerability and patch management of infrastructure images with HCP
Author: Randy Keener
This technical document outlines a HashiCorp-validated approach to implementing a 30-day repave cycle for vulnerability and patch management (VPM). By automating the creation, testing, and deployment of immutable infrastructure, this strategy minimizes vulnerability risks while ensuring consistency across cloud environments. HCP Terraform and HCP Packer are used to centralize image management and automate updates.
The following sections cover the prerequisites, architecture, processes, and step-by-step instructions to implement this solution and run a "proof of value" (PoV) using the provided code repositories.
Following this pattern will provide:
- A 30-day repave strategy: Replace instances every 30 days with patched images to minimize exposure to vulnerabilities
- VPM environment setup: Use Terraform configurations to deploy a test infrastructure and validate image update workflows
- Centralized infrastructure and patch management using HCP Packer and HCP Terraform
Target audience
This guide references the following roles:
- Security teams responsible for reducing open vulnerability age and demonstrating measurable improvements in vulnerability management
- Platform teams responsible for managing HCP Terraform and HCP Packer
Prerequisites
Implementing an immutable infrastructure model with a 30-day repave cycle requires certain technical and organizational prerequisites.
- Automation tools and infrastructure as code (IaC)
- HCP Terraform is required for declarative management, automated provisioning, and lifecycle management of infrastructure components.
- HCP Packer is necessary for building, versioning, and storing the metadata of immutable images. Integration with CI/CD pipelines ensures automated and regular image updates.
- Version Control System (VCS), e.g., GitHub, configured with GitHub Actions secrets and variables.
- Security and compliance frameworks
- Predefined security and compliance policies are needed, enforced through automated testing and validation. This might include tools such as Chef InSpec, OpenSCAP, or HashiCorp Sentinel for policy enforcement during image deployment.
- Applications involved must be compatible with the immutable infrastructure.
Background and best practices
Traditional patch management methods are often manual and reactive, creating several issues:
- Human error during manual configurations
- Tracking gaps for patch and configuration expiration
- Long lead times to address vulnerabilities
- Increased risk exposure due to the rapid pace of cloud changes
HashiCorp's immutable infrastructure approach replaces manual patching with automated repaving, ensuring that every deployed instance consistently runs the latest, secured images. By regularly repaving all images within 30 days, the risk of vulnerability exploitation is reduced. If a vulnerability has already been exploited and an attacker has set up mechanisms for lateral movement within the organization, repaving will eliminate these intrusions, whereas in-place patching might leave them intact.
Centralize image management with Packer
Centralizing image management with Packer enables scalable deployment through immutable images. Platform teams manage secure and consistent base images, while developers build application images that use a base image as a starting point, then adapt it to their application requirements.
- The platform team should define the base image requirements, including the operating system, security patches, and essential software packages.
- Use Packer to automate the creation of base images. Packer templates should be version-controlled to ensure traceability and reproducibility.
- Integrate automated testing into the Packer build process to validate the images.
- Use tools like HashiCorp Sentinel for policy as code to enforce compliance.
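To make the base-image guidance concrete, a version-controlled template along these lines registers its builds with HCP Packer. This is a minimal sketch: the bucket name, region, instance type, and source AMI filter below are placeholder values, not part of the validated pattern.

```hcl
# Illustrative base-image template; bucket name, region, and AMI filter
# are placeholders, not values from the validated pattern.
packer {
  required_plugins {
    amazon = {
      source  = "github.com/hashicorp/amazon"
      version = ">= 1.2.0"
    }
  }
}

source "amazon-ebs" "base" {
  region        = "us-east-1"
  instance_type = "t3.micro"
  ssh_username  = "ubuntu"
  ami_name      = "base-ubuntu-{{timestamp}}"

  source_ami_filter {
    filters = {
      name                = "ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-*"
      virtualization-type = "hvm"
      root-device-type    = "ebs"
    }
    owners      = ["099720109477"] # Canonical
    most_recent = true
  }
}

build {
  # Register build metadata with the HCP Packer registry
  hcp_packer_registry {
    bucket_name = "base-ubuntu"
    description = "Hardened Ubuntu base image"
  }

  sources = ["source.amazon-ebs.base"]

  # Apply the latest security patches at build time
  provisioner "shell" {
    inline = [
      "sudo apt-get update",
      "sudo apt-get -y upgrade",
    ]
  }
}
```

Keeping this template in version control gives the traceability and reproducibility described above: every image version maps to a commit.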
Automated build tests
Integrating automated tests during the Packer build stage ensures image quality and reliability before promotion between environments. This reduces the risk of faulty deployments and optimizes resource use. When integrated with CI/CD pipelines, it automates testing and ensures only validated images advance through environments.
30-day vulnerability management cycle
The cornerstone of this approach is immutable infrastructure, where instances are regularly repaved (recreated) using the latest, secure images, eliminating vulnerabilities before they can be exploited.
Automation best practices
To fully leverage the VPM workflow, adopt these best practices for automation:
- Integrate CI/CD pipelines:
- Automate Packer builds using GitHub Actions or a compatible CI/CD tool.
- Trigger builds on specific Git events (e.g., tags, releases) to ensure only validated changes are deployed.
- Use webhooks for notifications:
- Configure webhooks to notify workspace owners immediately when drift is detected.
- Automate redeployment based on webhook events to minimize response times.
- Implement Sentinel policies for compliance:
- Write Sentinel policies to block the deployment of non-compliant images or infrastructure.
- Include compliance checks within the CI/CD pipeline for early detection of policy violations.
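As a sketch of the webhook recommendation, the `tfe` provider can configure a generic notification on a workspace. The destination URL, trigger list, and workspace reference here are illustrative assumptions, not values from the pattern.

```hcl
# Illustrative webhook notification via the tfe provider; the URL,
# triggers, and workspace reference are placeholders.
resource "tfe_notification_configuration" "drift_alerts" {
  name             = "vpm-drift-alerts"
  enabled          = true
  destination_type = "generic"
  url              = "https://example.com/hooks/vpm"
  triggers         = ["run:needs_attention", "assessment:drifted"]
  workspace_id     = tfe_workspace.pov.id
}
```

A listener behind the webhook URL can then queue a redeployment run automatically, minimizing response time when drift is detected.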
Security and compliance considerations
To maintain a secure and compliant infrastructure, follow these guidelines:
- Use tools like InSpec, OpenSCAP, or Ubuntu Security Guide (USG) to validate that all images meet required security standards.
- Use HashiCorp Sentinel to enforce compliance throughout the image build and deployment lifecycle.
- Integrate CVE monitoring tools to triage and prioritize vulnerabilities for remediation.
Guidance on image management
A centralized image management process encompasses both base images and application images, ideally within the same repository. If a single repository is not feasible, it is recommended that these images be managed by the same team to ensure consistency and control.
A centralized approach allows application teams to contribute by raising pull requests to the central repository. This method effectively involves application teams in the image update process, thereby mitigating the risk of outdated application images when base images are updated.
A decentralized approach, where the base image is centrally managed, but application teams maintain their own repositories for application images, presents challenges in people management, such as ensuring timely notifications and updates from application teams.
For teams at the beginning of their image management journey, a centralized approach is recommended. By centralizing, you maintain control over the update process, ensuring that both base and application images are consistently updated within the same repository.
Validated architecture
The architecture incorporates an automated pipeline designed to streamline the 30-day cycle of infrastructure and application images. This pipeline includes distinct stages for image building, infrastructure deployment, and compliance validation, each of which is orchestrated to ensure efficiency, consistency, and security across deployments.
The following sections describe the components of the architecture.
- The pipeline begins with the automated creation of infrastructure images using HashiCorp Packer. Configurations are predefined to ensure images are consistent and meet baseline standards.
- Use Packer to publish all images to the HCP Packer artifact registry and schedule revocation.
- Packer templates can be merged to the main branch at any cadence (e.g., bi-weekly), but the CI pipeline runs Packer builds only on git tags and releases.
- Track and update all drifted HCP Terraform workspaces using HCP Terraform Explorer.
- The pipeline automates the provisioning and configuration of infrastructure.
- The pipeline executes compliance checks and security scans that validate that both the infrastructure and the deployed applications adhere to organizational compliance requirements.
- Any non-compliant elements trigger alerts and, depending on configuration, can halt further pipeline stages to prevent potential risks.
This architecture is designed for continuous integration and delivery (CI/CD) environments, supporting a 30-day repave cycle while ensuring that every build and deployment meets security and compliance standards. By automating image building, deployments, and compliance validations, you reduce manual intervention, lower operational risk, and enhance overall infrastructure reliability.
People and process considerations
The Terraform: Operating Guide for Adoption discusses the people and process recommendations for platform operations in detail.
Role | Responsibilities | Process Expectations |
---|---|---|
Security Team | Monitor vulnerabilities and enforce security policies. | Demonstrate measurable reduction in exposure time. |
Security Team | Implement automated scanning and monitoring. | Align patching schedules with platform operations. |
Platform Team | Manage infrastructure with Terraform and schedule patches. | Minimize downtime and validate changes thoroughly. |
Platform Team | Build secure base images with Packer. | Collaborate with security and application teams. |
VPM workflow
This workflow demonstrates the complete VPM lifecycle, from image creation to infrastructure updates, using HCP Packer and HCP Terraform.
The VPM workflow can be broken down into the following steps:
- Import the repositories into version control
- Configure HCP and HCP Terraform
- Create the Packer images and update the HCP Packer channel configuration
- Deploy a test infrastructure using the Packer images
- Update the Packer images and update the HCP Packer channel configuration
- Validate that drift detection flags the drift and update the deployed test infrastructure with the new images
Import the repositories into version control
Import the two repositories required for the VPM workflow into your version control system (VCS):
- Download the latest release ZIP files from both repositories
- Import the contents into your VCS
- Create a GitHub OAuth token with permissions to access your VPM repositories
Note
If using a self-hosted VCS, enhanced HCP Terraform agents may be required (available with HCP Terraform Premium).
Initial HCP and HCP Terraform configuration
Create the necessary project, service principals, and workspaces within HCP and HCP Terraform:

- Create an HCP organization for the VPM environment
- Create a service principal `hcp-admin-sp` with the Contributor role and Organization-level scope
- Generate keys for the service principal (the client ID and client secret)
- Create an HCP Terraform organization
- Generate a Team API token for the Owners team
- Create a variable set for HCP management credentials and activate the option Prioritize the variable values in this variable set
- Define the following variables in the variable set:

Key | Value | Category | Sensitive |
---|---|---|---|
HCP_CLIENT_ID | The client ID for the hcp-admin-sp service principal | env | Yes
HCP_CLIENT_SECRET | The client secret for the hcp-admin-sp service principal | env | Yes
TF_VAR_HCP_CLIENT_ID | The client ID for the hcp-admin-sp service principal | env | Yes
TF_VAR_HCP_CLIENT_SECRET | The client secret for the hcp-admin-sp service principal | env | Yes
TFE_TOKEN | The Owners Team API token | env | Yes

Note
The bootstrapping code expects this variable set to be named `HCP management credentials`. If you decide to use another name, provide the alternate name as a parameter value for the variable `hcp_credentials_vs_name`.

- Configure a VCS provider to give your HCP Terraform organization access to the GitHub repositories you imported earlier
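If you prefer to manage this variable set as code rather than through the UI, a sketch using the `tfe` provider might look like the following. Only one of the five variables is shown, and the organization and input values are placeholders.

```hcl
# Sketch: managing the variable set as code with the tfe provider.
# Organization and values are placeholders; only one variable is shown.
resource "tfe_variable_set" "hcp_credentials" {
  name         = "HCP management credentials"
  organization = var.organization
  priority     = true # "Prioritize the variable values in this variable set"
}

resource "tfe_variable" "hcp_client_id" {
  key             = "HCP_CLIENT_ID"
  value           = var.hcp_client_id
  category        = "env"
  sensitive       = true
  variable_set_id = tfe_variable_set.hcp_credentials.id
}
```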
Bootstrap the environment
The code in the HashiCorp VPM Repository is broken down into four sections:
Directory | Purpose |
---|---|
bootstrap | Code to bootstrap the HCP and HCP Terraform environment |
hcp-configuration | Code to maintain the configuration of your HCP organization |
hcp-terraform-configuration | Code to maintain the configuration of your HCP Terraform organization |
pov-infrastructure | Code to deploy the example infrastructure |
When initially deploying the environment for VPM, you must execute the code in the following order:
- Code in the `bootstrap` directory
- Code in the `hcp-configuration` directory
- Code in the `hcp-terraform-configuration` directory

While the `bootstrap` code is intended to be used only once, the code in the other two directories can be the base of a long-lived HCP and HCP Terraform configuration.

The `pov-infrastructure` code depends on a separate repository, the HashiCorp VPM GitHub Actions Packer Repository, used for defining the Packer templates and example GHA workflows that automate the creation of the Packer images.
The instructions in this document assume that you are using GitHub as your VCS and CI/CD platform. However, the integration is relatively simple, and adapting it to another solution should be straightforward.
Bootstrap HCP Terraform
Create a new workspace in the Default Project:
- When prompted, select VCS-Driven Workflow
- Select your GitHub VCS provider
- Choose the repository where you imported this setup configuration
- Set the Workspace Name to `hcp-bootstrap`
- Set the Terraform Working Directory to `bootstrap`
- Click Create

When prompted, set the following variables (set the `vcs_repo_identifier` to point to the imported repository):

Variable Name | Value | Category | Sensitive | HCL |
---|---|---|---|---|
`github_oauth_token` | \<The OAuth token> | terraform | Yes | No
`hcp_configuration` | { workspace_name = "hcp-configuration" vcs_repo_branch = "main" working_directory = "hcp-configuration" vcs_repo_identifier = "owner/repository" } | terraform | No | Yes
`hcp_terraform_configuration` | { workspace_name = "hcp-terraform-configuration" vcs_repo_branch = "main" working_directory = "hcp-terraform-configuration" vcs_repo_identifier = "owner/repository" } | terraform | No | Yes
`vpm_pov_deploy_repo_identifier` | The git repository identifier for this repo, in the format owner/repository | terraform | No | No

Update the `HCP management credentials` variable set to apply it to this new workspace, then trigger a Terraform run.
The `bootstrap` code will:
- Create a project in HCP Terraform to manage the workspaces used to configure HCP and HCP Terraform.
- Apply the variable set with the HCP management credentials to the project.
- Create a workspace to manage the HCP configuration, and link it to the appropriate git repository.
- Create a workspace to manage the HCP Terraform configuration, and link it to the appropriate git repository.
Configure HCP
- Set workspace variables
  - If necessary, set the workspace variables to override default values.
  - Inspect the default values in the Inputs section.
- Apply the configuration
  - Access the `hcp-configuration` workspace in the `hcp-configuration` project.
  - Initiate a Terraform run.
  - Validate that the chosen run type is Plan and Apply (standard).
The `hcp-configuration` code will:
- Create an HCP project for the VPM PoV.
- Create a project-level service principal to create and maintain Packer images.
- Assign the contributor role to that service principal.
- Create a project-level service principal to query the available images in the HCP Packer registry instance.
- Assign the viewer role to that service principal.
- Create an HCP Packer registry in the HCP project for the VPM PoV.
- Configure the required image buckets and channels.
Configure HCP Terraform
- Configure AWS access
- The creation of the AWS OIDC identity provider for HCP Terraform will require AWS credentials with sufficient permissions.
- You must configure the workspace to provide these permissions:
- If the workspace is using the Remote execution mode, you may use static credentials (see the tutorial Create a credentials variable set)
- If the workspace is using Agent execution mode, you may use static credentials or use an IAM role (see Agent authentication).
- Set workspace variables
  - If necessary, set the workspace variables to override default values.
  - Inspect the default values in the Inputs section.
- Apply the configuration
  - Access the `hcp-terraform-configuration` workspace in the `hcp-configuration` project.
  - Initiate a Terraform run.
  - Validate that the chosen run type is Plan and Apply (standard).
The `hcp-terraform-configuration` code will:
- Create an HCP Terraform project to deploy the VPM workflow infrastructure.
- Create an AWS OIDC identity provider for HCP Terraform.
- Create a variable set with the AWS dynamic credentials parameters and apply it to the project.
- Create a variable set with the HCP Packer bucket information and apply it to the project.
- Configure a Run Task named HCP-Packer-VPM and associate it with the HCP Packer registry created earlier.
- Create workspaces to deploy the VPM infrastructure under the HCP Terraform project created earlier. Configure the HCP-Packer-VPM run task for each of those workspaces (with the run stage set to Post-Plan, and the enforcement level set to Advisory).
- Create an admin team and a developer team, assigning each team the relevant permissions as an example of possible delegation of responsibilities.
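The run task wiring described above could be expressed with the `tfe` provider roughly as follows. The endpoint URL, HMAC key, and workspace reference are placeholders; the real values come from the HCP Packer registry created earlier.

```hcl
# Rough shape of the run task wiring; the endpoint URL, HMAC key, and
# workspace reference are placeholders supplied by your HCP Packer registry.
resource "tfe_organization_run_task" "hcp_packer_vpm" {
  organization = var.organization
  name         = "HCP-Packer-VPM"
  url          = var.hcp_packer_run_task_url
  hmac_key     = var.hcp_packer_run_task_hmac_key
  enabled      = true
}

resource "tfe_workspace_run_task" "vpm" {
  workspace_id      = tfe_workspace.vpm.id
  task_id           = tfe_organization_run_task.hcp_packer_vpm.id
  stage             = "post_plan"
  enforcement_level = "advisory"
}
```

With the advisory enforcement level, a plan that references a revoked image produces a warning rather than a hard failure; switch to mandatory once the workflow is proven.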
Build the initial Packer images
The VPM workflow uses Packer to build and update images. The HashiCorp VPM GitHub Actions Packer Repository contains GitHub Action workflows (.github/workflows/), Packer pipelines (packer/), and Ansible playbooks (ansible/) in support of the VPM workflow.
The code includes CI/CD configuration to automate the creation of the images. This configuration is designed for use with GitHub Actions. If you are using a different CI/CD system, the pipeline configuration will have to be ported to that system.
GitHub Actions workflows
The three workflows defined in `.github/workflows` are responsible for running Packer (with HCP Packer integration). The base image workflows are triggered either by a monthly cron schedule or by a push to the repository. The downstream NGINX and MySQL pipelines are triggered by webhooks that the base workflow executes upon successful completion of a new build. Defaults for `paths` and `ignore-paths` have been set on these workflows to avoid excessive pipeline execution.
First, build the Packer images.
Create the following GitHub Actions variables:

Variable Name | Description |
---|---|
`AWS_REGION` | AWS region for AMI images
`AWS_GITHUB_ROLE_ARN` | AWS IAM role to assume
`HCP_PROJECT_ID` | The HCP project where the Packer registry resides
`HCP_PACKER_BUCKET_BASE_NAME` | The name of the Packer bucket to create
`PACKER_IMAGE_OWNER` | Metadata for the Packer images

You may set these variables manually or by using the gh CLI. To use the gh CLI, first create a file with the variables defined (named `repo_vars.env`, for example):

```
AWS_REGION=<insert target AWS region>
AWS_GITHUB_ROLE_ARN=<insert AWS IAM role to assume>
HCP_PROJECT_ID=<insert HCP project ID for the HCP Packer registry>
HCP_PACKER_BUCKET_BASE_NAME=<insert HCP Packer bucket name>
PACKER_IMAGE_OWNER=<insert image owner>
```
Then, execute the command from the git repository directory on your system:
$ gh variable set -f repo_vars.env
Create the following GitHub Actions secrets:

Secret Name | Description |
---|---|
HCP_CLIENT_ID | HCP service principal client ID
HCP_CLIENT_SECRET | HCP service principal client secret
GH_PAT | GitHub PAT with authorization to trigger workflows

Additional secrets may be needed based on how access to AWS is configured. If using static credentials, the following additional secrets can be defined:

Secret Name | Description |
---|---|
AWS_ACCESS_KEY_ID | ID of the AWS access key associated with an IAM account
AWS_SECRET_ACCESS_KEY | AWS secret access key associated with an IAM account
AWS_SESSION_TOKEN | AWS session token

Refer to the "Configure AWS Credentials" GitHub Action for recommended methods of authenticating with AWS.
Build your base images using the Packer templates from the GitHub repository. After creating the images, assign a version to the `production` channel of the image bucket.

Note
The Terraform code that deploys the test infrastructure "subscribes" to the `production` channel of the image buckets. If an image version is not assigned to the channel, the Terraform configuration will not have an AMI ID to use and the deployment will fail.
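As an illustration of this subscription, a workspace can resolve the AMI through the `hcp` provider's Packer data sources. This is a sketch: the data source names follow recent `hcp` provider releases, and the bucket, channel, and region values are examples.

```hcl
# Sketch of "subscribing" to the production channel; data source names
# follow recent hcp provider releases, and bucket/region are examples.
data "hcp_packer_version" "base" {
  bucket_name  = "base-ubuntu"
  channel_name = "production"
}

data "hcp_packer_artifact" "base" {
  bucket_name         = "base-ubuntu"
  version_fingerprint = data.hcp_packer_version.base.fingerprint
  platform            = "aws"
  region              = "us-east-1"
}

resource "aws_instance" "web" {
  # external_identifier resolves to the AMI ID for the AWS platform
  ami           = data.hcp_packer_artifact.base.external_identifier
  instance_type = "t3.micro"
}
```

When a new version is assigned to the channel, the data source resolves to the new AMI on the next run, which is what the drift detection report flags.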
There are four example flows demonstrated in this repo:
- x86_64 Ubuntu base image sourced from Amazon with CIS Level 1 benchmark applied
- x86_64 Ubuntu base image with NGINX additionally deployed using Ansible
- x86_64 Ubuntu base image with MySQL additionally deployed using Ansible
- x86_64 RHEL 9.3 base image sourced from Amazon with CIS Level 1 benchmark applied
Each resulting image is deployed into the configured AWS account as an AMI available for compute provisioning and registered with HCP Packer.
CIS Level 1 Benchmark
The benchmark remediation may not produce 100% CIS Level 1 Benchmark compliance due to AWS/AMI requirements. Results from each mitigation process are available for inspection at `/var/lib/usg`.
Customizing
Customization of this repository can be achieved by duplicating and modifying the templated pipelines and/or playbooks. The base image can easily be supplemented with an additional shell provisioner or by uncommenting the Ansible provisioner to include security agents or other requirements. There is a corresponding empty playbook and requirements file for the base image that is unused by default and can be enabled as needed.
Deploy test infrastructure with Terraform
Once the Packer images are created and the `v1` version is assigned to the `production` channel of the image buckets, the next step is to deploy the test infrastructure.
- Validate that the HCP Terraform configuration allows the workspaces to configure cloud infrastructure. This can be done in a number of ways (static credentials, dynamic credentials, etc.) and can be configured at the workspace level or the project level (preferred).
- Execute the `terraform apply` operation on each workspace to provision the required resources (e.g., NGINX and MySQL VMs).
- Once deployed, wait for health checks and drift detection reports to complete and confirm there are no flagged issues.
Update Packer images and channels
Once the Packer images are updated and the `v2` version is assigned to the `production` channel of the image buckets, the next step is to redeploy the test infrastructure.
- Update the Packer templates to address the vulnerabilities.
- Use the CI/CD pipeline to rebuild the updated images.
- Assign the new image version to the production channel of the image bucket.
- Revoke the image version with the vulnerability.
Tip
This step will trigger the next health check run to detect infrastructure drift automatically.
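Channel assignment can also be managed as code rather than through the HCP UI. A hedged sketch using the `hcp` provider's channel assignment resource, where the bucket name and fingerprint input are placeholders for the rebuilt v2 image:

```hcl
# Illustrative channel assignment as code; bucket name and fingerprint
# are placeholders for the rebuilt v2 image.
resource "hcp_packer_channel_assignment" "production" {
  bucket_name         = "base-ubuntu"
  channel_name        = "production"
  version_fingerprint = var.new_version_fingerprint
}
```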
Resolve drift and redeploy infrastructure
HCP Terraform detects drift when a new image version is assigned to the production channel.
- Manually initiate a health check run within the relevant workspaces to avoid waiting for the next scheduled health check (which may be too far in the future).
- Once the health check completes, review the Drift Detection report in HCP Terraform to confirm that the change in image version was detected as drift.
- After confirming the detected drift, initiate a `terraform apply` in the relevant workspaces within the `vpm-pov` project. This action will redeploy the infrastructure, updating the virtual machines to use the new image.
- Wait for the `terraform apply` to complete successfully. Verify that all updates were applied without errors and that the virtual machines are now using the new image.
- After the workspaces have been updated, trigger a new health check to confirm that the drift detection report shows no issues, validating that the infrastructure is up to date.
Error handling and debugging
The following sections provide guidance on error handling and debugging for the Packer build and Terraform deployment processes.
Packer build failures
- Use Packer post-processors to run tests on the built images. For example, you can use the `shell-local` post-processor to execute test scripts locally after the image is built.
- Implement testing frameworks like InSpec or Serverspec to validate the configuration and functionality of your images. These tools can be used to write tests that ensure your image meets the required specifications.
- Use the `-on-error=ask` option when running `packer build`. This allows you to choose whether to clean up or debug the build process if an error occurs, which is particularly useful for interactive debugging.
- If you encounter issues, re-run the build with the `-debug` flag. This pauses the build process at each step, allowing you to inspect and troubleshoot the environment.
- In case of errors, you can SSH into the staged VM to troubleshoot the provisioner script locally. This is useful for identifying issues that may not be apparent from the logs.
- Integrate your Packer builds with a CI/CD pipeline. This allows you to automate the testing process and ensure that only images that pass all tests are promoted to the next environment.
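For example, the `shell-local` post-processor mentioned above can be attached to a build block like this. The source name and test script path are hypothetical.

```hcl
# Hypothetical smoke-test hook; ./tests/verify_image.sh is a placeholder
# script, not part of the validated pattern.
build {
  sources = ["source.amazon-ebs.base"]

  post-processor "shell-local" {
    inline = ["./tests/verify_image.sh"]
  }
}
```

Because the post-processor runs only after a successful build, a non-zero exit code from the script fails the pipeline before the image can be promoted.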
Terraform deployment issues
- Verify IAM roles and permissions to ensure Terraform has access to cloud resources.
- Run a `terraform plan` before applying changes to identify configuration errors.
- If deployments fail, review drift detection reports for inconsistencies.
Conclusion
The 30-day repave cycle with HCP Packer and HCP Terraform provides a scalable, proactive solution to vulnerability management. By automating image creation, infrastructure deployment, and drift detection, this approach reduces the window of exposure to vulnerabilities while ensuring continuous compliance.
This VPM strategy not only strengthens your organization's security posture but also reduces operational overhead by shifting from manual patch management to immutable infrastructure.
After you complete this tutorial, consider the following next steps:
- Customize the Packer templates and Terraform modules to align with your organization's infrastructure needs.
- Schedule regular drift detection to identify and resolve inconsistencies early.
- Integrate additional tools and frameworks to further streamline patch management.
- Enforce stricter compliance rules as you transition the workflow to production environments.
Check out the following resources for more information:
- Best Practices and Configuration
- CI/CD Deployment Patterns
- Testing and Debugging
- Artifact Management with HCP Packer
- Git Integration for Packer
- Get Started with HCP Packer
- Build/Test Examples