Nomad 1.10.x release notes
We are pleased to announce the following Nomad updates.
1.10.5 release highlights
Windows OS enhancements
- Configure Nomad agents to write to the Windows Event Log. Refer to the agent configuration's eventlog parameter for details.
- Use CLI commands to install and uninstall Nomad as a system service. Refer to the following for more information:
Keyring configuration validation
On agent startup, Nomad validates the keyring configuration type against supported values.
Scheduling performance improvement
We improved scheduling performance when checking reserved ports usage.
Changelog
Review the changelog for a list of bug and security fixes.
1.10.4 release highlights
Nomad logs and journald output
We have added functionality that enables you to retrieve journald output or the contents of the Nomad log file remotely using the CLI.
New monitor export command
For a given Nomad agent, this command retrieves journald logs or the
entire contents of the Nomad log file. Review the monitor export command
reference for usage, options, and examples.
Historical log capture for debugging
We added the following flags to the nomad operator debug command:
- -log-lookback: Include historical journald logs in the debug capture.
- -log-file-export: Include Nomad agents' log files in the debug capture.
Refer to the nomad operator debug command
reference for details.
Blocked evaluation metrics
The nomad.nomad.blocked_evals.cpu and nomad.nomad.blocked_evals.memory
server metrics now have a node_pool label. This helps you determine the node
pool where a scaling operation is needed.
Review the server metrics in the Metrics reference.
Sentinel policy scope for CSI volumes Enterprise
We added a new submit-csi-volume Sentinel policy scope, which lets you apply
Sentinel policies to CSI volume creation and registration.
You may override soft-mandatory policies assigned to CSI volume creation and
registration, but you may not override hard-mandatory policies. Refer to the
EnforcementLevel
parameter in the Sentinel
Policies API for soft-mandatory and hard-mandatory definitions.
Refer to the following documentation:
- Sentinel policy reference
- Volumes API reference
- CLI command reference
Changelog
Review the changelog for a list of bug fixes.
1.10.3 release highlights
Consul service registrations
We added the kind parameter to the service block in the job specification.
You may manually register a Consul service by specifying a Consul service
kind. Refer to the Consul Register Service HTTP API's Kind
parameter for a list of Consul service
Kind values.
Previously, you configured a Consul service mesh in your job specification's
gateway block. Now you may specify the kind of Consul service in the job
specification service itself. If you configure both a service kind and a
gateway in your job specification, the configured Consul service mesh gateway
takes precedence.
Refer to the job specification service block kind
parameter for details.
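As a rough illustration of the kind parameter described above, the following fragment sketches a service block that sets a Consul service kind. The group, port label, and service name are hypothetical, and the kind value shown is one of the values listed in the Consul Register Service API; check that reference for the values your Consul version accepts.

```hcl
group "gateway" {
  network {
    # Hypothetical port label used by the service below.
    port "gateway" {}
  }

  service {
    name     = "example-gateway" # hypothetical service name
    port     = "gateway"
    provider = "consul"

    # Registers the service with this Consul service kind.
    # Assumption: "terminating-gateway" is used only as an example value.
    kind = "terminating-gateway"
  }
}
```

If the same job also configures a gateway block, the gateway configuration takes precedence over kind, as noted above.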
Docker task driver cgroup namespace support
You may specify the cgroup namespace in your job specification's Docker task driver configuration. This lets you run services that require a cgroup namespace, such as the Datadog Agent.
Refer to the Docker task driver's cgroupns
parameter for details.
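A minimal sketch of a Docker task that sets cgroupns might look like the following. The task name and image are illustrative, and the example assumes the parameter accepts the same host and private values as Docker's own cgroup namespace setting; confirm the accepted values in the Docker task driver reference.

```hcl
task "datadog-agent" {
  driver = "docker"

  config {
    # Illustrative image; substitute the image your workload needs.
    image = "datadog/agent:latest"

    # Assumption: "host" shares the host's cgroup namespace,
    # while "private" gives the container its own.
    cgroupns = "host"
  }
}
```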
NOMAD_UNIX_ADDR task environment variable
Use the NOMAD_UNIX_ADDR value as your NOMAD_ADDR when you want to use the
Nomad CLI with the task API's Unix socket.
This example sets the NOMAD_ADDR to the NOMAD_UNIX_ADDR environment
variable.
task "nomad-cli" {
  driver = "raw_exec"
  config { ... run `nomad` commands ... }
  identity {
    env = true
  }
  env {
    NOMAD_ADDR = "${NOMAD_UNIX_ADDR}"
  }
}
Refer to these resources for details:
- Nomad CLI environment variables
- Runtime environment settings job-related variables
- Runtime variable interpolation job-related variables
Changelog
Review the changelog for a list of bug fixes.
1.10.2 release highlights
Start stopped jobs
The nomad job start CLI command starts a stopped job. This differs from the
nomad job restart command, which restarts or reschedules allocations for a
running job.
Refer to the nomad job start command reference
for details.
Dynamic host volumes garbage collection enhancements
When a node is garbage collected, any dynamic host volumes on the node are orphaned in the state store. You generally don't want to automatically collect these volumes and risk data loss, so we enhanced garbage collection to let you delete orphaned dynamic host volumes.
We added the -force flag to the nomad volume delete command so that you can
delete the volume if the node has been garbage collected. Refer to the nomad
volume delete command reference for details.
For clusters running on ephemeral cloud instances, such as AWS
EC2 in an autoscaling group, deleting host volumes may add excessive friction.
The gc_volumes_on_node_gc client configuration parameter specifies that the
server should delete any dynamic host volumes on the node when garbage
collection deletes the node. Refer to the gc_volumes_on_node_gc parameter
definition for details.
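For ephemeral clients, a client agent configuration along these lines would opt in to that behavior. This is a sketch only; the rest of the client block is omitted, and the value shown is an assumption about how you would enable the setting.

```hcl
client {
  enabled = true

  # Ask the server to delete this node's dynamic host volumes
  # when garbage collection deletes the node.
  gc_volumes_on_node_gc = true
}
```

Enable this only where losing the volume data along with the node is acceptable.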
Configure max number of allocations
The node_max_allocs parameter sets the maximum number of allocations that
Nomad may schedule on a client node. Refer to the node_max_allocs parameter
definition for details.
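As a sketch, assuming node_max_allocs is set in the agent's client block, a limit of 50 allocations per node could be expressed like this. The number is arbitrary and chosen only for illustration.

```hcl
client {
  enabled = true

  # Cap the number of allocations the scheduler may place on this node.
  # The value 50 is an arbitrary example.
  node_max_allocs = 50
}
```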
ACL policy with workload identity enhancements
Apply an ACL policy to a namespace.
When you apply an ACL policy to a namespace, Nomad applies the policy to all the jobs within the namespace. Refer to the workload-associated ACL policies documentation for details.
Find the ACL policies associated with the current workload identity or ACL token.
This enhancement lets you learn about ACL capabilities from within the workload identity tasks. Refer to the following resources for details:
Normalize IPv6 addresses
Apply RFC 5952 section 4 recommendations to IPv6 addresses. Nomad normalizes the addresses when it parses the configuration file so that the normalized form propagates through the whole system.
Option to render a job specification template only once
We added once mode to the template block. This allows templates to render
once without watching dependencies. Refer to the following resources for more
information:
- Consul Template Modes Once Mode for a thorough description.
- Job specification template block's once parameter for Nomad behavioral changes.
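The following fragment sketches a template block that uses once mode. The task name, template contents, and destination path are hypothetical; only the once parameter comes from this release.

```hcl
task "app" {
  template {
    # Hypothetical template contents, rendered a single time at startup.
    data        = <<EOT
generated_for = "{{ env "NOMAD_JOB_NAME" }}"
EOT
    destination = "local/app.conf"

    # Render once and stop watching dependencies for changes.
    once = true
  }
}
```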
Offline utilization reporting Enterprise
With this enhancement, Nomad periodically records usage metrics snapshots in the state store. Cluster administrators in air-gapped or otherwise secured environments may use the API or CLI to generate utilization reporting bundles from those usage metrics snapshots.
Refer to the following resources:
- The Operator Utilization HTTP API
- The nomad operator utilization command reference
- The agent configuration reporting block's snapshot_retention_time parameter
Breaking changes
In templates, we removed support for these non-hermetic Sprig functions:
sprig_date, sprig_dateInZone, sprig_dateModify, sprig_htmlDate,
sprig_htmlDateInZone, sprig_randAlphaNum, sprig_randAlpha, sprig_randAscii,
sprig_randNumeric, sprig_randBytes, sprig_uuidv4, sprig_env, sprig_expandenv,
and sprig_getHostByName.
These functions posed a security risk because they allowed templates to read environment variables or resolve domain names to IP addresses.
Changelog
Review the changelog for a list of security and bug fixes.
Upgrade details
Review the Nomad 1.10.2 upgrade guide.
1.10.1 release highlights
Override parameterized job's parent priority
Use the -priority flag to override the priority inherited from a parameterized
job's parent. Refer to the nomad job dispatch command's -priority
parameter for details.
Breaking changes
Remove Raft peer by address removed
Nomad 1.4.0 removed support for Raft Protocol v2, and this removed the ability
to remove Raft peers by address instead of peer ID. Nomad 1.10.1 removes the
non-functional -peer-address option for the operator raft
peer-remove command, and the
address parameter for the DELETE /v1/operator/raft/peer API.
Agent exit on reloading configuration errors
Errors encountered when reloading agent configuration now cause agents to exit. In prior versions, Nomad only logged configuration errors during reloads. This could lead to agents running but unable to communicate. Any other errors when parsing the new configuration are logged and the reload is aborted, consistent with the current behavior.
Changelog
Review the changelog for a list of security and bug fixes.
Upgrade details
Review the Nomad 1.10.1 upgrade guide.
1.10.0 release highlights
Dynamic host volumes
The dynamic host volumes feature brings a persistent storage option to your workload allocations.
Nomad dynamic host volumes manage storage for stateful workloads without
requiring a restart of the Nomad nodes to apply configuration changes. You
create dynamic host volumes via the CLI or API and then configure the job with
the volume and
volume_mount blocks in the job
specification.
Host volumes mount paths from the Nomad client into allocations. Nomad is aware of host volume availability and makes use of it for job scheduling. However, Nomad does not know about the volume's underlying characteristics, so you can use host volumes for both local persistent storage and for highly available networked storage.
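To make the job-side configuration concrete, here is a minimal sketch of a group that claims a host volume and mounts it into a task. The group, task, and mount path are hypothetical, and the source value assumes a dynamic host volume named internal-plugin already exists on an eligible node.

```hcl
group "db" {
  # Claim a host volume by name for this group.
  volume "data" {
    type      = "host"
    source    = "internal-plugin" # assumed name of an existing dynamic host volume
    read_only = false
  }

  task "db" {
    driver = "docker"

    config {
      image = "postgres:16" # illustrative image
    }

    # Mount the claimed volume at a path inside the task.
    volume_mount {
      volume      = "data"
      destination = "/var/lib/postgresql/data"
    }
  }
}
```

Because the scheduler is aware of host volume availability, it places the allocation only on nodes where a matching volume exists.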
Dynamic host volume governance Enterprise
Providing guardrails to platform consumers is an important aspect of the storage provisioning workflow when leveraging host volumes across a shared Nomad cluster. Nomad Enterprise supports these new capabilities to provide governance when provisioning host volumes:
Sentinel dynamic host volume objects
During volume creation, Nomad can evaluate all of the details within the dynamic host volume specification against Sentinel policies that define and enforce specific patterns.
For example, a policy that enforces the storage tier based on the specified environment or namespace would let you reserve more expensive NVMe storage for specific workloads. Applying policy to the volume specification gives you a way to enforce specific patterns while giving platform consumers more flexibility for self-service volume provisioning. Refer to the Sentinel policy reference for more information.
Resource quota support
Nomad’s resource quota system now includes coverage for host volume capacity limits that you can apply to provisioned storage within a specific namespace. This helps control storage consumption within a namespace based on the maximum capacities defined during creation or when making updates to the maximum capacities over the lifecycle of the volume. Refer to the Resource quota specification for more information.
Namespace and node pool validation
Dynamic host volumes live within the context of a specific namespace when created. When Nomad provisions volumes in a namespace targeting a specific node pool, Nomad evaluates the namespace node pool configuration to ensure that volume creation aligns with job placement rules for node pools. Refer to the Namespace specification for details on node_pool_config parameters.
Resources
Refer to the following resources to learn more about dynamic host volumes:
- Host volumes section in the Considerations for stateful workloads guide for an overview and comparison of storage options
- Host volumes plugin specification for examples of how to write your own plugin to dynamically configure persistent storage on your Nomad client nodes
- Use Nomad dynamic host volumes to persist data for stateful workloads tutorial to learn how to create and use a dynamic host volume for persistent storage
OpenID Connect (OIDC) enhancements
Nomad 1.10 extends Nomad's OIDC SSO login feature with Private Key JWT and Proof Key for Code Exchange (PKCE).
Private Key JWT
Private Key JWT, also called client assertions, is a more secure alternative to client secrets. Instead of sending a simple secret, Nomad builds a JWT and signs it with a private key, and the OIDC provider verifies the signature. In this approach, Nomad asserts that it is a valid OIDC client without sending any secret information over the network.
Proof Key for Code Exchange (PKCE)
PKCE adds an extra layer of security to any OIDC auth method for both client secrets and client assertions.
Set the ACL auth method OIDCEnablePKCE
parameter to true to turn
on this extra security.
Note that not all OIDC providers support PKCE. In addition to enabling PKCE in Nomad, you may need to enable it in your OIDC provider's configuration.
Resources
- OIDC auth method guide for details on using OIDC with Nomad
- OIDC troubleshooting guide to review common issues and tips for setting up OIDC
- Authenticate users with SSO and Keycloak tutorial to configure Nomad and the Keycloak identity provider to automatically grant permissions in Nomad ACL.
Container Storage Interface (CSI) enhancements
We added the following:
- CSI volume and plugin events to the event stream
- Volume capabilities to the nomad volume status command output
- The ability to use a volume ID prefix search and wildcard namespace with the nomad volume delete command. Refer to the GitHub pull request for details.

Example usage:

    $ nomad volume create ./internal-plugin.volume.hcl
    ==> Created host volume internal-plugin with ID aeea91a0-06df-c16e-5403-ff82a2f28fd4
      ✓ Host volume "aeea91a0" ready

        2025-01-31T15:55:14-05:00

    ID        = aeea91a0-06df-c16e-5403-ff82a2f28fd4
    Name      = internal-plugin
    Namespace = default
    Plugin ID = mkdir
    Node ID   = b4611abd-d4a8-c83a-b05e-7d9f5b44a179
    Node Pool = default
    Capacity  = 0 B
    State     = ready
    Host Path = /run/nomad/dev/data/host_volumes/aeea91a0-06df-c16e-5403-ff82a2f28fd4

    $ nomad volume delete -type host aeea91a0
    Successfully deleted volume "aeea91a0-06df-c16e-5403-ff82a2f28fd4"!
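The contents of internal-plugin.volume.hcl are not shown in these notes. A volume specification roughly like the following would be consistent with the output above; the name, plugin ID, and node pool are taken from that output, while the capability block is an assumption added for illustration.

```hcl
# Sketch of a dynamic host volume specification.
name      = "internal-plugin"
type      = "host"
plugin_id = "mkdir"
node_pool = "default"

# Assumption: a single-writer filesystem volume.
capability {
  access_mode     = "single-node-writer"
  attachment_mode = "file-system"
}
```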
UI URL hints added to CLI commands
We added UI URL hints to the end of common CLI commands and a -ui flag to
automatically open the generated link in your browser.
Showing UI URL hints is enabled by default. You have two options for turning off this feature:
- Server: Modify the show_cli_hints parameter in your agent's ui block configuration.
- CLI: Set the NOMAD_CLI_SHOW_HINTS environment variable to 0 or false.

    $ nomad status
    No running jobs

    ==> View and manage Nomad jobs in the Web UI: https://localhost:4646/ui/jobs

    $ export NOMAD_CLI_SHOW_HINTS=0
    $ nomad status
    No running jobs
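For the server-side option, a sketch of the agent configuration could look like this. It assumes show_cli_hints is a boolean in the ui block, as the description above suggests.

```hcl
ui {
  enabled = true

  # Turn off UI URL hints in CLI command output.
  show_cli_hints = false
}
```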
Breaking changes
Go SDK API change for quota limits
In Nomad 1.10.0, the Go API for quotas has a breaking change. The
QuotaSpec.RegionLimit field is now of type QuotaResources instead of
Resources. The QuotaSpec.VariablesLimit field is deprecated in favor of
QuotaSpec.RegionLimit.Storage.Variables and will be removed in Nomad 1.12.0.
Loading binaries from plugin_dir without configuration
Plugins stored within the plugin_dir
will now only be loaded when they have a corresponding
plugin block in the agent configuration
file. Nomad now skips any plugin found without a corresponding configuration block.
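As a sketch of what a corresponding configuration block looks like, the following agent configuration declares a plugin block for a hypothetical binary named example-driver stored in plugin_dir. The directory path and plugin name are placeholders.

```hcl
# Directory that holds external plugin binaries (placeholder path).
plugin_dir = "/opt/nomad/plugins"

# Without a matching plugin block, Nomad 1.10.0 skips the binary.
plugin "example-driver" {
  config {
    # Plugin-specific settings go here; this hypothetical plugin needs none.
  }
}
```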
Vault and Consul integration changes
Nomad 1.10.0 removes the previously deprecated token-based authentication workflow for Vault and Consul. Nomad clients must now use a task's workload identity to authenticate to Vault and Consul and obtain a token specific to the task.
This table lists removed Vault fields and the new workflow.
| Field | Configuration | New Workflow |
|---|---|---|
| vault.allow_unauthenticated | Agent | Tasks should use a workload identity. Do not use a Vault token. |
| vault.task_token_ttl | Agent | With workload identity, tasks receive their TTL configuration from the Vault role. |
| vault.token | Agent | Nomad agents use the workload identity when making requests to authenticated endpoints. |
| vault.policies | Job specification | Configure and use a Vault role. |
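To illustrate the last row of the table, a job that previously listed vault.policies might instead reference a Vault role, roughly as follows. The task name and role name are hypothetical, and the role must already exist in Vault and be bound to Nomad workload identities.

```hcl
task "app" {
  vault {
    # Hypothetical role name; replaces the removed policies list.
    role = "nomad-workloads"
  }
}
```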
Before upgrading to Nomad 1.10, perform the following tasks:
- Configure Vault and Consul to work with workload identity.
- Migrate all workloads to use workload identity.
Refer to the following guides for more information:
Consul template implicit workload identity removal
Nomad no longer creates an implicit Consul identity for workloads that don't
register services with Consul. Tasks that require Consul tokens for template
rendering must include a consul block
or specify an identity.
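A minimal sketch of the first option, assuming a task that renders a template from the Consul KV store but registers no services, might look like this. The task name, key path, and destination are hypothetical.

```hcl
task "render-config" {
  # Explicitly request a Consul identity for this task.
  # Assumption: an empty block targets the default Consul cluster.
  consul {}

  template {
    # Hypothetical KV path read during rendering.
    data        = <<EOT
{{ key "config/app/settings" }}
EOT
    destination = "local/settings.conf"
  }
}
```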
Nomad 1.8 deprecated disconnect fields removed
In Nomad 1.8, we introduced the disconnect block to replace the
max_client_disconnect, stop_after_client_disconnect, and
prevent_reschedule_on_lost fields. In Nomad 1.10, we removed these fields, and
Nomad will ignore them if specified. Jobs should migrate to using the
disconnect block prior to upgrading.
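As a rough migration sketch, a group that used max_client_disconnect and prevent_reschedule_on_lost could express similar intent with the disconnect block. The group name and duration are illustrative; confirm the exact field mapping in the disconnect block reference before migrating.

```hcl
group "web" {
  disconnect {
    # Assumed replacement for max_client_disconnect.
    lost_after = "6h"

    # Assumed replacement for prevent_reschedule_on_lost = true.
    replace = false
  }
}
```

The stop_after_client_disconnect field has its own counterpart, stop_on_client_after, which is left out here because the sketch already sets lost_after.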
Remote task driver support removed
In Nomad 1.10.0, we removed all support for remote task driver
capabilities. Nomad no longer detaches drivers with the RemoteTasks capability
when an allocation is lost. Also, Nomad does not detach remote tasks
when a node is drained. Workloads running as remote tasks should be migrated
prior to upgrading.
Sentinel apply command requires scope Enterprise
To prevent accidentally adding policies for volumes to the job scope, the nomad sentinel apply command now
requires the -scope option. Refer to the GitHub pull
request for details.
Affinity and spread updates are non-destructive
We fixed a scheduler bug so that updates to affinity and
spread blocks are no longer destructive. After a job update that changes only
these blocks, existing allocations remain running with their job version
incremented. If you were relying on the previous behavior to redistribute
workloads, you can force a destructive update by changing fields that require
one, such as the meta block.
Deprecations
Quota specification variables_limit deprecated Enterprise
The quota specification's variables_limit field is deprecated. We replaced it
with a new storage block
with a variables field, under the region_limit block. Existing quotas will
be automatically migrated during server upgrade. We will remove the
variables_limit field from the quota specification in Nomad 1.12.0.
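For new quota specifications, the replacement layout could be sketched as follows. The quota name, region, and limit value are placeholders; only the storage block and its variables field under region_limit come from this release.

```hcl
name        = "example-quota" # hypothetical quota name
description = "Sketch of the storage block layout"

limit {
  region = "global"

  region_limit {
    storage {
      # Replaces the deprecated variables_limit field.
      # The value is an arbitrary example.
      variables = 1000
    }
  }
}
```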
Upgrade details
For more detailed information, refer to the upgrade details page and the GitHub releases changelogs.
Known issues
None.