Nomad
disconnect Block
| Placement | job -> group -> disconnect |
The disconnect block describes the system's behavior in case of a network
partition. By default, without a disconnect block, if an allocation is on a
node that misses heartbeats, the allocation will be marked lost and will be
replaced.
Replacement happens when a node is lost. When a node is drained, Nomad
migrates the allocations instead, and Nomad ignores the disconnect
block. When a Nomad agent fails to set up the allocation or the tasks of an
allocation fail more than their restart block allows, Nomad
reschedules the allocations and ignores the disconnect.
job "docs" {
group "example" {
disconnect {
lost_after = "6h"
replace = false
reconcile = "keep_original"
}
}
group "example2" {
disconnect {
stop_on_client_after = "12h"
replace = false
reconcile = "keep_original"
}
}
}
Note that you cannot use bothlost_after and stop_on_client_after in the
same disconnect block.
disconnect Parameters
lost_after(string: "")- Specifies a duration during which a Nomad client will attempt to reconnect allocations after it fails to heartbeat in theheartbeat_gracewindow. It defaults to "", which is equivalent to having the disconnect block be nil.You cannot use
lost_afterandstop_on_client_afterin the samedisconnectblock.Refer to the Lost After section for more details.
replace(bool: false)- Specifies if Nomad should replace the disconnected allocation with a new one rescheduled on a different node. Nomad considers the replacement allocation a reschedule and obeys the job'srescheduleblock. If false and the node the allocation is running on disconnects or goes down, Nomad does not replace this allocation and reportsunknownuntil the node reconnects, or until you manually stop the allocation.`nomad alloc stop <alloc ID>`If true, a new alloc will be placed immediately upon the node becoming disconnected.
stop_on_client_after(string: "")- Specifies a duration after which a disconnected Nomad client will stop its allocations. Settingstop_on_client_aftershorter thanlost_afterandreplace = falseat the same time is not permitted and will cause a validation error, because this would lead to a state where no allocations can be scheduled.The Nomad client process must be running for this to occur.
You cannot use
stop_on_client_afterandlost_afterin the samedisconnectblock.Refer to the Stop After section for more details.
reconcile(string: "best_score")- Specifies which allocation to keep once the previously disconnected node regains connectivity. It has four possible values which are described below:keep_original: Always keep the original allocation. Bear in mind when choosing this option, it can have crashed while the client was disconnected.keep_replacement: Always keep the allocation that was replaced to replace the disconnected one.best_score: Keep the allocation running on the node with the best score.longest_running: Keep the allocation that has been up and running continuously for the longest time.
disconnect Examples
The following examples only show the disconnect blocks. Remember that the
disconnect block is only valid in the placements listed previously.
Stop After
This example shows how stop_on_client_after interacts with
other blocks. For the first group, after the default 10 second
heartbeat_grace window expires and 90 more seconds passes, the
server replaces the allocation. The client waits 90 seconds
before sending a stop signal (SIGTERM) to the first-task
task. After 15 more seconds because of the task's kill_timeout, the
client will send SIGKILL. The second group does not have
stop_on_client_after, so the server replaces the
allocation after the 10 second heartbeat_grace expires. It will
not be stopped on the client, regardless of how long the client is out
of touch.
Note that if the server's clocks are not closely synchronized with each other, the server may replace the group before the client has stopped the allocation. Operators should ensure that clock drift between servers is as small as possible.
Note also that a group using this feature will be stopped on the client if the Nomad server cluster fails, since the client will be unable to contact any server in that case. Groups opting in to this feature are therefore exposed to an additional runtime dependency and potential point of failure.
group "first" {
disconnect {
stop_on_client_after = "90s"
}
task "first-task" {
kill_timeout = "15s"
}
}
group "second" {
task "second-task" {
kill_timeout = "5s"
}
}
Lost After
By default, allocations running on a client that fails to heartbeat will be marked "lost". When a client reconnects, its allocations, which may still be healthy, will restart because they have been marked "lost". This can cause issues with stateful tasks or tasks with long restart times.
Instead, an operator may desire that these allocations reconnect without a
restart. When lost_after is specified, the Nomad server will mark
clients that fail to heartbeat as "disconnected" rather than "down", and will
mark allocations on a disconnected client as "unknown" rather than "lost".
These allocations may continue to run on the disconnected client. Replacement
allocations will be scheduled according to the allocations' replace settings
until the disconnected client reconnects. Once a disconnected client reconnects,
Nomad will compare the "unknown" allocations with their replacements will
decide which ones to keep according to the reconcile setting.
If the lost_after duration expires before the client reconnects,
the allocations will be marked "lost". Clients that contain "unknown"
allocations will transition to "disconnected" rather than "down" until the last
lost_after duration has expired.
In the example code below, if both of these task groups were placed on the same
client and that client experienced a network outage, both of the group's
allocations would be marked as "disconnected" at two minutes because of the
client's heartbeat_grace value of "2m". If the network outage continued for
eight hours, and the client continued to fail to heartbeat, the client would
remain in a "disconnected" state, as the first group's lost_after
is twelve hours. Once all groups' lost_after durations are
exceeded, in this case in twelve hours, the client node will be marked as "down"
and the allocation will be marked as "lost". If the client had reconnected
before twelve hours had passed, the allocations would gracefully reconnect
using the strategy defined by reconcile.
Lost After is useful for edge deployments, or scenarios when
operators want zero on-client downtime due to node connectivity issues. This
setting cannot be used with stop_on_client_after.
# server_config.hcl
server {
enabled = true
heartbeat_grace = "2m"
}
# jobspec.nomad
group "first" {
disconnect {
lost_after = "12h"
reconcile = "best_score"
}
task "first-task" {
...
}
}
group "second" {
disconnect {
lost_after = "12h"
reconcile = "keep_original"
}
task "second-task" {
...
}
}