Run a Granite AI workload on Nomad
In this tutorial, you will use Terraform to create the underlying infrastructure on AWS and then start the Nomad cluster. Once the cluster is running, you will submit the jobspec for an AI workload described in the previous tutorial and then generate text using the Granite LLM. Finally, you will customize the original model to create a new one.
An Instruqt track is available that allows you to complete this tutorial using a hosted web-based session. The only requirement is a compatible web browser, so you do not need to install additional software on your local machine.
This tutorial also describes how to run this scenario on your local machine with your own AWS account using this tutorial's companion code repository on GitHub. Select the version you would like to run.
Create the Nomad cluster
Start the Instruqt track
Click on the Start interactive lab button below to open the Instruqt track. A page overlay appears that loads the track next to these instructions. We call this overlay "the lab." You can resize the lab at any time using the two horizontal lines at the top.
At the top right, click Launch to start the track. The track will take about six minutes to load. Click the Start button in the bottom right corner after the track loads.
View the lab layout
The lab has two tabs visible at the top:
- CLI, a terminal session
- Editor, a visual code editor
During this tutorial, you will run commands in the CLI tab and edit code in the Editor tab.
Cluster architecture
The following diagram illustrates the infrastructure currently deployed in your environment.
The Nomad cluster is running on AWS and consists of three nodes: one server node and two client nodes.
One client node is publicly accessible on port 80, which is the port that Open WebUI is configured to listen on. The other client node is not externally accessible and can only be reached from within the cluster. An S3 bucket stores Open WebUI's application data.
Log in to the Nomad UI
Print the Nomad address and token values.
$ cat cluster-info
Open your web browser to the value of the nomad_UI variable. The cluster has a self-signed certificate, so your browser will likely display a warning. Accept the certificate and then proceed to the Nomad UI.
Copy the token value from the nomad_management_token variable.
At the top right, click the profile button that displays Anonymous Token and then click Sign out. Click the profile button again to open the Sign In page. On the Sign In page, paste the token value in the Secret ID field and then click Sign in with secret.
Once you are logged in, click Clients from the left navigation. Confirm that the AWS client nodes are listed.
Deploy the Ollama job
Export the Nomad environment variables to grant the Nomad CLI permission.
$ export $(cat nomad-env-vars | xargs)
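The nomad-env-vars file is expected to hold KEY=value pairs for the standard Nomad CLI environment variables, so the export command above loads them all at once. A sketch of its likely contents, with placeholder values:

NOMAD_ADDR=https://<server-public-ip>:4646
NOMAD_TOKEN=<management-token-uuid>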
Submit the Ollama job to Nomad.
$ nomad job run ollama.nomad.hcl
==> View this job in the Web UI: https://NOMAD_ADDR:4646/ui/jobs/ollama@default
==> 2025-08-13T15:14:12Z: Monitoring evaluation "c9732cf4"
2025-08-13T15:14:12Z: Evaluation triggered by job "ollama"
2025-08-13T15:14:12Z: Evaluation within deployment: "e73b496c"
2025-08-13T15:14:12Z: Allocation "0aa44a61" created: node "b99e0618", group "ollama"
2025-08-13T15:14:12Z: Evaluation status changed: "pending" -> "complete"
==> 2025-08-13T15:14:12Z: Evaluation "c9732cf4" finished with status "complete"
==> 2025-08-13T15:14:12Z: Monitoring deployment "e73b496c"
✓ Deployment "e73b496c" successful
2025-08-13T15:15:10Z
ID = e73b496c
Job ID = ollama
Job Version = 0
Status = successful
Description = Deployment completed successfully
Deployed
Task Group Desired Placed Healthy Unhealthy Progress Deadline
ollama 1 1 1 0 2025-08-13T15:25:08Z
The job appears in the list on the Nomad UI's Jobs page, with the Deploying badge next to the job's name. Once the job is running, the green Healthy badge appears next to the job.
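For reference, ollama.nomad.hcl follows the standard Nomad job structure. The following is a minimal sketch that assumes the Docker task driver and Ollama's default API port of 11434; the image and resource figures are illustrative, so refer to the actual file in the repository for the real configuration.

job "ollama" {
  datacenters = ["dc1"]

  group "ollama" {
    network {
      port "api" {
        to = 11434  # Ollama's default API port
      }
    }

    task "ollama-task" {
      driver = "docker"  # assumed driver for this sketch

      config {
        image = "ollama/ollama"
        ports = ["api"]
      }

      resources {
        cpu    = 4000  # MHz; illustrative figures only
        memory = 8192  # MB
      }
    }
  }
}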
Deploy the Open WebUI job
Click on the Editor tab. Open the jobspec at the path /nomad/openwebui.nomad.hcl.
In a new browser tab, open Bcrypt Generator to generate a password and its hash value. You will use the hash value to configure the job, and the password to log in to Open WebUI.
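If you prefer to generate the hash in your terminal instead, any bcrypt tool works. For example, assuming the htpasswd utility from apache2-utils is available, the following prints a bcrypt hash with cost 10 and strips the leading colon and trailing newline from its output:

$ htpasswd -bnBC 10 "" 'YOUR_CHOSEN_PASSWORD' | tr -d ':\n'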
Copy the hash value. Then, in the jobspec, replace the placeholder BCRYPTED_PASSWORD text with the hash value. Save the file.
Return to the CLI tab and submit the Open WebUI job to Nomad.
$ nomad job run openwebui.nomad.hcl
==> View this job in the Web UI: https://NOMAD_ADDR:4646/ui/jobs/open-webui@default
==> 2025-08-13T15:18:06Z: Monitoring evaluation "4c3a485f"
2025-08-13T15:18:06Z: Evaluation triggered by job "open-webui"
2025-08-13T15:18:06Z: Evaluation within deployment: "2d47b283"
2025-08-13T15:18:06Z: Allocation "88e42b6e" created: node "3caf3c68", group "open-webui"
2025-08-13T15:18:06Z: Evaluation status changed: "pending" -> "complete"
==> 2025-08-13T15:18:06Z: Evaluation "4c3a485f" finished with status "complete"
==> 2025-08-13T15:18:06Z: Monitoring deployment "2d47b283"
✓ Deployment "2d47b283" successful
2025-08-13T15:20:48Z
ID = 2d47b283
Job ID = open-webui
Job Version = 0
Status = successful
Description = Deployment completed successfully
Deployed
Task Group Desired Placed Healthy Unhealthy Progress Deadline
open-webui 1 1 1 0 2025-08-13T15:30:46Z
This job also appears in the Nomad UI. Wait until the job is running before you continue the tutorial.
Use Nomad Actions to create the admin user
From the Jobs page in the Nomad UI, select open-webui. In the top right corner of the job details page, click Actions and then select create-admin-user. This action registers the admin user with the password you chose and hashed with Bcrypt Generator.
Open the Open WebUI application
In your terminal, retrieve the IP address of your small client.
$ nomad node status -verbose
ID Node Pool DC Name Class Address Version Drain Eligibility Status
5e64f731-4400-3e51-df60-8290a17f4c69 large dc1 aws-large-private-client-0 <none> 13.217.36.111 1.10.1 false eligible ready
1b47fa05-1ca5-1b52-9996-dbda775f7ac3 small dc1 aws-small-public-client-0 <none> 3.88.150.209 1.10.1 false eligible ready
In this example, the client is available at https://3.88.150.209. Open this address in a new tab in your browser. If your browser returns an error, try visiting the http address instead of the https address. The address is also available in the custom metadata attribute externalAddress at the bottom of the client details page in the Nomad UI.
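You can also read that metadata value from the CLI. A quick sketch using the small client's node ID from the output above (your ID will differ):

$ nomad node status -verbose 1b47fa05 | grep externalAddress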
Log in to Open WebUI. Use the password you chose to generate the hash value. If you attempt to log in using the hash value instead of the password, you will encounter a login error.
User email: admin@local.local
Password: YOUR_CHOSEN_PASSWORD
Interact with the Granite model
The granite3.3 model is installed on Ollama, and it is pre-selected as the default.
In the text field, type a prompt such as Why is the ocean blue? and then click the Send Message button.
After submitting the prompt, return to the Nomad UI. Open the Jobs page and then click on the ollama job. Click on the ollama task group in the Recent Allocations list. Observe the CPU and memory usage increase under the Resource Utilization heading while the ollama-task runs.
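The same utilization numbers are available from the CLI. Assuming the allocation ID from the earlier deployment output (yours will differ), the -stats flag prints detailed resource usage for the allocation:

$ nomad alloc status -stats 0aa44a61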
Customize your model
Ollama supports model customization with an Ollama model file. This configuration file allows you to create new models from existing ones and share them with others.
Ollama also has its own CLI for submitting and processing model files.
Use the nomad exec command to connect to the allocation so that you can use the Ollama CLI. This command targets the ollama-task task in the ollama job, as defined in the jobspec.
$ nomad exec -i -t -task ollama-task -job ollama /bin/bash
root@72cee5b9d64f:/#
Create the model file
Your new model will be functionally similar to the granite3.3 model, but its responses will be phrased as though a pirate is speaking.
Create the model file.
$ cat > pirate-granite3.3.modelfile << EOF
FROM granite3.3:2b
# sets the temperature to 1 [higher is more creative, lower is more coherent]
PARAMETER temperature 1
# sets the context window size to 4096, this controls how many tokens the LLM can use as context to generate the next token
PARAMETER num_ctx 4096
# sets a custom system message to specify the behavior of the chat assistant
SYSTEM You are a pirate, acting as an assistant.
EOF
Submit the model file to Ollama
Submit the model file to Ollama to create the new model. This will take some time as Ollama downloads the base model.
$ ollama create pirate-granite-3.3 -f pirate-granite3.3.modelfile
gathering model components
pulling manifest
pulling 77bcee066a76 100% ▕███████████████████████████████████████████████████████████████████▏ 4.9 GB
pulling 3da071a01bbe 100% ▕███████████████████████████████████████████████████████████████████▏ 6.6 KB
pulling 4a99a6dd617d 100% ▕███████████████████████████████████████████████████████████████████▏ 11 KB
pulling 122661774644 100% ▕███████████████████████████████████████████████████████████████████▏ 417 B
verifying sha256 digest
writing manifest
success
using existing layer sha256:77bcee066a76dcdd10d0d123c87e32c8ec2c74e31b6ffd87ebee49c9ac215dca
using existing layer sha256:3da071a01bbe5a1aa1e9766149ff67ed2b232f63d55e6ed50e3777b74536a67f
using existing layer sha256:4a99a6dd617d9f901f29fe91925d5032600fcd78f315a9fa78c1667c950a3a5f
creating new layer sha256:8df85c83aa19f2d31977bdf9c3d0b725199dd3eefdf9443089f5d16d65734b00
creating new layer sha256:584e6cd273eda8359ff0d82570f61d00f0760a81545b4f7d80f8618a304b2f4f
writing manifest
success
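Before returning to the browser, you can confirm that Ollama registered the new model:

$ ollama list

The output should include pirate-granite-3.3 alongside the granite3.3 base model.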
Interact with the new model
In your browser, refresh the Open WebUI web page.
At the top left, click the model dropdown, which is still set to granite3.3. Select the pirate-granite-3.3 model and then start a new chat.
In the text field, enter the prompt Why is the ocean blue? and then click the Send Message button. The response contains the same content but is phrased as though a pirate is speaking.
Train AI models with custom datasets
One way of extending a model's functionality is retraining it with new data. This process is more involved than the customization you performed with Ollama: it requires more time, consumes more billable cloud resources, and is typically accomplished by scripting in another language, such as Python. For example, Hugging Face provides a guide on training a model with a script.
You can run these training scripts as Nomad jobs using one of the exec task drivers, as in the sketch below.
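A minimal sketch of what such a job could look like with the exec driver; the script URL, command path, and resource figures here are illustrative assumptions rather than values from this tutorial:

job "train-model" {
  type = "batch"  # run to completion instead of as a long-lived service

  group "trainer" {
    task "run-training" {
      driver = "exec"  # isolated fork/exec driver

      artifact {
        # Hypothetical location of a Python training script
        source = "https://example.com/scripts/train.py"
      }

      config {
        command = "/usr/bin/python3"
        args    = ["local/train.py"]  # artifacts download into the task's local/ directory
      }

      resources {
        cpu    = 4000  # MHz; size to your training workload
        memory = 8192  # MB
      }
    }
  }
}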
Clean up
Stop the Ollama job from the Nomad UI.
Navigate to the Jobs page, and then click the ollama job. On the right side of the page, click Stop job.
In the top right corner of the lab, click on the Overview button, then click on the Stop button to complete the scenario.
Next Steps
In this tutorial, you created a Nomad cluster on AWS, ran a Granite model with Ollama and Open WebUI, interacted with the Granite model to generate text, and created a new model based on Granite 3.3.
Nomad is a flexible workload orchestrator that can support many kinds of AI workloads. For example, you can also use Nomad to allocate jobs directly to NVIDIA GPUs. For more information, refer to NVIDIA GPU Device plugin.
If you are a Nomad user who wants to learn more about using large language models in workloads, we recommend the following external resources to continue your learning: