Merged
Changes from 4 commits
4 changes: 2 additions & 2 deletions docs/webapp/applications/apps_embed_model_deployment.md
@@ -92,10 +92,10 @@ values from the file, which can be modified before launching the app instance
* **Instance name** - Name for the Embedding Model Deployment instance. This will appear in the instance list
* **Service Project** - ClearML Project where your Embedding Model Deployment app instance will be stored
* **Queue** - The [ClearML Queue](../../fundamentals/agents_and_queues.md#what-is-a-queue) to which the Embedding Model
- Deployment app instance task will be enqueued (make sure an agent is assigned to it)
+ Deployment app instance task will be enqueued. Make sure an agent is assigned to that queue.

:::tip Multi-GPU inference
- To run multi-GPU inference, the queue must be associated with a multi-GPU pod template. See [GPU Queues with Shared Memory](../../clearml_agent/clearml_agent_custom_workload.md#example-gpu-queues-with-shared-memory)
+ To run multi-GPU inference, ensure the queue's pod specification (from the base template and/or `templateOverrides`) requests multiple GPUs. See [GPU Queues with Shared Memory](../../clearml_agent/clearml_agent_custom_workload.md#example-gpu-queues-with-shared-memory)
for an example configuration of a queue that allocates multiple GPUs and shared memory.
:::
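
To make the reworded tip concrete, here is a minimal sketch of such a queue definition, assuming the `agentk8sglue.queues.<queue-name>.templateOverrides` layout used by the ClearML Agent Helm chart; the queue name, GPU count, and shared-memory size are illustrative only, and the linked "GPU Queues with Shared Memory" page remains the authoritative example:

```yaml
agentk8sglue:
  queues:
    multi-gpu-queue:            # hypothetical queue name
      templateOverrides:
        # Request multiple GPUs for pods spawned from this queue
        resources:
          limits:
            nvidia.com/gpu: 2
        # Mount a memory-backed volume at /dev/shm for shared memory
        volumes:
          - name: dshm
            emptyDir:
              medium: Memory
              sizeLimit: 16Gi
        volumeMounts:
          - name: dshm
            mountPath: /dev/shm
```

With this in place, enqueuing the app instance task to `multi-gpu-queue` would schedule it on a pod that sees both GPUs and has enlarged shared memory.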

4 changes: 2 additions & 2 deletions docs/webapp/applications/apps_llama_deployment.md
@@ -88,10 +88,10 @@ values from the file, which can be modified before launching the app instance
* **Service Project (Access Control)**: The ClearML project where the app instance is created. Access is determined by
project-level permissions (i.e. users with read access can use the app instance).
* **Queue**: The [ClearML Queue](../../fundamentals/agents_and_queues.md#what-is-a-queue) to which the
- llama.cpp Model Deployment app instance task will be enqueued (make sure an agent is assigned to it)
+ llama.cpp Model Deployment app instance task will be enqueued. Make sure an agent is assigned to that queue.

:::tip Multi-GPU inference
- To run multi-GPU inference, the queue must be associated with a multi-GPU pod template. See [GPU Queues with Shared Memory](../../clearml_agent/clearml_agent_custom_workload.md#example-gpu-queues-with-shared-memory)
+ To run multi-GPU inference, ensure the queue's pod specification (from the base template and/or `templateOverrides`) requests multiple GPUs. See [GPU Queues with Shared Memory](../../clearml_agent/clearml_agent_custom_workload.md#example-gpu-queues-with-shared-memory)
for an example configuration of a queue that allocates multiple GPUs and shared memory.
:::

4 changes: 2 additions & 2 deletions docs/webapp/applications/apps_model_deployment.md
@@ -91,10 +91,10 @@ values from the file, which can be modified before launching the app instance
* **Service Project (Access Control)**: The ClearML project where the app instance is created. Access is determined by
project-level permissions (i.e. users with read access can use the app).
* **Queue**: The [ClearML Queue](../../fundamentals/agents_and_queues.md#what-is-a-queue) to which the vLLM Model Deployment app
- instance task will be enqueued (make sure an agent is assigned to that queue)
+ instance task will be enqueued. Make sure an agent is assigned to that queue.

:::tip Multi-GPU inference
- To run multi-GPU inference, the queue must be associated with a multi-GPU pod template. See [GPU Queues with Shared Memory](../../clearml_agent/clearml_agent_custom_workload.md#example-gpu-queues-with-shared-memory)
+ To run multi-GPU inference, ensure the queue's pod specification (from the base template and/or `templateOverrides`) requests multiple GPUs. See [GPU Queues with Shared Memory](../../clearml_agent/clearml_agent_custom_workload.md#example-gpu-queues-with-shared-memory)
for an example configuration of a queue that allocates multiple GPUs and shared memory.
:::

2 changes: 1 addition & 1 deletion docs/webapp/applications/apps_sglang.md
@@ -93,7 +93,7 @@ values from the file, which can be modified before launching the app instance
instance task will be enqueued. Make sure an agent is assigned to that queue.

:::tip Multi-GPU inference
- To run multi-GPU inference, the queue must be associated with a multi-GPU pod template. See [GPU Queues with Shared Memory](../../clearml_agent/clearml_agent_custom_workload.md#example-gpu-queues-with-shared-memory)
+ To run multi-GPU inference, ensure the queue's pod specification (from the base template and/or `templateOverrides`) requests multiple GPUs. See [GPU Queues with Shared Memory](../../clearml_agent/clearml_agent_custom_workload.md#example-gpu-queues-with-shared-memory)
for an example configuration of a queue that allocates multiple GPUs and shared memory.
:::
