aws-neuron · rohithkrn · Jan 8, 2025
@@ -28,7 +28,7 @@ Step 1: Set up Development Environment
 
 ::
 
-    source ~/aws_neuronx_venv_pytorch_2_5_nxd_inference/bin/activate
+    source /opt/aws_neuronx_venv_pytorch_2_5_nxd_inference/bin/activate
 
 
 3. Install the fork of vLLM (v0.6.x-neuron) that supports NxD Inference following :ref:`nxdi-vllm-user-guide`.
@@ -338,6 +338,10 @@ You should receive outputs shown in the client terminal shortly:
     "usage":{"prompt_tokens":42,"total_tokens":50,"completion_tokens":8},"prompt_logprobs":null}
 
 
+
+If the request fails, try setting the ``VLLM_RPC_TIMEOUT`` environment variable, for example ``export VLLM_RPC_TIMEOUT=180000``
+(the value is in milliseconds). The appropriate timeout depends on the model and deployment configuration used.
+
 To send a request with both text and image prompts:
 
 ::