fix: updates all old inference URLs to api.aws.us-east-1.cerebrium.ai
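
Every hunk below makes the same host substitution, from `api.cortex.cerebrium.ai` to `api.aws.us-east-1.cerebrium.ai`, for both `https://` and `wss://` URLs. As a rough illustration only, the change could be sketched as below. The helper name `migrate_url` is hypothetical and not part of this commit, and the commit does not say how the replacement was actually carried out.

```python
# Hosts taken from the diff below; the helper itself is a hypothetical illustration.
OLD_HOST = "api.cortex.cerebrium.ai"
NEW_HOST = "api.aws.us-east-1.cerebrium.ai"


def migrate_url(text: str) -> str:
    """Replace every occurrence of the old inference host with the new one.

    Works on any text (a single URL or a whole file's contents), leaving the
    scheme, path, and query string untouched.
    """
    return text.replace(OLD_HOST, NEW_HOST)


if __name__ == "__main__":
    print(migrate_url("https://api.cortex.cerebrium.ai/v4/p-xxxxx/my-app/run"))
    print(migrate_url("wss://api.cortex.cerebrium.ai/v4/p-xxxxx/my-app/ws"))
```

Because the new host does not contain the old one as a substring, applying the substitution a second time leaves already-migrated URLs unchanged.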

jonoirwinrsa · jonoirwinrsa · commit 75db74aa9600 · 2025-09-05T08:16:30.000-05:00
diff --git a/cerebrium/container-images/custom-web-servers.mdx b/cerebrium/container-images/custom-web-servers.mdx
@@ -53,7 +53,7 @@ The configuration requires three key parameters:
 <Info>
   For ASGI applications like FastAPI, include the appropriate server package
   (like `uvicorn`) in your dependencies. After deployment, your endpoints become
-  available at `https://api.cortex.cerebrium.ai/v4/{project - id}/{app - name}
+  available at `https://api.aws.us-east-1.cerebrium.ai/v4/{project - id}/{app - name}
   /your/endpoint`.
 </Info>
 
diff --git a/cerebrium/container-images/defining-container-images.mdx b/cerebrium/container-images/defining-container-images.mdx
@@ -282,7 +282,7 @@ vllm = "latest"
 - Code is mounted in `/cortex`—adjust paths accordingly.
 - The port in your entrypoint must match the `port` parameter.
 - Install any required server packages (uvicorn, gunicorn, etc.) via pip dependencies.
-- All endpoints will be available at `https://api.cortex.cerebrium.ai/v4/{project-id}/{app-name}/your/endpoint`.
+- All endpoints will be available at `https://api.aws.us-east-1.cerebrium.ai/v4/{project-id}/{app-name}/your/endpoint`.
 
 Deploy as normal with `cerebrium deploy -y`—the system automatically detects and handles custom runtime configuration.
 
diff --git a/cerebrium/endpoints/async.mdx b/cerebrium/endpoints/async.mdx
@@ -10,7 +10,7 @@ responsibility, while you as the developer are responsible for ensuring that dat
 You can enable your function to execute asynchronously by adding the `async` query parameter to your request and setting it to `true`. This would look something like this:
 
 ```bash
-curl -X POST https://api.cortex.cerebrium.ai/v4/<YOUR-PROJECT-ID>/<YOUR-APP>/run?async=true'\
+curl -X POST https://api.aws.us-east-1.cerebrium.ai/v4/<YOUR-PROJECT-ID>/<YOUR-APP>/run?async=true'\
        -H 'Content-Type: application/json'\
        -H 'Authorization: Bearer <YOUR-JWT-TOKEN>\
        --data '{"param": "hello world"}'
@@ -47,7 +47,7 @@ async execution with a specified `webhookEndpoint`, to have Cerebrium automatica
 the function response once it has returned:
 
 ```bash
-curl -X POST <https://api.cortex.cerebrium.ai/v4/><YOUR-PROJECT-ID>/<YOUR-APP>/run?async=true&webhookEndpoint=https%3A%2F%2Fwebhook.site%2F'\
+curl -X POST <https://api.aws.us-east-1.cerebrium.ai/v4/><YOUR-PROJECT-ID>/<YOUR-APP>/run?async=true&webhookEndpoint=https%3A%2F%2Fwebhook.site%2F'\
  -H 'Content-Type: application/json'\
  -H 'Authorization: Bearer <YOUR-JWT-TOKEN>\
  --data '{"param": "hello world"}'
diff --git a/cerebrium/endpoints/inference-api.mdx b/cerebrium/endpoints/inference-api.mdx
@@ -8,7 +8,7 @@ By default, all functions on Cerebrium are accessible via authenticated POST req
 The POST request follows the structure below, where `{function}` is the name of the function you want to invoke. For example, in this case, the function `predict()` from `main.py` is being called.
 
 ```bash
-curl --location --request POST 'https://api.cortex.cerebrium.ai/v4/p-xxxxx/{app-name}/{function}' \
+curl --location --request POST 'https://api.aws.us-east-1.cerebrium.ai/v4/p-xxxxx/{app-name}/{function}' \
 --header 'Authorization: Bearer <JWT_TOKEN>' \
 --header 'Content-Type: application/json' \
 --data '{
diff --git a/cerebrium/endpoints/openai-compatible-endpoints.mdx b/cerebrium/endpoints/openai-compatible-endpoints.mdx
@@ -51,7 +51,7 @@ from openai import OpenAI
 
 client = OpenAI(
     # This is the default and can be omitted
-    base_url="https://api.cortex.cerebrium.ai/v4/p-xxxxx/1-openai-compatible-endpoint/run", ##This is the name of the function you are calling
+    base_url="https://api.aws.us-east-1.cerebrium.ai/v4/p-xxxxx/1-openai-compatible-endpoint/run", ##This is the name of the function you are calling
     api_key="<CEREBRIUM_JWT_TOKEN>",
 )
 
diff --git a/cerebrium/endpoints/streaming.mdx b/cerebrium/endpoints/streaming.mdx
@@ -24,7 +24,7 @@ Once you deploy this code snippet and hit the stream endpoint, you will see the
 You can do this as follows:
 
 ```bash
-curl -X POST https://api.cortex.cerebrium.ai/v4/<YOUR-PROJECT-ID>/2-streaming-endpoint/run \
+curl -X POST https://api.aws.us-east-1.cerebrium.ai/v4/<YOUR-PROJECT-ID>/2-streaming-endpoint/run \
        -H 'Content-Type: application/json'\
        -H 'Accept: text/event-stream\
        -H 'Authorization: Bearer <YOUR-JWT-TOKEN>\
diff --git a/cerebrium/endpoints/webhook.mdx b/cerebrium/endpoints/webhook.mdx
@@ -8,7 +8,7 @@ This allows you to use webhooks in your architecture. To achieve this, we can si
 query parameter to any API call:
 
 ```bash
-curl -X POST https://api.cortex.cerebrium.ai/v4/<YOUR-PROJECT-ID>/<YOUR-APP>/run?webhookEndpoint=https%3A%2F%2Fwebhook.site%2F'\
+curl -X POST https://api.aws.us-east-1.cerebrium.ai/v4/<YOUR-PROJECT-ID>/<YOUR-APP>/run?webhookEndpoint=https%3A%2F%2Fwebhook.site%2F'\
        -H 'Content-Type: application/json'\
        -H 'Authorization: Bearer <YOUR-JWT-TOKEN>\
        --data '{"param": "hello world"}'
diff --git a/cerebrium/endpoints/websockets.mdx b/cerebrium/endpoints/websockets.mdx
@@ -35,7 +35,7 @@ Explanation:
 You can test your WebSocket endpoint using websocat, a command-line utility for connecting to WebSocket servers:
 
 ```bash
-websocat wss://api.cortex.cerebrium.ai/v4/<your-project-id>/<your-app-name>/<your-websocket-function-name>
+websocat wss://api.aws.us-east-1.cerebrium.ai/v4/<your-project-id>/<your-app-name>/<your-websocket-function-name>
 ```
 
 ## Implementing the WebSocket Endpoint
@@ -62,7 +62,7 @@ Client-side Implementation: When connecting from a client app, ensure you handle
 ```javascript
 // Example using JavaScript in a browser
 const socket = new WebSocket(
-  "wss://api.cortex.cerebrium.ai/v4/<your-project-id>/<your-app-name>/<your-websocket-function-name>",
+  "wss://api.aws.us-east-1.cerebrium.ai/v4/<your-project-id>/<your-app-name>/<your-websocket-function-name>",
 );
 
 socket.onopen = function (event) {
diff --git a/cerebrium/getting-started/introduction.mdx b/cerebrium/getting-started/introduction.mdx
@@ -62,7 +62,7 @@ cerebrium deploy
 
 This will turn the function into a callable endpoint that accepts json parameters (prompt) and can scale to 1000s of requests automatically!
 
-Once deployed, an app becomes callable through a POST endpoint `https://api.cortex.cerebrium.ai/v4/{project-id}/{app-name}/{function-name}` and takes a json parameter, prompt
+Once deployed, an app becomes callable through a POST endpoint `https://api.aws.us-east-1.cerebrium.ai/v4/{project-id}/{app-name}/{function-name}` and takes a json parameter, prompt
 
 Great! You made it! Join our Community [Discord](https://discord.gg/ATj6USmeE2) for support and updates.
 
diff --git a/cerebrium/integrations/vercel.mdx b/cerebrium/integrations/vercel.mdx
@@ -43,7 +43,7 @@ Once you have followed the example and deployed the app, you should have an outp
 
 ```javascript
 fetch(
-  "https://api.cortex.cerebrium.ai/v4/p-<YOUR PROJECT ID>/mistral-vllm/predict",
+  "https://api.aws.us-east-1.cerebrium.ai/v4/p-<YOUR PROJECT ID>/mistral-vllm/predict",
   {
     method: "POST",
     headers: {
diff --git a/cerebrium/partner-services/deepgram.mdx b/cerebrium/partner-services/deepgram.mdx
@@ -284,7 +284,7 @@ wget https://dpgr.am/bueller.wav
 9. Access the Deepgram service by calling the endpoint with appropriate parameters such as:
 
 ```curl
-curl -X POST --data-binary @bueller.wav "https://api.cortex.cerebrium.ai/v4/p-xxxxxx/deepgram/v1/listen?model=nova-3&smart_format=true"
+curl -X POST --data-binary @bueller.wav "https://api.aws.us-east-1.cerebrium.ai/v4/p-xxxxxx/deepgram/v1/listen?model=nova-3&smart_format=true"
 ```
 
 Parameters accepted by the Deepgram service can be found in the [speech-to-text API reference](https://developers.deepgram.com/reference/speech-to-text-api/listen-streaming).
diff --git a/cerebrium/partner-services/rime.mdx b/cerebrium/partner-services/rime.mdx
@@ -59,7 +59,7 @@ App Dashboard: https://dashboard.cerebrium.ai/projects/p-xxxxxxxx/apps/p-xxxxxxx
 5. Use the Deployment url from the output to send requests to the <b>HTTP</b> Rime service via curl request:
 
 ```
-curl --location 'https://api.cortex.cerebrium.ai/v4/p-xxxxxxxx/rime' \
+curl --location 'https://api.aws.us-east-1.cerebrium.ai/v4/p-xxxxxxxx/rime' \
 --header 'Authorization: Bearer <RIME_API_KEY>' \
 --header 'Content-Type: application/json' \
 --header 'Accept: audio/pcm' \
diff --git a/migrations/hugging-face.mdx b/migrations/hugging-face.mdx
@@ -186,7 +186,7 @@ Once deployed, you can use your model as follows:
 import requests
 import json
 
-url = "https://api.cortex.cerebrium.ai/v4/[PROJECT_NAME]/llama-8b-vllm/run"
+url = "https://api.aws.us-east-1.cerebrium.ai/v4/[PROJECT_NAME]/llama-8b-vllm/run"
 
 payload = json.dumps({"prompt": "tell me about yourself"})
 
diff --git a/migrations/mystic.mdx b/migrations/mystic.mdx
@@ -273,7 +273,7 @@ cerebrium deploy
 Once your app is deployed, you can make requests to your model using the example cURL request below:
 
 ```bash
-curl --location 'https://api.cortex.cerebrium.ai/v4/p-<YOUR PROJECT ID>/stable-diffusion/predict' \
+curl --location 'https://api.aws.us-east-1.cerebrium.ai/v4/p-<YOUR PROJECT ID>/stable-diffusion/predict' \
 --header 'Content-Type: application/json' \
 --header 'Authorization: Bearer <YOUR TOKEN HERE>' \
 --data '{
diff --git a/toml-reference/toml-reference.mdx b/toml-reference/toml-reference.mdx
@@ -77,7 +77,7 @@ The `[cerebrium.runtime.custom]` section configures custom web servers and runti
 
 <Info>
   The port specified in entrypoint must match the port parameter. All endpoints
-  will be available at `https://api.cortex.cerebrium.ai/v4/{project - id}/
+  will be available at `https://api.aws.us-east-1.cerebrium.ai/v4/{project - id}/
   {app - name}/your/endpoint`
 </Info>
 
diff --git a/v4/examples/aiVoiceAgents.mdx b/v4/examples/aiVoiceAgents.mdx
@@ -493,7 +493,7 @@ We created a public fork of the PipeCat frontend to show you a nice demo of this
 Follow the instructions in the README.md and then populate the following variables in your .env.development.local
 
 ```
-VITE_SERVER_URL=https://api.cortex.cerebrium.ai/v4/p-xxxxx/<APP_NAME> #This is the base url. Do not include the function names
+VITE_SERVER_URL=https://api.aws.us-east-1.cerebrium.ai/v4/p-xxxxx/<APP_NAME> #This is the base url. Do not include the function names
 VITE_SERVER_AUTH= #This is the JWT token you can get from the API Keys section of your Cerebrium Dashboard.
 ```
 
diff --git a/v4/examples/asgi-gradio-interface.mdx b/v4/examples/asgi-gradio-interface.mdx
@@ -210,7 +210,7 @@ class GradioServer:
         interface.launch(
             server_name=self.host,
             server_port=self.port,
-            root_path=f"https://api.cortex.cerebrium.ai/v4/{os.getenv('PROJECT_ID')}/{os.getenv('APP_NAME')}/",
+            root_path=f"https://api.aws.us-east-1.cerebrium.ai/v4/{os.getenv('PROJECT_ID')}/{os.getenv('APP_NAME')}/",
             quiet=True
         )
 
@@ -350,7 +350,7 @@ class GradioServer:
         interface.launch(
             server_name=self.host,
             server_port=self.port,
-            root_path=f"https://api.cortex.cerebrium.ai/v4/{os.getenv('PROJECT_ID')}/{os.getenv('APP_NAME')}/",
+            root_path=f"https://api.aws.us-east-1.cerebrium.ai/v4/{os.getenv('PROJECT_ID')}/{os.getenv('APP_NAME')}/",
             quiet=True
         )
 
@@ -445,7 +445,7 @@ cerebrium deploy -y
 Once deployed, navigate to the following URL in your browser:
 
 ```
-https://api.cortex.cerebrium.ai/v4/p-<YOUR_PROJECT_ID>/2-gradio-interface/
+https://api.aws.us-east-1.cerebrium.ai/v4/p-<YOUR_PROJECT_ID>/2-gradio-interface/
 ```
 
 You should see the Gradio chat interface.
diff --git a/v4/examples/comfyUI.mdx b/v4/examples/comfyUI.mdx
@@ -343,7 +343,7 @@ Now you can now deploy your application by running: `cerebrium deploy`
 Once your ComfyUI application has been deployed successfully, you should be able to make a request to the endpoint using the following JSON payload:
 
 ```curl
-curl --location 'https://api.cortex.cerebrium.ai/v4/p-xxxx/1-comfyui/run' \
+curl --location 'https://api.aws.us-east-1.cerebrium.ai/v4/p-xxxx/1-comfyui/run' \
 --header 'Content-Type: application/json' \
 --header 'Authorization: Bearer <YOUR TOKEN HERE>' \
 --data '{"workflow_values": {
diff --git a/v4/examples/langchain.mdx b/v4/examples/langchain.mdx
@@ -198,7 +198,7 @@ cerebrium deploy
 After deployment, make this request:
 
 ```curl
-curl --location --request POST 'https://api.cortex.cerebrium.ai/v4/p-xxxxxx/1-langchain-QA/predict' \
+curl --location --request POST 'https://api.aws.us-east-1.cerebrium.ai/v4/p-xxxxxx/1-langchain-QA/predict' \
 --header 'Authorization: Bearer <YOUR TOKEN HERE>' \
 --header 'Content-Type: application/json' \
 --data-raw '{
diff --git a/v4/examples/mistral-vllm.mdx b/v4/examples/mistral-vllm.mdx
@@ -158,7 +158,7 @@ cerebrium deploy
 After deployment, make this request:
 
 ```curl
-curl --location --request POST 'https://api.cortex.cerebrium.ai/v4/p-<YOUR PROJECT ID>/1-faster-inference-with-vllm/predict' \
+curl --location --request POST 'https://api.aws.us-east-1.cerebrium.ai/v4/p-<YOUR PROJECT ID>/1-faster-inference-with-vllm/predict' \
 --header 'Authorization: Bearer <YOUR TOKEN HERE>' \
 --header 'Content-Type: application/json' \
 --data-raw '{
diff --git a/v4/examples/openai-compatible-endpoint-vllm.mdx b/v4/examples/openai-compatible-endpoint-vllm.mdx
@@ -117,7 +117,7 @@ cerebrium deploy
 After deployment, you'll see a curl command like this:
 
 ```curl
-curl --location 'https://api.cortex.cerebrium.ai/v4/p-<YOUR PROJECT ID>/5-openai-compatible-endpoint/{function}' \
+curl --location 'https://api.aws.us-east-1.cerebrium.ai/v4/p-<YOUR PROJECT ID>/5-openai-compatible-endpoint/{function}' \
 --header 'Content-Type: application/json' \
 --header 'Authorization: Bearer <YOUR TOKEN HERE>' \
 --data '{"..."}'
@@ -130,7 +130,7 @@ import os
 from openai import OpenAI
 
 client = OpenAI(
-    base_url="https://api.cortex.cerebrium.ai/v4/p-xxxxxxx/5-openai-compatible-endpoint/run",
+    base_url="https://api.aws.us-east-1.cerebrium.ai/v4/p-xxxxxxx/5-openai-compatible-endpoint/run",
     api_key="<CEREBRIUM_JWT_TOKEN>",
 )
 
diff --git a/v4/examples/realtime-voice-agents.mdx b/v4/examples/realtime-voice-agents.mdx
@@ -470,7 +470,7 @@ We created a public fork of the PipeCat frontend to show you a nice demo of this
 Follow the instructions in the README.md and then populate the following variables in your .env.development.local
 
 ```
-VITE_SERVER_URL=https://api.cortex.cerebrium.ai/v4/p-xxxxx/<APP_NAME> #This is the base url of your pipecat-agent. Do not include the function names
+VITE_SERVER_URL=https://api.aws.us-east-1.cerebrium.ai/v4/p-xxxxx/<APP_NAME> #This is the base url of your pipecat-agent. Do not include the function names
 VITE_SERVER_AUTH= #This is the JWT token you can get from the API Keys section of your Cerebrium Dashboard.
 ```
 
diff --git a/v4/examples/sdxl.mdx b/v4/examples/sdxl.mdx
@@ -149,7 +149,7 @@ cerebrium deploy
 After deployment, make this request:
 
 ```curl
-curl --location 'https://api.cortex.cerebrium.ai/v4/p-<YOUR PROJECT ID>/3-sdxl-refiner/predict' \
+curl --location 'https://api.aws.us-east-1.cerebrium.ai/v4/p-<YOUR PROJECT ID>/3-sdxl-refiner/predict' \
 --header 'Content-Type: application/json' \
 --header 'Authorization: Bearer <YOUR TOKEN HERE>' \
 --data '{
diff --git a/v4/examples/streaming-falcon-7B.mdx b/v4/examples/streaming-falcon-7B.mdx
@@ -186,7 +186,7 @@ After deployment, make this request:
 </Note>
 
 ```curl
-curl --location --request POST 'https://api.cortex.cerebrium.ai/v4/p-<YOUR PROJECT ID>/5-streaming-endpoint/stream' \
+curl --location --request POST 'https://api.aws.us-east-1.cerebrium.ai/v4/p-<YOUR PROJECT ID>/5-streaming-endpoint/stream' \
 --header 'Authorization: Bearer <YOUR TOKEN HERE>' \
 --header 'Content-Type: application/json' \
 --data-raw '{
diff --git a/v4/examples/tensorRT.mdx b/v4/examples/tensorRT.mdx
@@ -320,7 +320,7 @@ Now that our code is ready, deploy the app with the command: `cerebrium deploy`.
 Initial deployment takes about 15-20 minutes to install packages, download the model, and convert it to the TensorRT-LLM format. Once completed, it outputs a curl command you can use to test your inference endpoint.
 
 ```
-curl --location 'https://api.cortex.cerebrium.ai/v4/p-xxxxxx/predict' \\
+curl --location 'https://api.aws.us-east-1.cerebrium.ai/v4/p-xxxxxx/predict' \\
 --header 'Content-Type: application/json' \\
 --header 'Authorization: Bearer <YOUR TOKEN HERE>' \\
 --data '{"prompt": "Tell me about yourself?"}'
diff --git a/v4/examples/transcribe-whisper.mdx b/v4/examples/transcribe-whisper.mdx
@@ -148,7 +148,7 @@ cerebrium deploy
 After deployment, make this request:
 
 ```curl
-curl --location 'https://api.cortex.cerebrium.ai/v4/p-<YOUR PROJECT ID>/1-whisper-transcription/predict' \
+curl --location 'https://api.aws.us-east-1.cerebrium.ai/v4/p-<YOUR PROJECT ID>/1-whisper-transcription/predict' \
 --header 'Content-Type: application/json' \
 --header 'Authorization: Bearer <YOUR TOKEN HERE>' \
 --data '{"file_url": "https://your-public-url.com/test.mp3"}''
diff --git a/v4/examples/twilio-voice-agent.mdx b/v4/examples/twilio-voice-agent.mdx
@@ -88,7 +88,7 @@ Create a `templates` folder with `stream.xml` inside. This XML response tells Tw
 <Response>
   <Connect>
     <!--Update with your project ID below-->
-    <Stream url="wss://api.cortex.cerebrium.ai/v4/p-xxxxxxx/4-twilio-agent/ws"></Stream>
+    <Stream url="wss://api.aws.us-east-1.cerebrium.ai/v4/p-xxxxxxx/4-twilio-agent/ws"></Stream>
   </Connect>
   <Pause length="40"/>
 </Response>
diff --git a/v4/examples/wandb-sweep.mdx b/v4/examples/wandb-sweep.mdx
@@ -322,7 +322,7 @@ from dotenv import load_dotenv
 load_dotenv()
 
 CEREBRIUM_API_KEY = os.getenv("CEREBRIUM_API_KEY")
-ENDPOINT_URL = "https://api.cortex.cerebrium.ai/v4/p-xxxxx/wandb-sweep/train_model?async=true"
+ENDPOINT_URL = "https://api.aws.us-east-1.cerebrium.ai/v4/p-xxxxx/wandb-sweep/train_model?async=true"
 
 def train_with_params(params: Dict[str, Any]):
     """
