---

copyright:
  years: 2025, 2025
lastupdated: "2025-04-17"

keywords: instructlab, ai

subcollection: instructlab

---

Deploying models for {{site.data.keyword.short_name}}

{: #deploy}

You can deploy your trained model to IBM watsonx or to RHEL-AI on {{site.data.keyword.cloud_notm}}.

Deploying the model to watsonx on {{site.data.keyword.cloud_notm}}

{: #deploy-watson}

1. Sign up for IBM watsonx as a Service.
2. Log in and create an API key.
3. If you do not have a project yet, create one.
4. Add a connection to the {{site.data.keyword.cos_short}} data source in {{site.data.keyword.cloud_notm}}.
5. Import the model.
6. Deploy the model.
7. Test the deployment.
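
Testing a deployment over REST generally needs an IAM bearer token rather than the API key itself. The following is a minimal sketch, assuming the public {{site.data.keyword.cloud_notm}} IAM token endpoint and a response that carries the token in an `access_token` string field. The function names `request_iam_token` and `extract_access_token` are hypothetical helpers, not part of any IBM tooling.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration only.

# Exchange an API key for an IAM access token.
# Prints the raw JSON response from the IAM token endpoint.
request_iam_token() {
    local api_key="$1"
    curl -sS -X POST "https://iam.cloud.ibm.com/identity/token" \
        -H "Content-Type: application/x-www-form-urlencoded" \
        --data-urlencode "grant_type=urn:ibm:params:oauth:grant-type:apikey" \
        --data-urlencode "apikey=${api_key}"
}

# Read a JSON response on stdin and print the value of "access_token".
# Assumes the field is a plain string on a single line.
extract_access_token() {
    sed -n 's/.*"access_token"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}
```

For example, `TOKEN=$(request_iam_token "$API_KEY" | extract_access_token)` stores the token in a shell variable, where `API_KEY` is assumed to hold the key that you created.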

Deploying the model to RHEL-AI on {{site.data.keyword.cloud_notm}}

{: #deploy-rhel-ai}

Install RHEL-AI on {{site.data.keyword.cloud_notm}}.
Using the {{site.data.keyword.cloud_notm}} CLI, get a bearer token.
```
ibmcloud iam oauth-tokens
```
{: pre}
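
The download script sends the token as `Authorization: Bearer $BEARER_TOKEN`, so the variable should hold only the token value. The following sketch strips the label and the `Bearer` prefix, assuming the command prints a line of the form `IAM token:  Bearer <token>`. The function name `extract_bearer_token` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read the CLI output on stdin and print only the token.
# Assumes the output contains a line that starts with "IAM token:" and that
# the token is the last whitespace-separated field on that line.
extract_bearer_token() {
    awk '/^IAM token:/ {print $NF}'
}
```

For example, `BEARER_TOKEN=$(ibmcloud iam oauth-tokens | extract_bearer_token)` captures the token without copying it by hand.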

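Update the variables in the following bash script, then run it. The script lists the objects under the model prefix in your bucket and downloads each one to the model directory.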

```
#!/usr/bin/env bash
# Replace with the bearer token (the token value only, without the "Bearer " prefix)
BEARER_TOKEN="XXX"
# Replace with the Object Storage bucket name
CUSTOMER_BUCKET="XXX"
# Replace with the Object Storage endpoint
COS_ENDPOINT="https://s3.direct.us-east.cloud-object-storage.appdomain.cloud"
# Replace XXX with the model ID
MODEL_PREFIX="trained_models/XXX/model/"
# Replace with the local directory to download the model files into
MODEL_DIR="/root/model/modeltest"

# List the objects under the model prefix (the response is XML)
curl -v -G "$COS_ENDPOINT/$CUSTOMER_BUCKET" --data-urlencode "list-type=2" --data-urlencode "prefix=$MODEL_PREFIX" -H "Authorization: Bearer $BEARER_TOKEN" >/tmp/rawxml.txt
# Split the XML so that each object key starts its own line
awk '{n = split($0, a, "<Key>"); for (i = 1; i <= n; i++) print a[i]}' /tmp/rawxml.txt >/tmp/keysonnewline.txt
mkdir -p "$MODEL_DIR"
# Download every object whose key starts with "trained_models"
while read -r line; do
    if [[ "$line" != "trained_models"* ]]; then
        continue
    fi
    # Drop the closing </Key> tag and everything after it
    KEY_TO_DOWNLOAD=$(echo "$line" | awk -F '<' '{print $1}')
    FILE_NAME=$(basename "$KEY_TO_DOWNLOAD")
    curl "$COS_ENDPOINT/$CUSTOMER_BUCKET/$KEY_TO_DOWNLOAD" -H "Authorization: Bearer $BEARER_TOKEN" -o "${MODEL_DIR}/$FILE_NAME"
done </tmp/keysonnewline.txt
```

{: pre}
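
A failed request can leave an empty file or an XML error body in the model directory, because `curl` writes whatever the server returns. The following is a minimal sketch of a check to run after the download, assuming that Object Storage errors arrive as XML containing an `<Error>` element. The function name `verify_model_dir` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Hypothetical helper: return 0 if the directory exists, is not empty, and
# no file in it is empty or looks like an XML error response.
verify_model_dir() {
    local dir="$1"
    local f
    local status=0
    if [ ! -d "$dir" ]; then
        echo "missing directory: $dir" >&2
        return 1
    fi
    # An empty directory means that nothing was downloaded.
    if [ -z "$(ls -A "$dir")" ]; then
        echo "no files in $dir" >&2
        return 1
    fi
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        if [ ! -s "$f" ]; then
            echo "empty file: $f" >&2
            status=1
        elif head -c 512 "$f" | grep -q '<Error>'; then
            # Only the first 512 bytes are inspected to keep the check cheap.
            echo "looks like an error response: $f" >&2
            status=1
        fi
    done
    return "$status"
}
```

For example, run `verify_model_dir /root/model/modeltest` with the directory that you set in the script before you serve the model.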

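Then use the `ilab` commands to serve the model and chat with it. If `MODEL_DIR` is not set in your current shell, replace `$MODEL_DIR` with the model directory path from the script.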

```
ilab model serve --model-path $MODEL_DIR -- --tensor-parallel-size 1 --host 0.0.0.0 --port 8080
```

{: pre}

```
ilab model chat --endpoint-url http://localhost:8080/v1 -m $MODEL_DIR
```

{: pre}
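
The server can take some time to load the model before it accepts requests, so it can help to wait for it before you start the chat. The following is a minimal sketch of a generic retry loop. The function name `retry_until` is a hypothetical helper, and the probe URL in the usage note assumes that the served endpoint exposes an OpenAI-compatible `/v1/models` route, which is not confirmed here.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command until it succeeds.
# Usage: retry_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Returns 0 as soon as the command succeeds, or 1 after ATTEMPTS failures.
retry_until() {
    local attempts="$1"
    local delay="$2"
    local i=1
    shift 2
    while [ "$i" -le "$attempts" ]; do
        # Output of the probe command is discarded; only its exit code matters.
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
        # Sleep only if another attempt follows.
        if [ "$i" -le "$attempts" ]; then
            sleep "$delay"
        fi
    done
    return 1
}
```

For example, `retry_until 30 2 curl -sf http://localhost:8080/v1/models` probes the server up to 30 times, 2 seconds apart, and returns once a request succeeds.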
