Install modctl with go install:
$ go install github.com/CloudNativeAI/modctl@latest
Or clone the repository and build from source:
$ git clone https://github.com/CloudNativeAI/modctl.git
$ cd modctl
$ make
$ ./output/modctl -h
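If you built from source, you may want the binary on your PATH so that the commands below work as written; one quick way, assuming you are still in the repository root:
$ export PATH=$PWD/output:$PATH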
To generate a Modelfile for the model artifact in the current directory (workspace), change to the directory where the model artifact is located and run the following command; the Modelfile will be generated in that directory:
$ modctl modelfile generate .
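For example, assuming a local model checkout at ~/models/gemma-2b (a hypothetical path), the full sequence would be:
$ cd ~/models/gemma-2b
$ modctl modelfile generate .
$ cat Modelfile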
To build the model artifact, you need to prepare a Modelfile that describes the expected layout of the model artifact in your model repo.
Example of Modelfile:
# Model name (string), such as llama3-8b-instruct, gpt2-xl, qwen2-vl-72b-instruct, etc.
NAME gemma-2b
# Model architecture (string), such as transformer, cnn, rnn, etc.
ARCH transformer
# Model family (string), such as llama3, gpt2, qwen2, etc.
FAMILY gemma
# Model format (string), such as safetensors, onnx, tensorflow, pytorch, etc.
FORMAT safetensors
# Number of parameters in the model (integer).
PARAMSIZE 16
# Model precision (string), such as bf16, fp16, int8, etc.
PRECISION bf16
# Model quantization (string), such as awq, gptq, etc.
QUANTIZATION awq
# Specify the model configuration file; supports glob path patterns.
CONFIG config.json
# Specify the model configuration file; supports glob path patterns.
CONFIG generation_config.json
# Model weights; supports glob path patterns.
MODEL *.safetensors
# Specify code files; supports glob path patterns.
CODE *.py
# Specify documentation files; supports glob path patterns.
DOC *.md
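The file names below are hypothetical, but they sketch the kind of repository layout the glob patterns above would match:
config.json                        # matched by CONFIG config.json
generation_config.json             # matched by CONFIG generation_config.json
model-00001-of-00002.safetensors   # matched by MODEL *.safetensors
model-00002-of-00002.safetensors   # matched by MODEL *.safetensors
modeling_gemma.py                  # matched by CODE *.py
README.md                          # matched by DOC *.md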
Then run the following command to build the model artifact:
$ modctl build -t registry.com/models/llama3:v1.0.0 -f Modelfile .
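Note that the tag is independent of the Modelfile contents. Since the Modelfile above describes a gemma model, a more consistent invocation would be (registry and repository name are hypothetical):
$ modctl build -t example.registry.com/models/gemma-2b:v1.0.0 -f Modelfile .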
Before running the pull or push commands, you need to log in to the registry:
$ modctl login -u username -p password example.registry.com
Pull the model artifact from the registry:
$ modctl pull registry.com/models/llama3:v1.0.0
Push the model artifact to the registry:
$ modctl push registry.com/models/llama3:v1.0.0
Extract the model artifact to the specified directory:
$ modctl extract registry.com/models/llama3:v1.0.0 --output /path/to/extract
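For example, to fetch an artifact and unpack it into a local directory (the output path here is hypothetical):
$ modctl pull registry.com/models/llama3:v1.0.0
$ modctl extract registry.com/models/llama3:v1.0.0 --output ./llama3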
List the model artifacts in the local storage:
$ modctl ls
Delete the model artifact in the local storage:
$ modctl rm registry.com/models/llama3:v1.0.0
Finally, you can use the prune command to remove all unnecessary blobs and free up storage space:
$ modctl prune
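Putting it together, a typical end-to-end publish workflow looks like this sketch (the model path, registry, repository name, and credentials are all placeholders):
$ cd /path/to/model
$ modctl modelfile generate .
$ modctl login -u username -p password example.registry.com
$ modctl build -t example.registry.com/models/mymodel:v1.0.0 -f Modelfile .
$ modctl push example.registry.com/models/mymodel:v1.0.0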