A lightweight microservice that provides text generation capabilities using Hugging Face models via a RESTful API.
- Simple REST API for text generation
- Uses Hugging Face Transformers for state-of-the-art language models
- Containerized with Docker for easy deployment
- Automatic GPU detection and utilization when available
- Docker
- Build the Docker image:
docker build -t opt-125m-microservice .
- Run the container:
docker run -p 8001:8000 opt-125m-microservice
Note: If port 8001 is already in use on the host, replace 8001 (the first number in -p 8001:8000) with any available port and use that port in the URL. The container itself still listens on port 8000.
- The service will be available at http://localhost:8001
GET /
Returns the status of the service and the model being used.
Example response:
{
"status": "ok",
"model": "facebook/opt-125m"
}
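A quick way to confirm the container is up is to call this endpoint from a script. The following is a minimal sketch using only the Python standard library; the function names `parse_status` and `check_service` and the `BASE_URL` constant are illustrative and not part of the service. It assumes the host port 8001 from the Docker example above.

```python
import json
from urllib.request import urlopen

# Assumption: the service is published on host port 8001, as in the Docker example.
BASE_URL = "http://localhost:8001"


def parse_status(body: bytes) -> tuple[bool, str]:
    """Return (is_ok, model_name) from the JSON body of GET /."""
    data = json.loads(body)
    return data.get("status") == "ok", data.get("model", "")


def check_service(base_url: str = BASE_URL, timeout: float = 5.0) -> tuple[bool, str]:
    """Call GET / on a running service and interpret the response."""
    with urlopen(f"{base_url}/", timeout=timeout) as resp:
        return parse_status(resp.read())
```

With the container running, calling `check_service()` should report that the service is healthy along with the model name from the response. If the service is not reachable, `urlopen` raises `urllib.error.URLError`.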
POST /generate
Request body:
{
"prompt": "Once upon a time",
"max_new_tokens": 50
}
Parameters:
- prompt: The input text to generate from
- max_new_tokens: Maximum number of tokens to generate (default: 50)
Example response:
{
"generated_text": "Once upon a time, there was a young princess who lived in a castle..."
}
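The request and response shapes above can be exercised from Python. This is a hedged sketch using only the standard library, not an official client: the helper names `build_generate_request` and `generate` are hypothetical, the base URL assumes the host port 8001 from the Docker example, and sending the body with a `Content-Type: application/json` header is an assumption based on the JSON request body shown above.

```python
import json
from urllib.request import Request, urlopen

# Assumption: the service is published on host port 8001, as in the Docker example.
BASE_URL = "http://localhost:8001"


def build_generate_request(
    prompt: str, max_new_tokens: int = 50, base_url: str = BASE_URL
) -> Request:
    """Build a POST /generate request carrying the documented JSON body."""
    payload = json.dumps(
        {"prompt": prompt, "max_new_tokens": max_new_tokens}
    ).encode("utf-8")
    return Request(
        f"{base_url}/generate",
        data=payload,
        # Assumption: the service expects a JSON request body.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(
    prompt: str,
    max_new_tokens: int = 50,
    base_url: str = BASE_URL,
    timeout: float = 60.0,
) -> str:
    """Send the request and return the "generated_text" field of the response."""
    req = build_generate_request(prompt, max_new_tokens, base_url)
    with urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read())["generated_text"]
```

With the service running, `generate("Once upon a time", max_new_tokens=50)` returns the generated string. Building the request in a separate function keeps the payload easy to inspect without a running server.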
The service reads its configuration from environment variables:
- None are required for the default public model
- Clone the repository
- Install dependencies:
pip install -r requirements.txt
- Run the service:
uvicorn main:app --host 0.0.0.0 --port 8000
MIT