Detector Algorithms for the FMS Guardrails Orchestrator

FMS Guardrails Orchestrator is an open-source project led by IBM that provides a server for invoking detectors on text generation inputs and outputs, as well as for running standalone detections.

This repository provides a collection of detector algorithms and microservices supported by the TrustyAI team.

Detectors

At the moment, the following detectors are supported:

  • huggingface -- a generic detector class intended to be compatible with any AutoModelForSequenceClassification, as well as with a specific kind of AutoModelForCausalLM, namely GraniteForCausalLM. This detector exposes /api/v1/text/contents and can therefore be configured as a detector of type: text_contents within the FMS Guardrails Orchestrator framework (a configuration sketch follows this list). It is also intended to be deployed as a KServe inference service.
  • llm_judge -- integrates the vLLM Judge library to provide an LLM-as-a-judge guardrailing architecture.
  • builtIn -- small, lightweight detection functions that are deployed out of the box alongside the Guardrails Orchestrator. The built-in detectors provide a number of heuristic or algorithmic detection functions, such as:
    • Regex-based detections, with pre-written regexes for flagging various Personally Identifiable Information (PII) items such as emails or phone numbers, as well as the ability to provide custom regexes
    • File-type validations, for verifying whether model input/output is valid JSON, XML, or YAML
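
For illustration, a detector built from one of these images might be registered in the orchestrator's detectors configuration roughly as follows. This is a hedged sketch, not an authoritative configuration: the detector name, hostname, port, and threshold are placeholders, and field names such as chunker_id and default_threshold should be verified against the FMS Guardrails Orchestrator documentation for the version in use.

  # Hypothetical fragment of an FMS Guardrails Orchestrator config -- all values are placeholders
  detectors:
    hf-toxicity:
      type: text_contents
      service:
        hostname: hf-detector.my-namespace.svc.cluster.local
        port: 8000
      chunker_id: whole_doc_chunker
      default_threshold: 0.5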

Building

To build the detector images, use the following commands:

  • huggingface: podman build -t $TAG -f detectors/Dockerfile.hf detectors
  • llm_judge: podman build -t $TAG -f detectors/Dockerfile.judge detectors
  • builtIn: podman build -t $TAG -f detectors/Dockerfile.builtIn detectors

Replace $TAG with your desired image tag (e.g., my-detector:latest).
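
For example, to build the Hugging Face detector image from the repository root and push it to a registry so it can be pulled by a cluster, you might do the following (the image name is only a placeholder; substitute your own registry, organization, and tag):

  # Build the Hugging Face detector image (run from the repository root)
  podman build -t quay.io/<your-org>/hf-detector:latest -f detectors/Dockerfile.hf detectors

  # Push it to a registry so it can be deployed, e.g. as a KServe inference service
  podman push quay.io/<your-org>/hf-detector:latest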

Running locally

Quick Start Commands

  • builtIn (ready to use): podman run -p 8080:8080 $BUILT_IN_IMAGE
  • huggingface (requires a model download; see the worked example after this list): podman run -p 8000:8000 -e MODEL_DIR=/mnt/models/$MODEL_NAME -v $MODEL_PATH:/mnt/models/$MODEL_NAME:Z $HF_IMAGE
  • llm_judge (requires an OpenAI-compatible LLM server): podman run -p 8000:8000 -e VLLM_BASE_URL=$LLM_SERVER_URL $LLM_JUDGE_IMAGE
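
As a concrete illustration of the Hugging Face quick-start command above, the detector can be pointed at a sequence-classification model downloaded from the Hugging Face Hub. The model name below is only an example (any compatible AutoModelForSequenceClassification checkpoint should work similarly), and huggingface-cli is provided by the huggingface_hub package:

  # Download an example prompt-injection classifier (model name is illustrative)
  huggingface-cli download protectai/deberta-v3-base-prompt-injection-v2 --local-dir ./models/prompt-injection

  # Serve it with the Hugging Face detector image on port 8000
  podman run -p 8000:8000 \
    -e MODEL_DIR=/mnt/models/prompt-injection \
    -v $(pwd)/models/prompt-injection:/mnt/models/prompt-injection:Z \
    $HF_IMAGE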

Detailed Setup Instructions & Examples

  • Built-in detector: No additional setup required. Check out the built-in detector examples to see how to use the built-in detectors for file-type validation and personally identifiable information (PII) detection (a sample request is sketched after this list).
  • Hugging Face detector: Check out the Hugging Face detector examples for a complete setup and examples of how to use the Hugging Face detectors to detect toxic content and prompt injection.
  • LLM Judge detector: Check out the LLM Judge detector examples for a complete setup and examples of how to use any OpenAI API-compatible LLM for content assessment with built-in metrics and custom natural-language criteria.
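
For instance, once the built-in detector is running on port 8080 (see the quick-start command above), a PII detection request could look roughly like the following. The /api/v1/text/contents endpoint and the contents/detector_params fields follow the IBM Detector API, but the detector-id value and the detector_params shown here are assumptions for illustration; see the built-in detector examples for the exact names:

  # detector-id and detector_params below are illustrative, not authoritative
  curl -s -X POST http://localhost:8080/api/v1/text/contents \
    -H "Content-Type: application/json" \
    -H "detector-id: regex" \
    -d '{"contents": ["My email address is test@example.com"], "detector_params": {"regex": ["email"]}}'

Per the Detector API, a successful response contains a list of detections for each input string, with fields such as start, end, text, detection_type, and score.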

API

See IBM Detector API

License

This project is licensed under the Apache License Version 2.0 - see the LICENSE file for details.
