PrivacyBench

This repository is dedicated to benchmarking legal and privacy-related performance of generative AI models and is used to enable appropriate and effective use of AI to assist with legal and compliance matters.

Purpose of PrivacyBench

Existing industry benchmarks, such as the , provide strong indications of linguistic understanding but are not specifically tuned to measure legal, privacy, and compliance tasks.
Existing benchmarks are not adequately specific to legal, privacy and compliance tasks.

Proposed Solution

Develop a testing method for benchmarking performance in personal data redaction.
Develop and report on LLM performance.
Identify, in particular, LLM models that can be deployed locally and efficiently for maximum privacy and security and lowest cost.
Encourage the community development of better tools through benchmarking.

Call for contributions

This repository is open-sourced under MIT license and the code and testing process is free to use with appropriate credit attribution (subject to third-party licenses).

Specific Tasks

The first use case selected to be benchmarked and tested is personal data detection and redaction; please see this task in this repo for additional details.

Trademark

PrivacyBench is a trademark of Alex J. Wall.

About the models referenced in result tables

The benchmark code, questions, methodology, and per-question result rows in this repository are MIT-licensed (see LICENSE).
Specific models appearing in those result tables are owned by their respective publishers and licensed separately. The caiioo-research/* models published by Six Cailloux, LLC. (the company that maintains this benchmark) are proprietary and not currently available for download — they appear in result tables as reference points, not as openly redistributable artifacts.
See MODELS.md for the full list of models referenced, where each can be obtained, what license each falls under, and which are proprietary.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

PrivacyBench

FilesExpand file tree

README.md

Latest commit

History

README.md

File metadata and controls

PrivacyBench