Caiioo-ai/privacybench

PrivacyBench

This repository benchmarks how generative AI models perform on legal and privacy-related tasks, with the goal of enabling appropriate and effective use of AI in legal and compliance work.

Purpose of PrivacyBench

  • Existing industry benchmarks, such as MMLU, provide strong indications of linguistic understanding but are not specific enough to measure performance on legal, privacy, and compliance tasks.

Proposed Solution

  • Develop a testing method for benchmarking performance in personal data redaction.
  • Measure LLM performance on these tasks and report the results.
  • Identify, in particular, LLMs that can be deployed locally and efficiently, for maximum privacy and security at the lowest cost.
  • Encourage the community development of better tools through benchmarking.

Call for contributions

  • This repository is open source under the MIT license; the code and testing process are free to use with appropriate attribution (subject to third-party licenses).

Specific Tasks

  • The first use case selected for benchmarking is personal data detection and redaction; see that task in this repo for additional details.
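
This section does not describe how a redaction result is scored, so the following is only a minimal sketch of one common approach: exact-match precision, recall, and F1 over labeled character spans. The `Span` type, the `score_redaction` function, the label names, and the sample sentence are all hypothetical and are not taken from this repository's code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """Half-open character range [start, end) marking personal data (hypothetical type)."""
    start: int
    end: int
    label: str


def score_redaction(expected: set[Span], predicted: set[Span]) -> dict[str, float]:
    """Exact-match precision, recall, and F1 over labeled spans (assumed metric)."""
    true_positives = len(expected & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(expected) if expected else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Illustrative sample only: one name and one email address to be redacted.
text = "Contact Jane Doe at jane@example.com."
expected = {Span(8, 16, "NAME"), Span(20, 36, "EMAIL")}

# A hypothetical model output: finds the name, misses the email,
# and wrongly flags the word "Contact" as a name.
predicted = {Span(8, 16, "NAME"), Span(0, 7, "NAME")}

scores = score_redaction(expected, predicted)
print(scores)
```

Exact matching is the strictest choice; a benchmark could instead give partial credit for overlapping spans, which matters when a model redacts only part of a name or address.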

Trademark

  • PrivacyBench is a trademark of Alex J. Wall.

About the models referenced in result tables

  • The benchmark code, questions, methodology, and per-question result rows in this repository are MIT-licensed (see LICENSE).
  • Specific models appearing in those result tables are owned by their respective publishers and licensed separately. The caiioo-research/* models published by Six Cailloux, LLC. (the company that maintains this benchmark) are proprietary and not currently available for download — they appear in result tables as reference points, not as openly redistributable artifacts.
  • See MODELS.md for the full list of models referenced, where each can be obtained, what license each falls under, and which are proprietary.
