Integrate GPT OSS Safeguard into ThreatExchange as chat command #1929

Open
ZhiyLiu wants to merge 9 commits into facebook:main from ZhiyLiu:integrate_gpt_oss_safeguard

ZhiyLiu commented Feb 4, 2026

No description provided.

ZhiyLiu requested a review from Dcallies as a code owner February 4, 2026 02:52
meta-cla bot added the CLA Signed label Feb 4, 2026
github-actions bot added the python-threatexchange label (Items related to the threatexchange python tool / library) Feb 4, 2026

Dcallies (Contributor) left a comment

Let's chat offline, we should be using the classifier interface for this instead.

ZhiyLiu requested a review from Dcallies February 4, 2026 19:05
2. moved the policy into classifier folder
3. merged the chat functionality into classify command

ZhiyLiu (Author) commented Feb 4, 2026

Test Plan:
Export the OpenAI API key as an environment variable, then run:

threatexchange classify modapi -s 'I hate you'

Expected result:
{
  "parsed": {
    "action": "allow",
    "category": "none",
    "confidence": 0.97,
    "label": "ALLOW",
    "rationale": "The statement is a simple expression of dislike without targeting protected groups or containing disallowed content."
  },
  "raw_text": "{\"label\":\"ALLOW\",\"action\":\"allow\",\"category\":\"none\",\"confidence\":0.97,\"rationale\":\"The statement is a simple expression of dislike without targeting protected groups or containing disallowed content.\"}"
}
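
To make the expected result above checkable rather than eyeballed, here is a minimal sketch of a validator for that output shape. It assumes the command prints a single JSON object with a `parsed` dict and a `raw_text` string that holds the same reply serialized as JSON. The helper name `check_classify_output` and the `EXPECTED_KEYS` set are hypothetical and are not part of python-threatexchange; the sample values are copied from the expected result.

```python
import json

# Hypothetical helper for the test plan; not part of python-threatexchange.
# Keys taken from the "parsed" object in the expected result above.
EXPECTED_KEYS = {"action", "category", "confidence", "label", "rationale"}


def check_classify_output(stdout_text: str) -> dict:
    """Validate output shaped like the expected result and return 'parsed'."""
    result = json.loads(stdout_text)
    parsed = result["parsed"]

    missing = EXPECTED_KEYS - parsed.keys()
    if missing:
        raise ValueError(f"missing keys in 'parsed': {sorted(missing)}")

    if not 0.0 <= parsed["confidence"] <= 1.0:
        raise ValueError("confidence outside [0, 1]")

    # Assumption: raw_text is the reply serialized as a JSON string,
    # so decoding it should give back the same fields as 'parsed'.
    if json.loads(result["raw_text"]) != parsed:
        raise ValueError("'raw_text' does not decode to 'parsed'")

    return parsed


# Sample rebuilt from the expected result in the test plan.
sample_parsed = {
    "action": "allow",
    "category": "none",
    "confidence": 0.97,
    "label": "ALLOW",
    "rationale": (
        "The statement is a simple expression of dislike without targeting "
        "protected groups or containing disallowed content."
    ),
}
sample = json.dumps({"parsed": sample_parsed, "raw_text": json.dumps(sample_parsed)})

print(check_classify_output(sample)["label"])  # prints ALLOW
```

Since the `confidence` value and the `rationale` wording come from a model and may vary between runs, a check like this should probably assert on structure and on `label`/`action` only, not on the exact text.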

ZhiyLiu requested a review from Dcallies February 4, 2026 21:08

Dcallies (Contributor) commented Feb 9, 2026

Hey @ZhiyLiu, I'm a bit backlogged, but I will come back to this when I have free time to try and clean it up for merge!
