Single 120b model or many 20b models? #46
turt2live
started this conversation in
gpt-oss-safeguard Implementation
Replies: 1 comment 1 reply
hey hey - Overall, the 20B is well suited for your use case, especially since you want a lot of room to scale (60x peak). The only thing I'd recommend is to define your policy in a more well-defined way, i.e. have clear cases of what's acceptable vs. what isn't, instead of overly broad language! From a cost perspective it's much more efficient to deploy the 20B; you can always scale up to 120B if the current model can't handle your safety complexities.
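To make the "clear cases of what's acceptable vs. what isn't" suggestion concrete, here's a minimal sketch of structuring the policy as an explicit system prompt for an OpenAI-compatible chat endpoint (e.g. vLLM serving the 20B). The policy text, labels, model name, and request shape are illustrative assumptions, not the project's actual policy:

```python
# Hypothetical sketch: classify a chat message against an explicit policy.
# Policy text, labels, and model name are illustrative assumptions.

POLICY = """\
# Room Content Policy

## ALLOWED (label: 0)
- Ordinary conversation, questions, and technical discussion.
- Criticism of ideas or software, even if heated.

## DISALLOWED (label: 1)
- Direct threats or harassment targeting a person.
- Instructions for clearly illegal activity.

Return only the label (0 or 1).
"""

def build_request(message: str, model: str = "gpt-oss-safeguard-20b") -> dict:
    """Build a chat-completions payload: the policy goes in the system
    prompt, and the message to classify is the user turn."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": POLICY},
            {"role": "user", "content": message},
        ],
        "temperature": 0.0,  # deterministic labels for moderation
    }

payload = build_request("let's grab coffee after the meetup")
print(payload["messages"][0]["role"])  # the system turn carries the policy
```

The enumerated ALLOWED/DISALLOWED cases with a fixed label format are what keep the model from having to interpret overly broad language.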
Hey all, we're continuing to experiment with gpt-oss-safeguard and have near-zero experience with running models ourselves. Our use case is to deploy the model in an online chat scenario as an evaluator of message content when our other filters aren't sure whether content is allowable in a room. We'd be starting with a few rooms at first, but looking to expand to more rooms as we build confidence in safeguard's accuracy (and our policy's content). This would be less than 1Hz of traffic hitting the model at first, but would grow to about 3-5Hz over time (possibly 60Hz+ at full scale).
It's our understanding that the 120b model is more accurate than the 20b model but has higher latency, though we're not sure what that means in practice. We're likely only going to get a single GPU to experiment with at first, so the question is: should we start our experimentation with a single 120b model, or deploy a few 20b models on that GPU?
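For context, a back-of-envelope sketch of the token throughput those request rates imply (the per-request token counts here are guesses, not measurements):

```python
# Rough capacity arithmetic; per-request token counts are guessed assumptions.

def tokens_per_second(req_per_s: float, prompt_tokens: int, output_tokens: int) -> float:
    """Total tokens/s the server must process at a given request rate."""
    return req_per_s * (prompt_tokens + output_tokens)

# Assume ~1500 prompt tokens (policy + message) and ~200 output tokens
# (reasoning + label) per moderation request.
for rate in (1, 5, 60):
    print(f"{rate} req/s -> {tokens_per_second(rate, 1500, 200)} tokens/s")
```

Whichever model fits the token throughput at the target rate, within acceptable per-request latency, would drive the 20b-vs-120b choice.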
We're moderately inclined to try deploying multiple 20b models, but would like an informed opinion before we start pressing buttons :)
If interested, our use of safeguard feels pretty standard, though we're embedding it in a highly domain-specific area: matrix-org/policyserv#59