Lots of hallucinations? #2

joshmouch · 2025-01-18T05:53:04Z

I know this is a simple demo, but I was expecting fewer hallucinations and something I could hypothetically use in a non-demo app.

It randomly switches to other languages. It thought I spelled Jose when I said J-o-s-h, and Mr. Butte when I spelled b-u-t-t. ;) Phone numbers are incorrectly formatted. And most tool calls never get used: I asked to be signed up for a promotion because I saw there was a tool for it, but I never got it to call the tool.

Is this maybe holding out for o1 or o3 before it would work in a production scenario? Or is it expected that the models should be fine-tuned in a real usage scenario?

I'm wondering if, in practice, some fine-tuning needs to occur before this is used. But if that's the case, then I think the training instructions and expectations should be included with the demo. Maybe also a way to evaluate how well the agents are working?
