Lots of hallucinations? #2

Open
joshmouch opened this issue Jan 18, 2025 · 0 comments

joshmouch commented Jan 18, 2025

I know this is a simple demo, but I was expecting fewer hallucinations and something I could hypothetically use in a non-demo app.

Random switching to other languages. Thinking I spelled "Jose" when I said J-o-s-h. "Mr. Butte" when I spelled b-u-t-t. ;) Phone numbers incorrectly formatted. And most tool calls never got used: for example, I asked to be signed up for a promotion because I saw there was a tool for it, but I never got it to call the tool.

Is this maybe holding out for o1 or o3 before it would work in a production scenario? Or is it expected that the models would be fine-tuned in a real usage scenario?

I'm wondering whether, in practice, some fine-tuning needs to happen before this is used. If that's the case, I think the training instructions and expectations should be included with the demo. Maybe also a way to evaluate how well the agents are working?
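
To make the evaluation idea concrete, here is a rough sketch of the kind of check I have in mind. Everything in it is hypothetical: the `Turn` record, the `signup_for_promotion` tool name, and the `tool_call_rate` helper are made up for illustration and are not taken from this repo. It only assumes you can log which tools the agent called for each scripted request.

```python
from dataclasses import dataclass, field


@dataclass
class Turn:
    """One scripted test case: what the user asked and what the agent did."""
    user_request: str
    expected_tool: str  # tool the agent should have called (hypothetical name)
    called_tools: list[str] = field(default_factory=list)  # tools it actually called


def tool_call_rate(turns: list[Turn]) -> float:
    """Return the fraction of turns in which the expected tool was called."""
    if not turns:
        return 0.0
    hits = sum(1 for t in turns if t.expected_tool in t.called_tools)
    return hits / len(turns)


# Hypothetical logged results; real ones would come from recorded sessions.
turns = [
    Turn("Sign me up for the promotion", "signup_for_promotion", []),
    Turn("Sign me up for the promotion", "signup_for_promotion",
         ["signup_for_promotion"]),
]
print(f"tool-call rate: {tool_call_rate(turns):.0%}")
```

Something similar could presumably be done for name spelling or phone number formatting by comparing what the agent read back against the expected string.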
