Any way to set larger context length when using Ollama models? #5254
rylativity started this conversation in General
I have successfully set up AutoGen to use Ollama and create agents and teams. I am currently using Ollama as the inference engine behind a MagenticOneGroupChat team, and it is working well.
However, when I look at the Ollama model's resource usage, it is clear (based on RAM/vRAM consumption) that it is only using the default context length of 2048 tokens, even though the relevant context grows far beyond that limit as the MagenticOne team continues to work and produce additional output. No error is thrown, but I have seen a couple of instances where the team attempts to perform tasks it has already completed further up in the team "chat", which (combined with the RAM/vRAM evidence) leads me to believe that any context beyond 2048 tokens is being silently truncated.
Is there a way to set an explicit, larger context length when instantiating an OpenAIChatCompletionClient, Agent, or Team? (Relevant documentation for using Ollama with AutoGen here.)