Any way to set larger context length when using Ollama models? #5254
rylativity started this conversation in General
I have successfully set up AutoGen to use Ollama and create agents and teams. I am currently using Ollama as the inference engine behind a MagenticOneGroupChat team, and it is working well.
However, when I look at the Ollama model's resource usage, it is clear (based on RAM/vRAM consumption) that it is only using the default context length of 2048 tokens, even though the relevant context grows far beyond that limit as the MagenticOne team continues to work and produce additional output. No error is thrown, but I have seen a couple of instances where the team attempts to perform tasks it has already completed further up in the team "chat", which (combined with the RAM/vRAM evidence) leads me to believe that any context beyond 2048 tokens is being silently truncated.
Is there a way to set an explicit, larger context length when instantiating an OpenAIChatCompletionClient, Agent, or Team? (Relevant documentation for using Ollama with AutoGen here.)