
feature: Support new OpenAI o1 reasoning models #575

Closed
abatilo opened this issue Sep 12, 2024 · 11 comments
Labels
enhancement New feature or request

Comments

@abatilo
Contributor

abatilo commented Sep 12, 2024

Feature request

I would like it if Avante could prompt against the OAI reasoning models.

Motivation

These reasoning models are allegedly more capable at coding.

Other

There are a few differences in the API now:

  1. temperature must be set to 1 if you're using either the o1-preview or o1-mini models.
  2. max_tokens is not used; max_completion_tokens must be used in the config instead.
  3. These models do not support a system role/message, so I think we need to drop the system message from parse_message.
  4. Streaming is not supported, so I believe we need to remove stream = true from parse_curl_args; a sketch of the resulting request body follows this list.
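
Putting those four changes together, the request body for o1 would look roughly like this sketch. It's illustrative only: the table shape follows what parse_curl_args builds, but the model name and the user_content value are placeholders, not avante's actual code.

local user_content = "Explain this function" -- placeholder prompt, not avante's real message building

local body = {
  model = "o1-preview",
  messages = {
    -- no { role = "system", ... } entry: o1 rejects system messages
    { role = "user", content = user_content },
  },
  temperature = 1,              -- must be exactly 1 for o1-preview / o1-mini
  max_completion_tokens = 4096, -- replaces max_tokens
  -- note: no stream = true, since streaming is not supported
}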

Even with these changes, I'm not getting a successful end-to-end flow. I've never contributed to the avante.nvim codebase, and I'm not entirely sure what else to try at the moment to get things working.

So far, my total diff looks like this:

diff --git a/lua/avante/config.lua b/lua/avante/config.lua
index c1689d7..82d3477 100644
--- a/lua/avante/config.lua
+++ b/lua/avante/config.lua
@@ -30,8 +30,8 @@ You are an excellent programming expert.
     endpoint = "https://api.openai.com/v1",
     model = "gpt-4o",
     timeout = 30000, -- Timeout in milliseconds
-    temperature = 0,
-    max_tokens = 4096,
+    temperature = 1,
+    max_completion_tokens = 4096,
     ["local"] = false,
   },
   ---@type AvanteSupportedProvider
diff --git a/lua/avante/providers/openai.lua b/lua/avante/providers/openai.lua
index 52e62b1..888d466 100644
--- a/lua/avante/providers/openai.lua
+++ b/lua/avante/providers/openai.lua
@@ -51,7 +51,6 @@ M.parse_message = function(opts)
   end

   return {
-    { role = "system", content = opts.system_prompt },
     { role = "user", content = user_content },
   }
 end
@@ -91,7 +90,6 @@ M.parse_curl_args = function(provider, code_opts)
     body = vim.tbl_deep_extend("force", {
       model = base.model,
       messages = M.parse_message(code_opts),
-      stream = true,
     }, body_opts),
   }
 end

I've gotten this far by trying the openai provider, watching it fail, and reading the error messages it returned. This time, though, it's not returning anything: I see Generating response ... and it never changes.

@abatilo abatilo added the enhancement New feature or request label Sep 12, 2024
@Alextibtab

I assume they'll open it up in the near future, but currently you need to be at usage tier 5 to use the reasoning models via the API (https://platform.openai.com/docs/guides/rate-limits/usage-tiers?context=tier-five), meaning you have to have bought $1000 in tokens.

@cfcosta

cfcosta commented Sep 13, 2024

@Alextibtab if you need to test anything, I have this level of API access.

@aarnphm
Collaborator

aarnphm commented Sep 13, 2024

lmao if you have tier 5 then feel free to use it. For now we have to wait till GA for API usage.

btw the API costs would be pretty high for o1 from my testing.

@oskarpyk

@abatilo I have access and just tried your diff; I'm experiencing the same freeze at Generating response... unfortunately. Presumably there's some deeper logic in Avante that conflicts with o1's non-streamed response mechanic?

@aarnphm
Collaborator

aarnphm commented Sep 25, 2024

I mean they haven't even published the o1 API yet, so there's nothing we can do for now.

I suspect it is still streaming; it's just that we need to figure out how to display the CoT reasoning.

@abatilo
Contributor Author

abatilo commented Sep 25, 2024

@aarnphm There might be a misunderstanding. There is an API but it's closed to certain tiers. It's mostly the same as the current chat completions endpoint, but it doesn't support streaming. According to OAI, they don't plan on returning the full CoT tokens, unless maybe they've changed their mind.

I think we would mostly just need to add a code path for handling non-streaming responses to make this work; something roughly like the sketch below.
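
A minimal sketch of that non-streaming path, assuming the provider hands the full response body to a completion callback once curl finishes (the function and callback names here are hypothetical, not avante's actual API):

local function parse_response_without_streaming(body, on_complete)
  -- Non-streaming responses arrive as one JSON document instead of SSE chunks.
  local ok, decoded = pcall(vim.json.decode, body)
  if not ok or not decoded.choices or not decoded.choices[1] then
    on_complete(nil, "unexpected response: " .. tostring(body))
    return
  end
  -- The whole answer lives in choices[1].message.content.
  on_complete(decoded.choices[1].message.content, nil)
end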

@aarnphm
Collaborator

aarnphm commented Sep 25, 2024

> There is an API but it's closed to certain tiers.

Yes, the API is open for tier 5 and up. What I'm referring to is that their API reference has yet to include examples for o1 or to specify the data type used for SSE during CoT.

I don't think they will ever publish the CoT tokens (that is apparently their moat). But that is irrelevant in this case.

The chat after the CoT is still streaming, afaict.

@LessComplexity
Contributor

I've added a pull request that adds o1 support without interfering with other models, and it also solves the response-hanging issue people experienced in this thread.
It's waiting for review and approval so that people with tier 5 API access can enjoy it too :)

Have an awesome day guys <3

@oskarpyk

Well done @LessComplexity! Fantastic work.

@LessComplexity
Contributor

@aarnphm
This issue can be closed, as my commit already adds o1 model support :)

@aarnphm
Collaborator

aarnphm commented Sep 28, 2024

thanks

@aarnphm aarnphm closed this as completed Sep 28, 2024