
Temperature setting not applied correctly when using OpenAI Compatible Provider with LocalLLM URL #7187

@christopherowen

Description

App Version

3.25.17

API Provider

OpenAI Compatible

Model Used

N/A

Roo Code Task Links (Optional)

When configuring RooCode with an OpenAI Compatible Provider pointing at a LocalLLM (LiteLLM → vLLM stack), the behavior of the “Use custom temperature” option in the Provider → Advanced settings does not match expectations.

  • If the box is unchecked, all requests are sent with temperature: 0.0 (greedy decoding).
  • If the box is checked, RooCode uses the UI-selected value (e.g. 0.7).

This effectively means that leaving the box unchecked forces temperature=0.0, instead of passing through the model/provider’s default temperature (e.g. LiteLLM’s configured defaults).
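For context, here is a minimal sketch of the difference (using the official openai npm client; this is illustrative only and not RooCode's actual request code — the endpoint, key, and model alias are taken from the curl example further below):

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://ai.stackq.com/v1", // LiteLLM endpoint from this setup
  apiKey: "sk-xyz",
});

// What the unchecked box currently amounts to: an explicit temperature field,
// even at 0.0, overrides whatever default LiteLLM/vLLM are configured with.
await client.chat.completions.create({
  model: "coder_api",
  messages: [{ role: "user", content: "Say hi" }],
  temperature: 0.0,
});

// What an unchecked box would ideally amount to: no temperature field at all,
// so the backend default (temperature=0.3 from the LiteLLM alias) applies.
await client.chat.completions.create({
  model: "coder_api",
  messages: [{ role: "user", content: "Say hi" }],
});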

🔁 Steps to Reproduce

  1. Set up RooCode with:
  • Provider: OpenAI Compatible
  • LocalLLM URL: pointing to LiteLLM, which forwards to vLLM with Qwen3-Coder.
  • LiteLLM alias configured with a default temperature: 0.3.
  2. In RooCode, leave “Use custom temperature” unchecked.
  • Send a request.
  • Observe vLLM logs show temperature=0.0.
    vllm logs:
params: SamplingParams(n=1, presence_penalty=0.0, frequency_penalty=0.0, repetition_penalty=1.05, temperature=0.0, top_p=1.0, top_k=0, min_p=0.0, seed=None, stop=[], stop_token_ids=[], bad_words=[], include_stop_str_in_output=False, ignore_eos=False, max_tokens=18577, min_tokens=0, logprobs=None, prompt_logprobs=None, skip_special_tokens=True, spaces_between_special_tokens=True, truncate_prompt_tokens=None, guided_decoding=None, extra_args=None), prompt_token_ids: None, prompt_embeds shape: None, lora_request: None.
  3. In RooCode, check “Use custom temperature” and set the slider to 0.7.
  • Send a request.
  • Observe vLLM logs show temperature=0.7.
    vllm logs:
params: SamplingParams(n=1, presence_penalty=0.0, frequency_penalty=0.0, repetition_penalty=1.05, temperature=0.7, top_p=0.95, top_k=20, min_p=0.0, seed=None, stop=[], stop_token_ids=[], bad_words=[], include_stop_str_in_output=False, ignore_eos=False, max_tokens=17875, min_tokens=0, logprobs=None, prompt_logprobs=None, skip_special_tokens=True, spaces_between_special_tokens=True, truncate_prompt_tokens=None, guided_decoding=None, extra_args=None), prompt_token_ids: None, prompt_embeds shape: None, lora_request: None.
  4. Send a request directly to LiteLLM with curl (bypassing RooCode).
  • Observe vLLM logs show the correct default temperature=0.3.
❯ curl https://ai.stackq.com/v1/chat/completions   -H "Content-Type: application/json"   -H "Authorization: Bearer sk-xyz"   -d '{
    "model":"coder_api",
    "messages":[{"role":"user","content":"Say hi"}]
  }'

vllm logs:

params: SamplingParams(n=1, presence_penalty=0.0, frequency_penalty=0.0, repetition_penalty=1.05, temperature=0.3, top_p=0.95, top_k=20, min_p=0.0, seed=None, stop=[], stop_token_ids=[], bad_words=[], include_stop_str_in_output=False, ignore_eos=False, max_tokens=32758, min_tokens=0, logprobs=None, prompt_logprobs=None, skip_special_tokens=True, spaces_between_special_tokens=True, truncate_prompt_tokens=None, guided_decoding=None, extra_args=None), prompt_token_ids: None, prompt_embeds shape: None, lora_request: None.

💥 Outcome Summary

Expected Behavior

  • When “Use custom temperature” is unchecked, RooCode should not send any temperature field in the request payload.
  • This would allow the backend (LiteLLM, vLLM, or model defaults) to control temperature.

Actual Behavior

  • RooCode always sends temperature: 0.0 when the box is unchecked, which overrides backend defaults and forces greedy decoding.

Impact

  • Causes confusion when backend defaults (LiteLLM/vLLM) are configured, since RooCode silently overrides them with 0.0.
  • Forces greedy decoding, so responses are more deterministic (and often more brittle) than the backend’s configured sampling would produce.
  • Users who expect model default sampling settings never see them unless they manually enable “Use custom temperature.”

Proposed Fix

  • When “Use custom temperature” is unchecked, RooCode should omit the temperature parameter in API requests.
  • Only include temperature when the option is explicitly checked and set by the user (a sketch of this conditional follows below).
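
A minimal sketch of that conditional, assuming hypothetical setting names (useCustomTemperature, customTemperature) rather than RooCode's actual fields:

// Hypothetical provider settings shape; field names are assumptions for illustration.
interface ProviderSettings {
  useCustomTemperature: boolean;
  customTemperature?: number;
}

// Returns either an object containing a temperature, or an empty object,
// so that spreading it into the request body omits the key entirely.
function buildSamplingParams(settings: ProviderSettings): { temperature?: number } {
  return settings.useCustomTemperature && settings.customTemperature !== undefined
    ? { temperature: settings.customTemperature }
    : {};
}

// Usage: unchecked box → no temperature key in the payload at all,
// so LiteLLM's alias default (0.3 in this setup) is what vLLM sees.
const body = {
  model: "coder_api",
  messages: [{ role: "user", content: "Say hi" }],
  ...buildSamplingParams({ useCustomTemperature: false }),
};

The point is that the key is absent from the serialized JSON entirely, which is what lets the backend default take effect.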

📄 Relevant Logs or Errors (Optional)

.env:

MODEL="cpatonn/Qwen3-30B-A3B-Instruct-2507-AWQ-4bit"
MODEL_ALIAS="coder"
MODEL_CONTEXT_SIZE="32K"
MODEL_MAX_REQUESTS="2"
MODEL_MAX_TOKENS="3K"
MODEL_KVCACHE_TYPE="auto"
MODEL_GPU_UTIL="0.94"
MODEL_TOOL_TYPE="qwen3_coder"


litellm config:

model_list:
  - model_name: ${MODEL_ALIAS}
    litellm_params:
      model: openai/${MODEL}
      api_base: http://vllm:8000/v1
      api_key: dummy

  - model_name: ${MODEL_ALIAS}_api
    litellm_params:
      model: openai/${MODEL}
      api_base: http://vllm:8000/v1
      api_key: dummy
      temperature: 0.3
      top_p: 0.95
      repetition_penalty: 1.05

general_settings:
  user_header_name: X-OpenWebUI-User-Email


vllm startup:

      --model ${MODEL}
      --host 0.0.0.0
      --port 8000
      --download-dir /models
      --gpu-memory-utilization ${MODEL_GPU_UTIL}
      --tensor-parallel-size 1
      --dtype float16
      --kv-cache-dtype ${MODEL_KVCACHE_TYPE}
      --enable-chunked-prefill
      --max-model-len ${MODEL_CONTEXT_SIZE}
      --max-num-seqs ${MODEL_MAX_REQUESTS}
      --max-num-batched-tokens ${MODEL_MAX_TOKENS}
      --trust-remote-code
      --disable-uvicorn-access-log
      --uvicorn-log-level warning
      --enable-auto-tool-choice
      --tool-call-parser ${MODEL_TOOL_TYPE}
