Commit aad8630

Update 2025-10-14 05:59:39
1 parent 1c31ca5 commit aad8630

45 files changed: 4898 additions & 5503 deletions

_sources/advanced_features/lora.ipynb

Lines changed: 183 additions & 148 deletions
Large diffs are not rendered by default.

_sources/advanced_features/separate_reasoning.ipynb

Lines changed: 110 additions & 106 deletions
Large diffs are not rendered by default.

_sources/advanced_features/server_arguments.md

Lines changed: 1 addition & 0 deletions
@@ -213,6 +213,7 @@ Please consult the documentation below and [server_args.py](https://github.com/s
 | `--lora-paths` | The list of LoRA adapters to load. Each adapter must be specified in one of the following formats: <PATH> | <NAME>=<PATH> | JSON with schema {"lora_name":str,"lora_path":str,"pinned":bool} | None |
 | `--max-loras-per-batch` | Maximum number of adapters for a running batch, include base-only request. | 8 |
 | `--max-loaded-loras` | If specified, it limits the maximum number of LoRA adapters loaded in CPU memory at a time. The value must be greater than or equal to `--max-loras-per-batch`. | None |
+| `--lora-eviction-policy` | LoRA adapter eviction policy when GPU memory pool is full. `lru`: Least Recently Used (better cache efficiency). `fifo`: First-In-First-Out. | lru |
 | `--lora-backend` | Choose the kernel backend for multi-LoRA serving. | triton |

 ## Kernel backend
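
Reviewer note: the added `--lora-eviction-policy` row names two policies, `lru` and `fifo`, without showing how they differ. The toy sketch below illustrates the difference on a fixed-size pool. It is not SGLang's implementation; `AdapterPool`, `access`, `resident`, and `run` are hypothetical names invented for this illustration, and the pool tracks only adapter names, not GPU memory.

```python
from collections import OrderedDict
from typing import List, Optional, Tuple


class AdapterPool:
    """Toy fixed-capacity pool of adapter names (hypothetical, not SGLang code)."""

    def __init__(self, capacity: int, policy: str = "lru") -> None:
        if policy not in ("lru", "fifo"):
            raise ValueError(f"unknown policy: {policy}")
        self.capacity = capacity
        self.policy = policy
        self._slots = OrderedDict()  # oldest entry first

    def access(self, name: str) -> Optional[str]:
        """Use an adapter; return the name evicted to make room, if any."""
        if name in self._slots:
            # Only LRU refreshes an entry's position on a hit.
            if self.policy == "lru":
                self._slots.move_to_end(name)
            return None
        evicted = None
        if len(self._slots) >= self.capacity:
            # Both policies evict from the front; they differ in how
            # hits reorder the entries.
            evicted, _ = self._slots.popitem(last=False)
        self._slots[name] = None
        return evicted

    def resident(self) -> List[str]:
        return list(self._slots)


def run(policy: str) -> Tuple[List[Optional[str]], List[str]]:
    """Replay the access pattern a, b, a, c on a pool with two slots."""
    pool = AdapterPool(capacity=2, policy=policy)
    evictions = [pool.access(name) for name in ["a", "b", "a", "c"]]
    return evictions, pool.resident()


if __name__ == "__main__":
    print("lru ", run("lru"))
    print("fifo", run("fifo"))
```

With this access pattern, `lru` keeps adapter `a` resident because it was reused just before `c` arrived, while `fifo` evicts `a` because it was loaded first. That is consistent with the "better cache efficiency" remark in the new table row.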

_sources/advanced_features/speculative_decoding.ipynb

Lines changed: 269 additions & 451 deletions
Large diffs are not rendered by default.

_sources/advanced_features/structured_outputs.ipynb

Lines changed: 130 additions & 155 deletions
Large diffs are not rendered by default.

_sources/advanced_features/structured_outputs_for_reasoning_models.ipynb

Lines changed: 171 additions & 167 deletions
Large diffs are not rendered by default.

_sources/advanced_features/tool_parser.ipynb

Lines changed: 158 additions & 188 deletions
Large diffs are not rendered by default.

_sources/advanced_features/vlm_query.ipynb

Lines changed: 223 additions & 231 deletions
Large diffs are not rendered by default.

_sources/basic_usage/native_api.ipynb

Lines changed: 214 additions & 215 deletions
Large diffs are not rendered by default.

_sources/basic_usage/offline_engine_api.ipynb

Lines changed: 476 additions & 516 deletions
Large diffs are not rendered by default.
