
Commit 908b17c

refactor(README): remove tip styling within folded area
1 parent 00269af · commit 908b17c


README.md

Lines changed: 1 addition & 5 deletions
@@ -83,8 +83,6 @@ repoqa.search_needle_function --model "Qwen/CodeQwen1.5-7B-Chat" --backend vllm
 <details><summary>🔎 Context extension for small-ctx models <i>:: click to expand ::</i></summary>
 <div>
 
-> [!Tip]
->
 > There are two ways to unlock a model's context at inference time:
 >
 > 1. **Direct Extension**: Edit `max_positional_embedding` of the model's `config.json` (e.g., `hub/models--meta-llama--Meta-Llama-3-8B-Instruct/snapshots/[hash]/config.json`) to something like `22528`.
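
For reference, a minimal sketch of the "Direct Extension" step shown above (not part of this commit): it bumps the context length in the cached `config.json` with `jq`. The snapshot hash is a placeholder, and note that Hugging Face configs usually spell the field `max_position_embeddings`.

```shell
# Hypothetical sketch -- not part of this commit. <hash> is a placeholder for
# the actual snapshot directory; the config key is typically
# "max_position_embeddings" in Hugging Face configs.
CFG="hub/models--meta-llama--Meta-Llama-3-8B-Instruct/snapshots/<hash>/config.json"
jq '.max_position_embeddings = 22528' "$CFG" > "$CFG.tmp" && mv "$CFG.tmp" "$CFG"
```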
@@ -120,12 +118,10 @@ repoqa.search_needle_function --model "Qwen/CodeQwen1.5-7B-Chat" --backend hf --
 <details><summary>🔨 Having trouble installing `flash-attn`? <i>:: click to expand ::</i></summary>
 <div>
 
-> [!Tip]
->
 > If you have trouble with `pip install flash-attn --no-build-isolation`,
 > you can try to directly use [pre-built wheels](https://github.com/Dao-AILab/flash-attention/releases):
 >
-> ```
+> ```shell
 > export FLASH_ATTN_VER=2.5.8 # check latest version at https://github.com/Dao-AILab/flash-attention/releases
 > export CUDA_VER="cu122" # check available ones at https://github.com/Dao-AILab/flash-attention/releases
 > export TORCH_VER=$(python -c "import torch; print('.'.join(torch.__version__.split('.')[:2]))")
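
The hunk ends at the version checks. For illustration only, those variables would then be composed into a release asset name and installed roughly as below; the filename pattern (`cp310`, `cxx11abiFALSE`) is an assumption, so match it against the actual assets listed on the releases page.

```shell
# Illustration only, not part of the commit: install a matching pre-built wheel.
# The cp310 / cxx11abiFALSE parts are assumptions -- pick the asset on the
# releases page that matches your Python version and C++ ABI.
WHEEL="flash_attn-${FLASH_ATTN_VER}+${CUDA_VER}torch${TORCH_VER}cxx11abiFALSE-cp310-cp310-linux_x86_64.whl"
pip install "https://github.com/Dao-AILab/flash-attention/releases/download/v${FLASH_ATTN_VER}/${WHEEL}"
```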
