Commit f4ff803
Adding Compute-Context-Length (CCL) (#576)
The Compute-Context-Length (CCL) technique improves the throughput of
large language models (LLMs) on Qualcomm devices when handling very large
context lengths. The current Ahead-Of-Time (AOT) compilation on Qualcomm
devices does not predict the number of tokens needed, so the system
performs attention calculations based on the large context length. This
leads to significant throughput drops during both the prefill and the
decode phases. To address this issue, we introduce CCL, an additional
ONNX variable that allows for dynamic context-length specialization. By
generating tokens with smaller, more manageable context lengths, we
reduce memory reads and attention calculations, thereby improving
throughput.
---------
Signed-off-by: Vahid Janfaza <[email protected]>
1 parent c788f17 commit f4ff803
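The idea in the commit message can be illustrated with a minimal, pure-Python sketch. It is not the code from this commit: `pick_ccl`, `attend`, `make_cache`, and the bucket list are hypothetical names chosen for illustration, and the sketch assumes that CCL works by choosing the smallest precompiled context length that still covers the tokens cached so far. Under that assumption, attention reads only the first `ccl` key/value slots instead of the full context, and the result is unchanged because the skipped slots would have been masked out anyway.

```python
import math


def pick_ccl(position: int, ccl_buckets: list[int]) -> int:
    """Return the smallest bucket that covers cache slots 0..position (hypothetical helper)."""
    for ccl in sorted(ccl_buckets):
        if position < ccl:
            return ccl
    raise ValueError(f"position {position} exceeds the largest bucket")


def attend(query, keys, values, position, ccl):
    """Single-head attention that reads only the first `ccl` cache slots."""
    scale = 1.0 / math.sqrt(len(query))
    scores = []
    for slot in range(ccl):
        if slot <= position:
            dot = sum(q * k for q, k in zip(query, keys[slot]))
            scores.append(dot * scale)
        else:
            # Slots past the current position hold no valid tokens yet.
            scores.append(float("-inf"))
    peak = max(scores)
    weights = [math.exp(s - peak) for s in scores]  # masked slots get weight 0.0
    total = sum(weights)
    dim = len(values[0])
    return [
        sum(w * values[slot][d] for slot, w in enumerate(weights)) / total
        for d in range(dim)
    ]


def make_cache(length, dim, phase):
    """Build a deterministic toy cache of `length` slots (illustration only)."""
    return [[math.sin(slot + d + phase) for d in range(dim)] for slot in range(length)]


FULL_CTX = 16                      # assumed full context length
keys = make_cache(FULL_CTX, 4, 0.0)
values = make_cache(FULL_CTX, 4, 1.0)
query = [0.5, -0.25, 1.0, 0.0]
position = 2                       # three tokens are cached so far

ccl = pick_ccl(position, [4, 8, 16])
small = attend(query, keys, values, position, ccl)       # reads 4 slots
full = attend(query, keys, values, position, FULL_CTX)   # reads 16 slots
```

Here `small` and `full` agree, but the first call touches a quarter of the cache. In this toy setting, that reduction in memory reads and attention work is the saving the commit message describes.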
File tree
56 files changed, +3189 -304 lines changed
- QEfficient
- cloud
- customop
- generation
- peft/lora
- transformers
- models
- codegen
- falcon
- gemma2
- gemma3
- gemma
- gpt2
- gpt_bigcode
- gpt_oss
- gptj
- granitemoe
- granite
- grok_1
- internvl
- llama4
- llama_swiftkv
- llama
- llava_next
- llava
- mistral3
- mistral
- mixtral_moe
- mllama
- molmo
- mpt
- olmo2
- phi3
- phi
- qwen2_5_vl
- qwen2
- qwen3_moe
- qwen3
- starcoder2
- whisper
- utils
- examples
- gemma3_example
- granite_example
- intern_example
- qwen3moe_example