fix: Fixing issue with first gen token being returned twice in streaming #3427
dfaebf0 to 384d556
/bot run --disable-fail-fast
PR_Github #1641 [ run ] triggered by Bot
LGTM. Thank you!
PR_Github #1641 [ run ] completed with state
df06fa2 to d90b2b3
/bot run --add-multi-gpu-test
PR_Github #1773 [ run ] triggered by Bot
…aming Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
d90b2b3 to 45ac435
/bot run --only-multi-gpu-test --disable-fail-fast
PR_Github #1827 [ run ] triggered by Bot
PR_Github #1773 [ run ] completed with state
PR_Github #1827 [ run ] completed with state
/bot run --only-multi-gpu-test --disable-fail-fast
PR_Github #1839 [ run ] triggered by Bot
/bot run --only-multi-gpu-test --disable-fail-fast
PR_Github #1839 [ run ] completed with state
PR_Github #1865 [ run ] triggered by Bot
PR_Github #1865 [ run ] completed with state
/bot run --only-multi-gpu-test
/bot run --add-multi-gpu-test --disable-fail-fast
PR_Github #2050 [ run ] triggered by Bot
PR_Github #2050 [ run ] completed with state
/bot run --stage-list "L40S-TensorRT-3"
/bot run --only-multi-gpu-test
PR_Github #2059 [ run ] triggered by Bot
PR_Github #2060 [ run ] triggered by Bot
PR_Github #2059 [ run ] completed with state
/bot run --only-multi-gpu-test --disable-fail-fast
PR_Github #2061 [ run ] triggered by Bot
PR_Github #2060 [ run ] completed with state
PR_Github #2061 [ run ] completed with state
/bot run --stage-list "L40S-TensorRT-3"
PR_Github #2070 [ run ] triggered by Bot
PR_Github #2070 [ run ] completed with state
/bot reuse-pipeline
PR_Github #2084 [ reuse-pipeline ] triggered by Bot
PR_Github #2084 [ reuse-pipeline ] completed with state
/bot skip --comment "ran all tests previously"
PR_Github #2087 [ skip ] triggered by Bot
PR_Github #2087 [ skip ] completed with state
/bot skip --comment "Ran all tests previously"
PR_Github #2092 [ skip ] triggered by Bot
PR_Github #2092 [ skip ] completed with state
…ing (NVIDIA#3427)

* fix: Fixing issue with first gen token being returned twice with streaming

Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>

* Fixing not_expectring_strings in test

Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>

---------

Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
Co-authored-by: QI JUN <22017000+QiJune@users.noreply.github.com>
Better fix for first gen token being returned twice in streaming mode.
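
For readers unfamiliar with this class of bug, the sketch below illustrates the symptom only. It is not the code changed in this PR, and every name in it (`stream_new_tokens`, `snapshots`) is hypothetical. It assumes the engine reports cumulative output tokens at each step and that the first generated token shows up in two consecutive steps; tracking how many tokens were already sent keeps that token from being streamed twice.

```python
from typing import Iterable, Iterator, List


def stream_new_tokens(snapshots: Iterable[List[int]]) -> Iterator[List[int]]:
    """Yield only the tokens that have not been streamed yet.

    `snapshots` is a hypothetical sequence of cumulative output-token
    lists, one per engine step (an assumption, not the real API).
    """
    num_sent = 0  # number of tokens already streamed to the client
    for tokens in snapshots:
        new_tokens = tokens[num_sent:]  # slice off what was already sent
        num_sent = len(tokens)
        if new_tokens:  # skip steps that add nothing new
            yield new_tokens


# The first generated token (11) appears in two consecutive snapshots,
# mimicking a step that reports it a second time.
snapshots = [[11], [11], [11, 12], [11, 12, 13]]
chunks = list(stream_new_tokens(snapshots))
```

A streaming test for this behaviour could concatenate the received chunks and check that the result equals the non-streaming output, which is one way a duplicated first token would be caught.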