There are two parts to this:
- Use dynamic shapes to handle the past sequence length.
- Similar to dynamic batching, we can make two different submodules for each GQA operator: one for the max sequence length and one for a sequence length of 1. Then use a select operator to choose between the submodules based on the input sizes (perhaps it's possible to reuse the select from dynamic batching).
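
As a rough illustration of the select idea, here is a minimal Python sketch. It is not tied to any real compiler API: `gqa_prefill`, `gqa_decode`, `select_gqa`, `run_gqa`, and `MAX_SEQ_LEN` are all hypothetical names, and the two functions only stand in for submodules that would be compiled ahead of time for the two sequence-length cases.

```python
# Hypothetical sketch: dispatch between two precompiled GQA submodules
# based on the input sequence length. All names here are assumptions.

MAX_SEQ_LEN = 8  # assumed maximum sequence length the prefill submodule supports


def gqa_prefill(tokens):
    # Stand-in for the submodule compiled for the max sequence length.
    return ("prefill", len(tokens))


def gqa_decode(tokens):
    # Stand-in for the submodule compiled for a sequence length of 1.
    return ("decode", len(tokens))


def select_gqa(seq_len):
    # Pick the submodule whose compiled shape matches the input size.
    if seq_len == 1:
        return gqa_decode
    if 1 < seq_len <= MAX_SEQ_LEN:
        return gqa_prefill
    raise ValueError(f"unsupported sequence length: {seq_len}")


def run_gqa(tokens):
    # The select step: choose a submodule from the input size, then run it.
    return select_gqa(len(tokens))(tokens)
```

In this sketch, a single-token input goes to the decode path and anything longer (up to the assumed maximum) goes to the prefill path. Whether the existing select from dynamic batching can be reused for this is still an open question.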