Hi Team,
I have a few questions regarding the model setup and usage on our local machines:
- Is it possible to change the batch size to speed up processing? Generating a synthesized video currently takes a long time. If the batch size is configurable, could you explain where and how to set it?
- In the “Files and versions” tab on Hugging Face, there are a large number of files, particularly in the “text_encoder” and “tokenizer” folders, and it is hard to tell which ones actually need to be downloaded. Could you provide documentation or instructions that specify which files are essential?
- Could you add a concrete, end-to-end example to the “Run Inference” section of the GitHub documentation? That would make the expected usage much clearer.
Thank you for your assistance!