Hi,
Thank you for your excellent work!
May I ask whether colpali_engine supports gradient accumulation? I would like to increase the effective batch size by accumulating gradients, but I'm not sure whether the loss would then be computed differently from how it is with a genuinely larger batch.
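
To make the concern concrete, here is a minimal sketch in plain Python. It assumes the training loss is a contrastive loss with in-batch negatives, which I have not verified against the colpali_engine code. The function name `in_batch_contrastive_loss` and the toy embeddings are made up for illustration and are not part of the library.

```python
import math

def in_batch_contrastive_loss(queries, docs):
    """Mean softmax cross-entropy where docs[i] is the positive for queries[i]
    and every other doc in the same batch acts as a negative.
    Hypothetical helper for illustration, not colpali_engine's loss."""
    total = 0.0
    for i, q in enumerate(queries):
        # Dot-product score of this query against every document in the batch.
        scores = [sum(a * b for a, b in zip(q, d)) for d in docs]
        log_denominator = math.log(sum(math.exp(s) for s in scores))
        total += log_denominator - scores[i]
    return total / len(queries)

# Toy 2-d embeddings, made up for illustration.
queries = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 1.0]]
docs = [[0.9, 0.1], [0.1, 0.9], [0.8, 0.8], [-0.9, 0.9]]

# One real batch of 4 pairs: each query is contrasted against 3 negatives.
full_batch_loss = in_batch_contrastive_loss(queries, docs)

# Gradient accumulation over 2 micro-batches of 2 pairs:
# each query is contrasted against only 1 negative.
micro_batch_size = 2
micro_losses = [
    in_batch_contrastive_loss(
        queries[i:i + micro_batch_size], docs[i:i + micro_batch_size]
    )
    for i in range(0, len(queries), micro_batch_size)
]
accumulated_loss = sum(micro_losses) / len(micro_losses)

print(f"full batch: {full_batch_loss:.4f}")
print(f"accumulated micro-batches: {accumulated_loss:.4f}")
```

If the loss does work this way, the two values differ because each micro-batch only sees its own negatives, so accumulating gradients would raise the number of samples per optimizer step without raising the number of negatives per query.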
Looking forward to your reply.