Thank you for sharing this work! I’ve tried replicating the training process using the documented settings (hyperparameters, dataset, etc.), but the model consistently fails to achieve satisfactory performance even after multiple attempts. Could you confirm if there are any undocumented steps (e.g., data augmentation, initialization, or code adjustments) critical for successful training? Any guidance would be greatly appreciated!