
AutoLigerKernelForCausalLM.from_pretrained discards hub_kwargs_names #250

Open
uris-opti opened this issue Sep 16, 2024 · 3 comments
@uris-opti

🐛 Describe the bug

AutoLigerKernelForCausalLM.from_pretrained retains only the keyword args that are present in the model configuration, and these do not include hub_kwargs_names:
[
"cache_dir",
"force_download",
"local_files_only",
"proxies",
"resume_download",
"revision",
"subfolder",
"use_auth_token",
"token",
]
It therefore cannot serve as a drop-in replacement for AutoModelForCausalLM.
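A minimal sketch of the behavior described above (the function and attribute names here are illustrative assumptions, not the actual Liger-Kernel source): when kwargs are filtered against the model config's attributes, hub kwargs such as `revision` and `token` are silently dropped.

```python
# Hypothetical sketch of config-based kwarg filtering; not the real
# Liger-Kernel implementation. Only kwargs whose names appear among the
# model config's attributes survive, so hub kwargs are silently discarded.

def filter_kwargs_by_config(config_attrs, kwargs):
    """Keep only kwargs whose names are attributes of the model config."""
    return {k: v for k, v in kwargs.items() if k in config_attrs}

# A typical model config exposes attributes like these, but none of the
# hub-related kwargs listed in hub_kwargs_names.
config_attrs = {"hidden_size", "num_attention_heads", "vocab_size"}
user_kwargs = {"revision": "v2.0", "token": "hf_xxx", "hidden_size": 4096}

filtered = filter_kwargs_by_config(config_attrs, user_kwargs)
print(filtered)  # {'hidden_size': 4096} -- "revision" and "token" are gone
```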

Reproduce

N/A

Versions

N/A

@tyler-romero
Contributor

+1, I've noticed that some models don't have attn_implementation in their config even though it's a valid keyword arg to AutoModelForCausalLM, so the user-specified attn_implementation gets discarded as well.

@shimizust shimizust self-assigned this Sep 16, 2024
@shimizust
Collaborator

Thanks for reporting! Currently we only keep kwargs that are present in the model config. I wasn't aware of this other set of valid args; let me look into it.

@uris-opti
Author

Thanks for looking into it!
I don't know what the motivation for filtering the kwargs is, but I would consider removing this logic completely.
After seeing Tyler's comment, I looked around the transformers code and noticed there are many kwargs that are not in the config, and they vary between from_pretrained implementations.
I also saw that each from_pretrained implementation handles extra kwargs, so just passing the kwargs on seems safe.
🙏
