
MoE kernel #206

Open
ByronHsu opened this issue Sep 4, 2024 · 4 comments

ByronHsu (Collaborator) commented Sep 4, 2024

🚀 The feature, motivation and pitch

Currently the most popular library is probably https://github.com/databricks/megablocks. It would be interesting if we could implement it in Triton and make it HF compatible.

Alternatives

No response

Additional context

No response

S1ro1 (Contributor) commented Sep 5, 2024

Will do more research on this. If anyone has any insights on what could/should be implemented, or details on how, cc me.

S1ro1 (Contributor) commented Sep 5, 2024

Maybe a preliminary step would be to support, for example, mixtral/nllb_moe from Hugging Face, so the integration is ready when the layers are done?

yundai424 (Collaborator) commented Sep 5, 2024

@S1ro1 one straightforward idea is to parallelize the expert forward pass (just like the megablocks implementation does). Right now in HF model code the MoE block is executed sequentially, expert by expert. Not sure if it's worth implementing the load-balancing loss too; I haven't seen an actual profiling trace of MoE model training.
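
A minimal PyTorch sketch of the contrast described above (not from this thread, and not the actual HF, megablocks, or Liger code): a per-expert loop like the one in HF model code versus a grouped variant that sorts tokens by expert so each expert processes one contiguous slice, which is roughly the layout a fused Triton grouped-GEMM kernel would consume. All names, shapes, and the single-Linear "experts" are illustrative assumptions.

```python
import torch
import torch.nn as nn

num_experts, top_k, hidden = 8, 2, 64
experts = nn.ModuleList(nn.Linear(hidden, hidden) for _ in range(num_experts))

def moe_sequential(x, router_logits):
    # x: (tokens, hidden); router_logits: (tokens, num_experts)
    weights, selected = torch.topk(router_logits.softmax(dim=-1), top_k, dim=-1)
    out = torch.zeros_like(x)
    for e in range(num_experts):                       # one pass per expert
        token_idx, k_idx = torch.where(selected == e)
        if token_idx.numel() == 0:
            continue
        out[token_idx] += weights[token_idx, k_idx, None] * experts[e](x[token_idx])
    return out

def moe_grouped(x, router_logits):
    # Same math, but token/expert pairs are sorted by expert so each expert
    # sees one contiguous chunk instead of a separate gather per expert.
    weights, selected = torch.topk(router_logits.softmax(dim=-1), top_k, dim=-1)
    flat_expert = selected.reshape(-1)                 # (tokens * top_k,)
    order = torch.argsort(flat_expert)                 # sort slots by expert id
    token_of = order // top_k                          # original token per slot
    gathered = x[token_of]                             # (tokens * top_k, hidden)
    counts = torch.bincount(flat_expert, minlength=num_experts).tolist()
    outs = []
    for e, chunk in enumerate(gathered.split(counts)):
        outs.append(experts[e](chunk) if chunk.numel() else chunk)
    mixed = torch.cat(outs) * weights.reshape(-1)[order, None]
    out = torch.zeros_like(x)
    out.index_add_(0, token_of, mixed)                 # scatter back per token
    return out

x = torch.randn(16, hidden)
logits = torch.randn(16, num_experts)
assert torch.allclose(moe_sequential(x, logits), moe_grouped(x, logits), atol=1e-5)
```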

S1ro1 (Contributor) commented Sep 5, 2024

@yundai424 Haven't seen one either. I'm going to try patching either Mixtral or NLLB with our kernels and profile it, and will decide what to do after that.
Implementing dMoE (dropless MoE) could also be interesting. I'll try to send the profiler benchmarks tomorrow and we can discuss in more depth. Also, I suppose Mixtral > NLLB.

Edit: to address your comment, parallelizing the experts is certainly low-hanging fruit.
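
A hedged sketch of how one might capture such a trace of the HF Mixtral MoE path with torch.profiler, assuming a deliberately tiny MixtralConfig so the model fits anywhere; the config values and file name are placeholders, not a real benchmarking setup.

```python
import torch
from torch.profiler import profile, ProfilerActivity
from transformers import MixtralConfig, MixtralForCausalLM

# Tiny, illustrative config: small enough to instantiate quickly while keeping
# the sparse MoE block (num_local_experts experts, top-2 routing) intact.
config = MixtralConfig(
    hidden_size=256, intermediate_size=512, num_hidden_layers=2,
    num_attention_heads=8, num_key_value_heads=8,
    num_local_experts=8, num_experts_per_tok=2, vocab_size=32000,
)
model = MixtralForCausalLM(config)
inputs = torch.randint(0, config.vocab_size, (1, 128))

activities = [ProfilerActivity.CPU]
if torch.cuda.is_available():
    model, inputs = model.cuda(), inputs.cuda()
    activities.append(ProfilerActivity.CUDA)

with profile(activities=activities, record_shapes=True) as prof:
    model(inputs, labels=inputs).loss.backward()

# The per-expert linear calls from the sequential MoE loop should show up as
# separate entries per layer in the table / chrome trace.
print(prof.key_averages().table(sort_by="self_cpu_time_total", row_limit=20))
prof.export_chrome_trace("mixtral_moe_trace.json")
```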
