MoE kernel #206
Comments
Will do more research on this; if anyone has any insights into what could/should be implemented, or details on how, cc me.
Maybe a preliminary step would be to support, for example, mixtral/nllb_moe from Hugging Face, so the integration is ready when the layers are done?
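A hedged sketch of what such a preliminary integration could look like: monkey-patch the HF Mixtral MoE block so checkpoints load unchanged while the forward dispatches to a custom implementation once it exists. `fused_moe_forward` is a hypothetical placeholder; only the module/class names come from transformers' Mixtral implementation.

```python
from transformers.models.mixtral import modeling_mixtral


def fused_moe_forward(self, hidden_states):
    # Placeholder: would call the (yet-to-be-written) Triton MoE kernel and return
    # the same outputs as the original MixtralSparseMoeBlock.forward.
    raise NotImplementedError


def apply_moe_patch():
    # Swap the forward in place; model loading and checkpoints are unaffected.
    modeling_mixtral.MixtralSparseMoeBlock.forward = fused_moe_forward
```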
@S1ro1 one straightforward idea is to parallelize the expert forward passes (just like the MegaBlocks implementation does). Right now, in the HF model code, the MoE block is executed sequentially, expert by expert. Not sure whether it's worth implementing the load-balancing loss too; I haven't seen an actual profiling trace of MoE model training.
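A minimal sketch contrasting the two strategies, under simplifying assumptions (top-1 routing without routing weights, plain ReLU MLPs instead of Mixtral's gated SwiGLU experts); all names and shapes are illustrative, not taken from the HF source.

```python
import torch


def sequential_moe(hidden, experts, router_logits):
    # HF-style loop: iterate over experts in Python, each processing only its tokens.
    expert_idx = router_logits.argmax(dim=-1)            # (T,)
    out = torch.zeros_like(hidden)
    for i, expert in enumerate(experts):
        mask = expert_idx == i
        if mask.any():
            out[mask] = expert(hidden[mask])
    return out


def grouped_moe(hidden, w1, w2, router_logits):
    # MegaBlocks-style idea: sort tokens by expert, pad each group to a common size,
    # then replace the Python loop with a single batched GEMM pair over all experts.
    # w1: (E, d_model, d_ff), w2: (E, d_ff, d_model) are the stacked expert weights.
    E, T = w1.shape[0], hidden.shape[0]
    expert_idx = router_logits.argmax(dim=-1)             # (T,)
    counts = torch.bincount(expert_idx, minlength=E)      # tokens routed to each expert
    cap = int(counts.max())                               # pad every group to this size
    order = torch.argsort(expert_idx)                     # tokens grouped by expert
    sorted_idx = expert_idx[order]
    offsets = torch.cumsum(counts, 0) - counts            # start of each expert's group
    within = torch.arange(T, device=hidden.device) - offsets[sorted_idx]
    buf = hidden.new_zeros(E, cap, hidden.shape[-1])
    buf[sorted_idx, within] = hidden[order]               # scatter tokens into groups
    out_buf = torch.relu(buf @ w1) @ w2                   # one batched matmul pair
    out = torch.empty_like(hidden)
    out[order] = out_buf[sorted_idx, within]              # gather back to token order
    return out
```

A real kernel (as in MegaBlocks) would avoid the padding by using grouped or block-sparse GEMMs, but the token-sorting structure is the same.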
@yundai424 Haven't seen one either; I'm going to try patching either Mixtral or NLLB with our kernels and profile it, and will decide what to do after that. Edit: to address your comment, parallelizing the experts is certainly low-hanging fruit.
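A hedged profiling sketch with `torch.profiler`; the checkpoint name is a placeholder (any HF MoE checkpoint that fits on the available GPUs works), and nothing here is specific to this repo's kernels.

```python
import torch
from torch.profiler import ProfilerActivity, profile
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mixtral-8x7B-v0.1"   # placeholder MoE checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

inputs = tokenizer("profiling the MoE block", return_tensors="pt").to(model.device)

with profile(activities=[ProfilerActivity.CPU, ProfilerActivity.CUDA],
             record_shapes=True) as prof:
    out = model(**inputs, labels=inputs["input_ids"])
    out.loss.backward()

# Sort by GPU time to see how much of a step is spent in the expert MLPs vs. routing.
print(prof.key_averages().table(sort_by="cuda_time_total", row_limit=20))
```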
🚀 The feature, motivation and pitch
Currently the most popular library is probably https://github.com/databricks/megablocks. It would be interesting to implement this in Triton and make it HF compatible.
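For reference, a minimal Triton sketch, not the proposed MoE kernel itself: a fused SiLU(gate) * up elementwise kernel of the kind an expert MLP would call between its two GEMMs, assuming contiguous CUDA tensors. It only illustrates the Triton-plus-PyTorch integration style; the MegaBlocks-style piece (grouped/block-sparse GEMMs over tokens sorted by expert) is considerably more involved.

```python
import torch
import triton
import triton.language as tl


@triton.jit
def silu_mul_kernel(gate_ptr, up_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements
    g = tl.load(gate_ptr + offsets, mask=mask).to(tl.float32)
    u = tl.load(up_ptr + offsets, mask=mask).to(tl.float32)
    silu = g / (1.0 + tl.exp(-g))          # SiLU(g) = g * sigmoid(g)
    tl.store(out_ptr + offsets, silu * u, mask=mask)


def silu_mul(gate: torch.Tensor, up: torch.Tensor) -> torch.Tensor:
    assert gate.is_cuda and gate.is_contiguous() and gate.shape == up.shape
    out = torch.empty_like(gate)
    n = gate.numel()
    grid = lambda meta: (triton.cdiv(n, meta["BLOCK_SIZE"]),)
    silu_mul_kernel[grid](gate, up, out, n, BLOCK_SIZE=1024)
    return out
```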
Alternatives
No response
Additional context
No response