feat: introduce mlp implementation for gated FFN (swiglu, ...) #943

Open
ssmmnn11 wants to merge 11 commits into main from feat/swiglu_mlp

@ssmmnn11 ssmmnn11 commented Mar 2, 2026

Introduce the mlp_implementation option (mlp, glu, swiglu, geglu, reglu) for the Transformer, GraphTransformer, and GNN.
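For readers unfamiliar with the gated variants named above, here is a minimal, dependency-free sketch of a gated feed-forward layer. It is not the code from this PR: the function name `gated_ffn`, the weight names `w_gate`, `w_up`, `w_down`, and the list-based matrices are all assumptions made for illustration, and the plain `mlp` choice (no gate) is left out. The variants differ only in the activation applied to the gate branch.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def silu(v):
    # SiLU / Swish: v * sigmoid(v)
    return v * sigmoid(v)


def gelu(v):
    # Exact GELU using the Gaussian error function
    return 0.5 * v * (1.0 + math.erf(v / math.sqrt(2.0)))


def relu(v):
    return max(0.0, v)


# Gate activation per variant (names follow the option values in the PR text)
GATE_ACTIVATIONS = {"glu": sigmoid, "swiglu": silu, "geglu": gelu, "reglu": relu}


def matvec(weight, x):
    # weight is a list of rows; returns weight @ x
    return [sum(w * xi for w, xi in zip(row, x)) for row in weight]


def gated_ffn(x, w_gate, w_up, w_down, kind="swiglu"):
    # Hypothetical helper: out = w_down @ (act(w_gate @ x) * (w_up @ x))
    act = GATE_ACTIVATIONS[kind]
    gate = [act(g) for g in matvec(w_gate, x)]
    up = matvec(w_up, x)
    hidden = [g * u for g, u in zip(gate, up)]  # elementwise gating
    return matvec(w_down, hidden)
```

A gated layer carries two input projections instead of one, which is one plausible reason the module layout (and so the parameter names) could change when a gated option is introduced.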

Breaking change: FFN state_dict key names changed (mlp.* -> mlp.mlp.*).

PointWiseMLPProcessor is unchanged.
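Existing checkpoints would need their FFN keys renamed to load after this change. The sketch below shows one way to do that on a plain dictionary. It is an illustration under assumptions, not a migration shipped with this PR: `remap_ffn_keys` and `skip_prefixes` are hypothetical names, the example key names are invented, and it assumes that every key segment named `mlp` outside the skipped prefixes belongs to an affected FFN.

```python
def remap_ffn_keys(state_dict, skip_prefixes=()):
    """Return a copy of state_dict with 'mlp.*' keys renamed to 'mlp.mlp.*'.

    Keys starting with any of skip_prefixes are left alone (for example,
    modules such as PointWiseMLPProcessor whose layout did not change).
    """
    remapped = {}
    for key, value in state_dict.items():
        parts = key.split(".")
        if not key.startswith(tuple(skip_prefixes)) and "mlp" in parts:
            i = parts.index("mlp")
            already_nested = i + 1 < len(parts) and parts[i + 1] == "mlp"
            if not already_nested:
                # Insert the extra nesting level: mlp.X -> mlp.mlp.X
                parts.insert(i + 1, "mlp")
        remapped[".".join(parts)] = value
    return remapped


# Usage with made-up key names; real checkpoints will use different prefixes.
old = {"proc.0.mlp.0.weight": "w", "proc.0.attn.weight": "a", "pw.mlp.0.bias": "b"}
new = remap_ffn_keys(old, skip_prefixes=("pw.",))
```

Because already nested keys are skipped, running the function twice gives the same result as running it once, so it is safe to apply to a checkpoint whose format is unknown.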

@ssmmnn11 ssmmnn11 added the models label Mar 2, 2026
@ssmmnn11 ssmmnn11 added the ATS Approval Needed label Mar 2, 2026
@github-project-automation github-project-automation bot moved this to To be triaged in Anemoi-dev Mar 2, 2026
JPXKQX commented Mar 12, 2026

This PR looks good to me. I'm happy to approve it once it has been rebased and the integration tests are passing!

@JPXKQX JPXKQX self-requested a review March 12, 2026 11:05
mchantry commented

@ssmmnn11 it's a great contribution. It would be valuable to add some scientific results to support this great work.

# Conflicts:
#	models/src/anemoi/models/layers/processor.py
#	models/tests/layers/block/test_block_transformer.py
#	training/src/anemoi/training/train/train.py
# Conflicts:
#	models/src/anemoi/models/layers/mapper.py
#	models/src/anemoi/models/layers/processor.py
#	models/src/anemoi/models/schemas/common_components.py
#	models/tests/layers/block/test_block_graphtransformer.py