
Moves LayerNorm to output of the Encoder's sub-layers #47

Open

patrickgadd wants to merge 1 commit into allegro:master from patrickgadd:master

Conversation

@patrickgadd

Hi there,

As explained in #46, I believe the implementation diverges slightly from what's described in the paper ("Context-Aware Learning to Rank with Self-Attention") when it comes to the Transformer architecture: the paper applies LayerNorm to the output of each encoder sub-layer, while the code applies it to the input.

Sadly I can't say whether this affects performance in practice, as I'm attempting to use this work for something entirely different. That said, learning does look a tad more stable with the fix.
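For context, the two formulations differ only in where LayerNorm sits relative to the residual connection. Here's a minimal PyTorch sketch of both variants; the class and the `pre_norm` flag are illustrative only, not the actual diff in this PR:

```python
import torch.nn as nn


class SublayerConnection(nn.Module):
    """Residual connection around a sub-layer, with LayerNorm on either side.

    pre_norm=True  -> x + Sublayer(LayerNorm(x))   (norm on the sub-layer input)
    pre_norm=False -> LayerNorm(x + Sublayer(x))   (norm on the output, as in
                      Vaswani et al. and the paper referenced above)
    """

    def __init__(self, size: int, dropout: float, pre_norm: bool = True):
        super().__init__()
        self.norm = nn.LayerNorm(size)
        self.dropout = nn.Dropout(dropout)
        self.pre_norm = pre_norm

    def forward(self, x, sublayer):
        if self.pre_norm:
            # Normalize the input before the sub-layer (pre-norm).
            return x + self.dropout(sublayer(self.norm(x)))
        # Normalize after adding the residual (post-norm).
        return self.norm(x + self.dropout(sublayer(x)))
```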

At any rate, thank you once again for this work and for publishing it!

@PrzemekPobrotyn
Contributor

Please see my response in #46.

