Improve position embedding #2223

Open · pass-lin wants to merge 5 commits into master
Conversation

pass-lin (Contributor)

Split from #2192.

Removed the changes related to fixing the RoFormer export.

pass-lin (Contributor, Author)

ruff.....................................................................Passed
ruff-format..............................................................Passed
Error: Process completed with exit code 1.

What is the cause of this error? I need help.
@madhusshivakumar @divyashreepathihalli

mattdangerw (Member)

@pass-lin try just rebasing/syncing to latest master. Some tooling changes just went out in #2222.

mattdangerw (Member) left a comment

Thanks! Just a question for now. What models use this?

@@ -61,6 +61,7 @@ def __init__(
     self,
     sequence_length,
     initializer="glorot_uniform",
+    hierarchical_alpha=0.4,
mattdangerw (Member)

We'd need a docstring to add this. But before we do, is this used by more than one model architecture? If this is just used in one, we probably would prefer to keep the layer simple and just leave the customization per model arch.

pass-lin (Contributor, Author)

> Thanks! Just a question for now. What models use this?

The main function of this feature is to non-destructively extend the maximum sequence length of models such as BERT and ViT.
For example, in BERT this feature is only activated when the sequence length is greater than 512.

pass-lin (Contributor, Author) commented Apr 26, 2025

@mattdangerw
At present, the following KerasHub models can probably benefit from this feature: BERT, RoBERTa, ALBERT, and GPT-2.
Also, in the future I plan to add ESM2.
The current maximum length of ESM2 is only 1024, which is far from enough in bioinformatics.

mattdangerw (Member)

@pass-lin how would you expect to use this feature with BERT currently? It seems tricky to actually use, given that BERT will be loaded from a config without this option.

pass-lin (Contributor, Author)

> @pass-lin how would you expect to use this feature with BERT currently? It seems tricky to actually use, given that BERT will be loaded from a config without this option.

I hope it can be used transparently, without the user noticing anything: it only activates when the input length exceeds the preset maximum length.

If you think this is unnecessary, you can close this PR.
