Consider these results from different papers, all supposedly evaluating the same models on the same benchmark:
What's up with that? This repository contains independent, reproducible benchmarking results for various linear attention mechanisms, which should help clarify the discrepancies seen in the literature.
Currently, the repository includes (each on its own branch):
- Mechanistic Architecture Design (MAD)
- Pre-training loss comparison (see the sketch below)
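
As an illustration only, and not the repository's actual tooling, here is a minimal Python sketch of how final pre-training losses might be tabulated across runs, assuming each run exports a CSV log of `step`/`loss` pairs. The file paths, run names, and column names are all hypothetical.

```python
import csv

def final_loss(path: str) -> float:
    """Return the loss recorded at the last logged training step."""
    with open(path, newline="") as f:
        rows = list(csv.DictReader(f))
    return float(rows[-1]["loss"])

# Hypothetical run names and log paths, for illustration only.
runs = {
    "softmax-attention": "logs/softmax.csv",
    "linear-attention": "logs/linear.csv",
}

for name, path in runs.items():
    print(f"{name}: final pre-training loss = {final_loss(path):.4f}")
```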