Commit bef9e5a

corey-lambda authored
Update page.md
1 parent 23a30bf commit bef9e5a

1 file changed

Lines changed: 1 addition & 1 deletion

page.md

@@ -53,7 +53,7 @@ Quick Jump:
 tl;dr:

 1. `model size * 0.5`
-2. `throughput * 1.2ish` (with a lot of caveats)
+2. `throughput * 1.2ish` (with a lot of caveats). See [our benchmarks](https://docs.google.com/spreadsheets/d/1W5KrY3fv0yPJCt8RU3EBap6_K13VY9-oURkoH6zX2sM/edit?usp=sharing)

 Models today are usually trained in `bf16`, which is a decimal number stored in 16 bits (2 bytes). At the billions of parameter scale, these add up VERY quickly. The main reason for quantizing a model from `bf16` to `fp8` is **memory reduction.**

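The `model size * 0.5` item in the tl;dr follows from bytes per parameter: 2 bytes in `bf16` versus 1 byte in `fp8`. Below is a minimal Python sketch of that arithmetic. It assumes weights dominate the footprint and ignores activations, KV cache, and any quantization scale factors. The helper name `weight_memory_gb` and the 70-billion-parameter count are illustrative and do not come from the page.

```python
# Bytes needed to store one parameter in each format.
BYTES_PER_PARAM = {"bf16": 2, "fp8": 1}


def weight_memory_gb(num_params: int, dtype: str) -> float:
    """Approximate weight memory in GB (10^9 bytes) for a given dtype.

    Hypothetical helper: counts weights only, not activations or KV cache.
    """
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    params = 70_000_000_000  # illustrative parameter count (assumption)
    bf16_gb = weight_memory_gb(params, "bf16")
    fp8_gb = weight_memory_gb(params, "fp8")
    # The ratio is 0.5 regardless of the parameter count chosen above.
    print(f"bf16: {bf16_gb:.0f} GB, fp8: {fp8_gb:.0f} GB, ratio: {fp8_gb / bf16_gb}")
```

In practice the saving is likely a bit under half, since real `fp8` checkpoints usually carry extra scale tensors and may leave some layers in higher precision.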