Commit c7ab62b
Optimize proto verification
This implements the following optimizations:
1) Try to avoid a conditional move in PushLimit: the branch is well predicted,
and this allows us to shorten the critical path. ~1-2% improvement.
2) Split tags and verify funcs into 2 separate tables. This saves space
(we avoid padding in the table), making it more cache efficient, and makes
the tag search potentially vectorizable. ~1-5% improvement.
3) Fully unroll DiscardVarint. This makes things easier for the branch predictor
by splitting the different branches, and allows the CPU to speculate past the data
dependency since we have a clear next p (current + constant). ~4% speed-up.
4) Add a fast path for 1-byte tag + 1-byte varint.
5) Replace the switch on the rotated value with a switch + nested ifs. This helps
the branch predictor, especially with FDO, and also cuts down the critical path,
since the MSB calculation and the switch can be performed in parallel.
6) Restructure the loop by adding an inner loop that doesn't call functions.
This improves register allocation for the fast loop and doesn't affect
slow cases like messages.
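For illustration only, here is a minimal sketch of what items 2 and 3 could look like. It is not the code from this commit: `Entry`, `SplitTable`, `FindTag`, `kNumFields`, and the bounds handling in `DiscardVarint` are all hypothetical, and the real verifier likely relies on different invariants (for example slop bytes at the end of the buffer).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical signature for a per-field verify function.
using VerifyFn = bool (*)(const uint8_t* p, const uint8_t* end);

// Interleaved layout: on typical 64-bit ABIs the 4-byte tag is padded so the
// pointer is 8-byte aligned, giving 16 bytes per entry.
struct Entry {
  uint32_t tag;
  VerifyFn fn;
};

// Split layout (assumed shape): tags are contiguous, so a tag search touches
// only densely packed 4-byte values and the pointers are read on a hit only.
constexpr size_t kNumFields = 4;
struct SplitTable {
  uint32_t tags[kNumFields];
  VerifyFn fns[kNumFields];
};
static_assert(sizeof(SplitTable) <= sizeof(Entry) * kNumFields,
              "split layout should not be larger than the interleaved one");

// Branch-free scan over the contiguous tags; a loop of this shape is a
// candidate for auto-vectorization. Returns the field index or -1.
inline int FindTag(const SplitTable& t, uint32_t tag) {
  int found = -1;
  for (size_t i = 0; i < kNumFields; ++i) {
    if (t.tags[i] == tag) found = static_cast<int>(i);
  }
  return found;
}

// Skips one varint. Returns a pointer just past it, or nullptr if the varint
// is truncated or longer than 10 bytes.
inline const uint8_t* DiscardVarint(const uint8_t* p, const uint8_t* end) {
  assert(p <= end);
  if (end - p >= 10) {
    // Fully unrolled: every exit is its own branch, and the result is always
    // p plus a constant.
    if (p[0] < 0x80) return p + 1;
    if (p[1] < 0x80) return p + 2;
    if (p[2] < 0x80) return p + 3;
    if (p[3] < 0x80) return p + 4;
    if (p[4] < 0x80) return p + 5;
    if (p[5] < 0x80) return p + 6;
    if (p[6] < 0x80) return p + 7;
    if (p[7] < 0x80) return p + 8;
    if (p[8] < 0x80) return p + 9;
    if (p[9] < 0x80) return p + 10;
    return nullptr;  // more than 10 continuation bytes: malformed
  }
  // Near the end of the buffer: bounds-checked loop (sketch-only fallback).
  for (ptrdiff_t i = 0; i < end - p; ++i) {
    if (p[i] < 0x80) return p + i + 1;
  }
  return nullptr;  // truncated
}
```

In this sketch the unrolled path gives each varint length a separate branch for the predictor to learn, and each return value is `p` plus a compile-time constant, which matches the motivation given in item 3.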
Results:
AMD (Milan) is ~20% faster:
BM_V1VerifyViewAll/10      3.604µ ± 1%   2.882µ ± 0%   -20.04% (p=0.000 n=20)
BM_V1VerifyViewAll/100     3.741µ ± 1%   2.994µ ± 1%   -19.97% (p=0.000 n=20)
BM_V1VerifyViewAll/1000    3.798µ ± 1%   3.062µ ± 1%   -19.37% (p=0.000 n=20)
BM_V1VerifyCordAll/10      3.688µ ± 0%   2.963µ ± 0%   -19.65% (p=0.000 n=20)
BM_V1VerifyCordAll/100     3.837µ ± 1%   3.048µ ± 1%   -20.57% (p=0.000 n=20)
BM_V1VerifyCordAll/1000    3.894µ ± 0%   3.152µ ± 0%   -19.06% (p=0.000 n=20)
geomean                    3.759µ        3.016µ        -19.78%
Intel (Skylake) is slightly faster, but I think we are running out of CPU width?
BM_V1VerifyViewAll/10      5.002µ ± 1%   4.840µ ± 1%   -3.24% (p=0.006 n=20)
BM_V1VerifyViewAll/100     5.068µ ± 2%   4.912µ ± 3%   -3.09% (p=0.012 n=20)
BM_V1VerifyViewAll/1000    5.129µ ± 1%   4.954µ ± 1%   -3.40% (p=0.000 n=20)
BM_V1VerifyCordAll/10      5.105µ ± 2%   4.937µ ± 1%   -3.29% (p=0.004 n=20)
BM_V1VerifyCordAll/100     5.131µ ± 1%   4.999µ ± 5%   -2.57% (p=0.035 n=20)
BM_V1VerifyCordAll/1000    5.411µ ± 4%   5.079µ ± 3%   -6.13% (p=0.000 n=20)
geomean                    5.139µ        4.953µ        -3.63%
PiperOrigin-RevId: 9399902081
Parent: e8c74e1
1 file changed
Lines changed: 12 additions & 3 deletions