Commit 69974ff
Optimize the Cuda Kernel performance of Paddle rms_norm (PaddlePaddle#77098)
* accuracy and Torch alignment
* support rms_norm behavior to be the same as torch
* fix rms_norm_xpu_kernel
* add valueError_test
* Revert "add valueError_test"
This reverts commit ccaaa1b.
* Reapply "add valueError_test"
This reverts commit 19513e8.
* optimize performance
* add vectorization
* fix
* fix dtype of normalized_shape
1 parent 1b700c1 commit 69974ff
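For orientation, the computation that `rms_norm` performs can be sketched in plain Python. This is a reference formulation of RMS normalization over the last dimension, y = x / sqrt(mean(x^2) + eps) * weight, and not the kernel code from this commit. The function name `rms_norm_reference`, the list-of-rows layout, and the default `eps` value are assumptions made for illustration only.

```python
import math


def rms_norm_reference(rows, weight=None, eps=1e-6):
    """Reference RMS normalization over the last dimension.

    rows   : list of equal-length lists of floats, shape [batch, hidden]
    weight : optional list of length hidden (per-feature scale)
    eps    : small constant added to the mean square (assumed default)
    """
    out = []
    for row in rows:
        # Mean of squares over the normalized (last) dimension.
        mean_sq = sum(v * v for v in row) / len(row)
        # Reciprocal root-mean-square, shared by every element of the row.
        inv_rms = 1.0 / math.sqrt(mean_sq + eps)
        if weight is None:
            out.append([v * inv_rms for v in row])
        else:
            out.append([v * inv_rms * w for v, w in zip(row, weight)])
    return out


if __name__ == "__main__":
    y = rms_norm_reference([[3.0, 4.0]], weight=[1.0, 1.0], eps=0.0)
    print(y)
```

With `eps=0.0` and unit weights, each output row has a mean square of 1, which is a quick sanity check when comparing a GPU or XPU kernel against a reference like this one.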
File tree
12 files changed: +1169 -1147 lines changed
- paddle
  - fluid/pir/dialect/operator/interface/infer_symbolic_shape
  - phi
    - infermeta
    - kernels
      - gpu
      - xpu
    - ops/yaml
- python/paddle/nn/functional
- test/legacy_test
Lines changed: 20 additions & 6 deletions
Diff hunk (content not preserved): original lines 3428-3447, new lines 3428-3461; 20 additions, 6 deletions.
Diff hunk (content not preserved): original lines 1671-1698, new lines 1671-1685; 9 additions, 22 deletions.
Diff hunk (content not preserved): original lines 629-641, new lines 629-643; 9 additions, 7 deletions.
Diff hunk (content not preserved): original lines 3843-3878, new lines 3843-3910; 53 additions, 21 deletions.
Diff hunk (content not preserved): original lines 3881-3893, new lines 3913-3928; 9 additions, 6 deletions.
Diff hunk (content not preserved): original lines 749-755, new lines 749-756; 2 additions, 1 deletion.
Diff hunk (content not preserved): original lines 11-140, new lines 11-28; 2 additions, 114 deletions.
Diff hunk (content not preserved): original lines 143-146, new lines 31-35; 1 addition, 0 deletions.