Commit c0b18d9
[Kernel] AQ AZP 4/4: Integrate asymmetric quantization to linear method (vllm-project#7271)
Signed-off-by: Alvant <[email protected]>1 parent 55ea39f commit c0b18d9
File tree
7 files changed
+124
-21
lines changed- .buildkite/lm-eval-harness
- configs
- tests/quantization
- vllm/model_executor/layers/quantization
- compressed_tensors
- schemes
- utils
7 files changed
+124
-21
lines changedLines changed: 11 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
| 1 | + | |
| 2 | + | |
| 3 | + | |
| 4 | + | |
| 5 | + | |
| 6 | + | |
| 7 | + | |
| 8 | + | |
| 9 | + | |
| 10 | + | |
| 11 | + | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1 | 1 | | |
2 | 2 | | |
3 | 3 | | |
| 4 | + | |
4 | 5 | | |
5 | 6 | | |
6 | 7 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
49 | 49 | | |
50 | 50 | | |
51 | 51 | | |
| 52 | + | |
52 | 53 | | |
53 | 54 | | |
54 | 55 | | |
55 | 56 | | |
56 | 57 | | |
57 | 58 | | |
58 | | - | |
| 59 | + | |
| 60 | + | |
| 61 | + | |
| 62 | + | |
| 63 | + | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
2 | 2 | | |
3 | 3 | | |
4 | 4 | | |
| 5 | + | |
5 | 6 | | |
6 | 7 | | |
7 | 8 | | |
| |||
14 | 15 | | |
15 | 16 | | |
16 | 17 | | |
17 | | - | |
18 | | - | |
19 | | - | |
20 | | - | |
21 | | - | |
22 | | - | |
| 18 | + | |
| 19 | + | |
| 20 | + | |
| 21 | + | |
| 22 | + | |
| 23 | + | |
| 24 | + | |
| 25 | + | |
23 | 26 | | |
24 | | - | |
| 27 | + | |
25 | 28 | | |
26 | 29 | | |
27 | 30 | | |
| |||
31 | 34 | | |
32 | 35 | | |
33 | 36 | | |
| 37 | + | |
| 38 | + | |
| 39 | + | |
| 40 | + | |
| 41 | + | |
| 42 | + | |
| 43 | + | |
| 44 | + | |
| 45 | + | |
| 46 | + | |
| 47 | + | |
| 48 | + | |
34 | 49 | | |
35 | 50 | | |
36 | 51 | | |
| |||
69 | 84 | | |
70 | 85 | | |
71 | 86 | | |
| 87 | + | |
72 | 88 | | |
| 89 | + | |
| 90 | + | |
73 | 91 | | |
74 | | - | |
| 92 | + | |
75 | 93 | | |
76 | 94 | | |
77 | 95 | | |
| |||
160 | 178 | | |
161 | 179 | | |
162 | 180 | | |
163 | | - | |
| 181 | + | |
Lines changed: 10 additions & 6 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
138 | 138 | | |
139 | 139 | | |
140 | 140 | | |
141 | | - | |
142 | 141 | | |
143 | 142 | | |
144 | | - | |
| 143 | + | |
| 144 | + | |
| 145 | + | |
145 | 146 | | |
146 | 147 | | |
147 | 148 | | |
| |||
151 | 152 | | |
152 | 153 | | |
153 | 154 | | |
154 | | - | |
155 | 155 | | |
156 | 156 | | |
157 | | - | |
| 157 | + | |
| 158 | + | |
| 159 | + | |
158 | 160 | | |
159 | 161 | | |
160 | 162 | | |
| |||
265 | 267 | | |
266 | 268 | | |
267 | 269 | | |
268 | | - | |
| 270 | + | |
| 271 | + | |
269 | 272 | | |
270 | 273 | | |
271 | 274 | | |
272 | 275 | | |
273 | | - | |
| 276 | + | |
| 277 | + | |
274 | 278 | | |
275 | 279 | | |
276 | 280 | | |
| |||
vllm/model_executor/layers/quantization/compressed_tensors/schemes/compressed_tensors_w8a8_int8.py
Lines changed: 52 additions & 3 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
3 | 3 | | |
4 | 4 | | |
5 | 5 | | |
| 6 | + | |
6 | 7 | | |
7 | 8 | | |
8 | 9 | | |
| |||
14 | 15 | | |
15 | 16 | | |
16 | 17 | | |
| 18 | + | |
| 19 | + | |
17 | 20 | | |
18 | 21 | | |
19 | 22 | | |
20 | | - | |
| 23 | + | |
| 24 | + | |
21 | 25 | | |
22 | 26 | | |
| 27 | + | |
23 | 28 | | |
24 | 29 | | |
25 | 30 | | |
| |||
46 | 51 | | |
47 | 52 | | |
48 | 53 | | |
49 | | - | |
50 | | - | |
| 54 | + | |
| 55 | + | |
| 56 | + | |
| 57 | + | |
| 58 | + | |
| 59 | + | |
| 60 | + | |
| 61 | + | |
| 62 | + | |
| 63 | + | |
| 64 | + | |
| 65 | + | |
| 66 | + | |
| 67 | + | |
| 68 | + | |
| 69 | + | |
| 70 | + | |
| 71 | + | |
| 72 | + | |
| 73 | + | |
| 74 | + | |
| 75 | + | |
51 | 76 | | |
52 | 77 | | |
| 78 | + | |
| 79 | + | |
| 80 | + | |
| 81 | + | |
| 82 | + | |
| 83 | + | |
| 84 | + | |
| 85 | + | |
| 86 | + | |
| 87 | + | |
| 88 | + | |
| 89 | + | |
| 90 | + | |
53 | 91 | | |
54 | 92 | | |
55 | 93 | | |
| |||
90 | 128 | | |
91 | 129 | | |
92 | 130 | | |
| 131 | + | |
| 132 | + | |
| 133 | + | |
| 134 | + | |
| 135 | + | |
| 136 | + | |
| 137 | + | |
| 138 | + | |
| 139 | + | |
93 | 140 | | |
94 | 141 | | |
95 | 142 | | |
96 | 143 | | |
97 | 144 | | |
98 | 145 | | |
99 | 146 | | |
| 147 | + | |
| 148 | + | |
100 | 149 | | |
Lines changed: 17 additions & 2 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
191 | 191 | | |
192 | 192 | | |
193 | 193 | | |
| 194 | + | |
| 195 | + | |
194 | 196 | | |
195 | 197 | | |
196 | 198 | | |
197 | 199 | | |
198 | 200 | | |
199 | | - | |
200 | | - | |
| 201 | + | |
| 202 | + | |
| 203 | + | |
| 204 | + | |
| 205 | + | |
| 206 | + | |
| 207 | + | |
| 208 | + | |
| 209 | + | |
| 210 | + | |
| 211 | + | |
| 212 | + | |
| 213 | + | |
| 214 | + | |
| 215 | + | |
201 | 216 | | |
202 | 217 | | |
203 | 218 | | |
| |||
0 commit comments