Commit d7cc5d7
[ET-VK] Fix embedding_q4gsw out-of-bounds access with dynamic shapes
The embedding_q4gsw shader used push constants for num_indices,
out_height, and embed_dim that were captured at graph build time and
never updated when input tensors were dynamically resized. This caused
out-of-bounds GPU memory reads when the actual input was smaller than
the initial allocation, resulting in VK_ERROR_DEVICE_LOST on Mali GPUs.
The fix derives all shape-dependent values (embed_dim, out_height,
num_indices) from the output tensor's sizes UBO, which is automatically
updated on resize. Only truly constant values (group_size,
is_linear_weight) remain as push constants.
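The difference between a value captured at graph build time and one read from a live sizes buffer can be sketched on the CPU. This is a minimal illustration, not the ET-VK implementation: `OutputTensor`, `CapturedParams`, `LiveParams`, and the two-element `[num_indices, embed_dim]` layout are hypothetical names and assumptions introduced here, and out_height is left out for brevity.

```python
class OutputTensor:
    """Hypothetical stand-in for a resizable output tensor.

    `sizes` plays the role of the sizes UBO; the assumed layout is
    [num_indices, embed_dim].
    """

    def __init__(self, sizes):
        self.sizes = list(sizes)

    def resize(self, new_sizes):
        # Update in place so every reader of `sizes` sees the new shape.
        self.sizes[:] = new_sizes


class CapturedParams:
    """Mimics the old behaviour: shape values copied once at build time."""

    def __init__(self, out):
        self.num_indices, self.embed_dim = out.sizes


class LiveParams:
    """Mimics the fix: shape values are read from the tensor on every access."""

    def __init__(self, out, group_size):
        self._out = out
        self.group_size = group_size  # genuinely constant, safe to capture

    @property
    def num_indices(self):
        return self._out.sizes[0]

    @property
    def embed_dim(self):
        return self._out.sizes[1]
```

After `out.resize([7, 64])` on a tensor built as `[256, 64]`, a `CapturedParams` instance still reports 256 indices while a `LiveParams` instance reports 7, which is the staleness the commit message describes.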
Root cause: With a 7-token input on a graph built for 256 tokens,
rounding the dispatch up to the local workgroup size created an extra
thread (y=7). That thread passed the stale bounds check (7 >= 256 is
false) and read past the end of the 7-element indices buffer.
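The arithmetic behind the root cause can be reproduced with a small CPU-side simulation. It is only a sketch under stated assumptions: the local workgroup size of 8 in y is a guess that happens to yield the extra thread y=7 (the commit does not state the real value), and `round_up`, `rows_passing_check`, and `out_of_bounds_rows` are hypothetical helpers, not functions from the codebase.

```python
def round_up(n, multiple):
    """Round n up to the next multiple, as a dispatch size would be."""
    return ((n + multiple - 1) // multiple) * multiple


def rows_passing_check(actual_rows, bound, local_size_y=8):
    """Row ids launched by the dispatch that survive `if (y >= bound) return;`.

    local_size_y=8 is an assumed workgroup size for illustration.
    """
    launched = round_up(actual_rows, local_size_y)
    return [y for y in range(launched) if not (y >= bound)]


def out_of_bounds_rows(actual_rows, bound, local_size_y=8):
    """Rows that pass the check but index past the real indices buffer."""
    return [
        y
        for y in rows_passing_check(actual_rows, bound, local_size_y)
        if y >= actual_rows
    ]
```

With the stale bound, `out_of_bounds_rows(7, 256)` returns `[7]`, the thread that reads past the buffer. With the bound taken from the resized tensor, `out_of_bounds_rows(7, 7)` returns `[]`.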
Differential Revision: [D98642319](https://our.internmc.facebook.com/intern/diff/D98642319/)
ghstack-source-id: 359350851
Pull Request resolved: #185581
Parent: def3699
2 files changed: 22 additions & 26 deletions
File 1 of 2: 11 additions & 4 deletions
File 2 of 2: 11 additions & 22 deletions