Hello! I am a CS student and have been studying DL in my spare time with the D2L textbook :)
I'd like to tell you that I really appreciate your work and have really loved the book! Thank you very much.
While reading the RNN part, I got a bit confused about how we obtain the partial derivative of the hidden state with respect to w_h.
In Section 9.7.1, Analysis of Gradients in RNNs,
for formula 9.7.4, we have: $$\frac{\partial h_t}{\partial w_h}= \frac{\partial f(x_{t},h_{t-1},w_h)}{\partial w_h} +\frac{\partial f(x_{t},h_{t-1},w_h)}{\partial h_{t-1}} \frac{\partial h_{t-1}}{\partial w_h}.$$
However, since formula 9.7.1 defines $$h_t:=f(x_t,h_{t-1},w_h),$$
isn't it true that $$\frac{\partial h_t}{\partial w_h}-\frac{\partial f(x_{t},h_{t-1},w_h)}{\partial w_h}=0$$?
I was wondering whether $$\frac{\partial h_t}{\partial w_h}, \frac{\partial h_{t-1}}{\partial w_h}$$
should instead be ordinary derivatives of some function that represents h_t and is parameterized by w_h.
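
To make the question concrete, here is a small numerical sketch. It assumes a hypothetical scalar recurrence f(x, h, w) = tanh(w·h + x), which is my own choice for illustration and not taken from the book, and the helper names (`unroll`, `recurrence_derivative`, `last_step_partial`) are likewise made up. It compares three quantities: a finite-difference derivative of the fully unrolled h_t with respect to w_h, the value produced by the recurrence in formula 9.7.4, and the partial derivative of f with h_{t-1} held fixed.

```python
import math


def f(x, h_prev, w):
    # Hypothetical scalar recurrence, chosen only for illustration.
    return math.tanh(w * h_prev + x)


def unroll(xs, w, h0=0.0):
    # Compute h_t by applying f step by step; w enters at every step.
    h = h0
    for x in xs:
        h = f(x, h, w)
    return h


def recurrence_derivative(xs, w, h0=0.0):
    # Accumulate d h_t / d w_h with the recurrence from formula 9.7.4,
    # starting from d h_0 / d w_h = 0 (h_0 is a constant here).
    h, dh = h0, 0.0
    for x in xs:
        h_new = f(x, h, w)
        local = 1.0 - h_new ** 2   # tanh'(w * h + x)
        df_dw = local * h          # partial of f w.r.t. w, h_{t-1} held fixed
        df_dh = local * w          # partial of f w.r.t. h_{t-1}
        dh = df_dw + df_dh * dh
        h = h_new
    return dh


def last_step_partial(xs, w, h0=0.0):
    # Only the explicit dependence of the last step on w,
    # treating h_{t-1} as a constant.
    h_prev = unroll(xs[:-1], w, h0)
    h_new = f(xs[-1], h_prev, w)
    return (1.0 - h_new ** 2) * h_prev


xs = [0.5, -0.3, 0.8, 0.1]
w = 0.7
eps = 1e-6

# Central finite difference of the fully unrolled h_t with respect to w.
numeric = (unroll(xs, w + eps) - unroll(xs, w - eps)) / (2 * eps)

print("finite difference    :", numeric)
print("recurrence (9.7.4)   :", recurrence_derivative(xs, w))
print("last-step partial of f:", last_step_partial(xs, w))
```

In this sketch the recurrence value agrees with the finite difference, while the last-step partial of f does not. That suggests the left-hand side of 9.7.4 behaves like an ordinary derivative of the unrolled h_t as a function of w_h, whereas the first term on the right holds h_{t-1} fixed.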