<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>GRU on KK's Blog (fromkk)</title><link>https://fromkk.com/tags/gru/</link><description>Recent content in GRU on KK's Blog (fromkk)</description><generator>Hugo</generator><language>en</language><managingEditor>bebound@gmail.com (KK)</managingEditor><webMaster>bebound@gmail.com (KK)</webMaster><lastBuildDate>Sun, 10 Aug 2025 18:44:05 +0800</lastBuildDate><atom:link href="https://fromkk.com/tags/gru/index.xml" rel="self" type="application/rss+xml"/><item><title>LSTM and GRU</title><link>https://fromkk.com/posts/lstm-and-gru/</link><pubDate>Sun, 22 Apr 2018 14:39:00 +0800</pubDate><author>bebound@gmail.com (KK)</author><guid>https://fromkk.com/posts/lstm-and-gru/</guid><description>&lt;h2 id="lstm"&gt;LSTM&lt;/h2&gt;
&lt;p&gt;The avoid the problem of vanishing gradient and exploding gradient in vanilla RNN, LSTM was published, which can remember information for longer periods of time.&lt;/p&gt;
&lt;p&gt;Here is the structure of LSTM:&lt;/p&gt;
&lt;figure class="image-size-s"&gt;&lt;img src="https://fromkk.com/images/LSTM_LSTM.png"&gt;
&lt;/figure&gt;

&lt;p&gt;The calculate procedure are:&lt;/p&gt;
&lt;p&gt;\[\begin{aligned}
f_t&amp;amp;=\sigma(W_f\cdot[h_{t-1},x_t]+b_f)\\
i_t&amp;amp;=\sigma(W_i\cdot[h_{t-1},x_t]+b_i)\\
o_t&amp;amp;=\sigma(W_o\cdot[h_{t-1},x_t]+b_o)\\
\tilde{C_t}&amp;amp;=tanh(W_C\cdot[h_{t-1},x_t]+b_C)\\
C_t&amp;amp;=f_t\ast C_{t-1}+i_t\ast \tilde{C_t}\\
h_t&amp;amp;=o_t \ast tanh(C_t)
\end{aligned}\]&lt;/p&gt;
&lt;p&gt;\(f_t\),\(i_t\),\(o_t\) are forget gate, input gate and output gate respectively. \(\tilde{C_t}\) is the new memory content. \(C_t\) is cell state. \(h_t\) is the output.&lt;/p&gt;</description></item></channel></rss>