Long Short-Term Memory (LSTM) is a kind of recurrent neural network. It has additional gating units, and its architecture is shown below:
This architecture mitigates the vanishing-gradient problem and preserves long-term dependencies, which makes it a very powerful recurrent network.
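To make the gate structure concrete, here is a minimal sketch of one LSTM forward step. The variable names and the weight layout (a single stacked matrix for the four gate pre-activations) are my own illustrative choices, not taken from the original post:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, b):
    """One forward step of a standard LSTM cell.

    W has shape (4*H, D + H): rows for the input, forget, and output
    gates and the cell candidate, applied to [x; h_prev].
    b has shape (4*H,). H is the hidden size, D the input size.
    """
    H = h_prev.shape[0]
    z = W @ np.concatenate([x, h_prev]) + b
    i = sigmoid(z[0:H])          # input gate
    f = sigmoid(z[H:2*H])        # forget gate
    o = sigmoid(z[2*H:3*H])      # output gate
    g = np.tanh(z[3*H:4*H])      # cell candidate
    c = f * c_prev + i * g       # new cell state
    h = o * np.tanh(c)           # new hidden state (the unit's output)
    return h, c
```

Because the output is `o * tanh(c)` with both factors bounded, every entry of `h` stays strictly inside (-1, 1).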
From this architecture we can write down the following formulas:
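For reference, the standard LSTM forward equations in a commonly used notation (which may differ from the symbols this post originally used) are:

```latex
\begin{aligned}
i_t &= \sigma(W_{xi} x_t + W_{hi} h_{t-1} + b_i) \\
f_t &= \sigma(W_{xf} x_t + W_{hf} h_{t-1} + b_f) \\
o_t &= \sigma(W_{xo} x_t + W_{ho} h_{t-1} + b_o) \\
\tilde{c}_t &= \tanh(W_{xc} x_t + W_{hc} h_{t-1} + b_c) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
```

Here $\sigma$ is the logistic sigmoid and $\odot$ denotes element-wise multiplication.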
The first symbol denotes the indices of all units, and the second denotes the output of a unit.
First of all, we define the truncated term:
In the later derivation, we use this shorthand notation for convenience.
Next, we derive the derivatives.
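Before working through the full backward pass, it is useful to sanity-check one piece of it numerically. The snippet below (my own check, not from the original post) verifies the analytic derivative of the output step $h = o \cdot \tanh(c)$ with respect to the cell state $c$ against a central finite difference:

```python
import numpy as np

# For h = o * tanh(c), the analytic derivative is
#   dh/dc = o * (1 - tanh(c)^2).

def h_of_c(c, o):
    return o * np.tanh(c)

def dh_dc_analytic(c, o):
    return o * (1.0 - np.tanh(c) ** 2)

def dh_dc_numeric(c, o, eps=1e-6):
    # Central difference: O(eps^2) truncation error.
    return (h_of_c(c + eps, o) - h_of_c(c - eps, o)) / (2.0 * eps)
```

The same pattern (compare each analytic gradient against a finite difference) is a good habit for every term in the derivation that follows.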