Sutton, Richard S.; Barto, Andrew G. (1998). Reinforcement Learning: An Introduction. MIT Press. ISBN 978-0262193986.
Lin, Long-Ji; Mitchell, Tom M. (1993). "Reinforcement Learning with Hidden States". 2. pp. 271–280.
Onat, Ahmet; Kita, Hajime (1998). "Q-learning with Recurrent Neural Networks as a Controller for the Inverted Pendulum Problem". The 5th International Conference on Neural Information Processing (ICONIP). pp. 837–840.
Onat, Ahmet; Kita, Hajime (1998). "Recurrent Neural Networks for Reinforcement Learning: Architecture, Learning Algorithms and Internal Representation". International Joint Conference on Neural Networks (IJCNN). pp. 2010–2015. doi:10.1109/IJCNN.1998.687168.
Shibata, Katsunari (7 March 2017). "Functions that Emerge through End-to-End Reinforcement Learning". arXiv:1703.02239 [cs.AI].
Shibata, Katsunari (10 March 2017). "Communications that Emerge through Reinforcement Learning Using a (Recurrent) Neural Network". arXiv:1703.03543 [cs.AI].
All text is available under the terms of the GNU Free Documentation License. This article reproduces and redistributes the Wikipedia article エンドツーエンドの強化学習 (End-to-end reinforcement learning; see its revision history) and is provided under the GNU Free Documentation License.