References: [1] Hochreiter, S., Schmidhuber, J. (1997). Long short-term memory. Neural Comput. 9 (8) 1735–1780 (November). https://doi.org/10.1162/neco.1997.9.8.1735, http://dx.doi.org/10.1162/neco.1997.9.8.1735. [2] Mikolov, T., Chen, K., Corrado, G., Dean, J. (2013). Efficient estimation of word representations in vector space. [3] Pennington, J., Socher, R., Manning, C. D. (2014). Glove: Global vectors for word representation. In: Empirical Methods in Natural Language Processing (EMNLP). p. 1532–1543, http://www.aclweb.org/anthology/D14-1162 [4] Schuster, M., Paliwal, K. (1997). Bidirectional recurrent neural networks. Trans. Sig. Proc. 45 (11) 2673–2681 (November).https://doi.org/10.1109/78.650093, http://dx.doi.org/10.1109/78.650093 [5] Wieting, J., Bansal, M., Gimpel, K., Livescu, K. (2015). From paraphrase database to compositional paraphrase model and back. Transactions of the Association for Computational Linguistics 3, 345–358. |