# References

Adam L. Berger, Stephen A. Della Pietra, Vincent J. Della Pietra, “A Maximum Entropy Approach to Natural Language Processing,” *Computational Linguistics* 22(1): 39-72 (1996). [ACM]

Aurélien Géron, *Hands-On Machine Learning with Scikit-Learn and TensorFlow* (Sebastopol, CA: O’Reilly Media, 2017). [O’Reilly]

Chinmaya Pancholi, “Gensim integration with scikit-learn and Keras,” *Google Summer of Code* (GSoC) proposal (2017). [Github]

Chinmaya Pancholi, “Chinmaya’s GSoC 2017 Summary: Integration with sklearn & Keras and implementing fastText,” *RaRe Incubator* (September 2, 2017). [RaRe]

Christopher Manning, Hinrich Schütze, *Foundations of Statistical Natural Language Processing* (Cambridge, MA: MIT Press, 1999). [MIT Press]

Christopher D. Manning, Prabhakar Raghavan, Hinrich Schütze, *Introduction to Information Retrieval* (Cambridge, UK: Cambridge University Press, 2008). [StanfordNLP]

Chunting Zhou, Chonglin Sun, Zhiyuan Liu, Francis Lau, “A C-LSTM Neural Network for Text Classification,” (arXiv:1511.08630). [arXiv]

Daniel E. Russ, Kwan-Yuet Ho, Calvin A. Johnson, Melissa C. Friesen, “Computer-Based Coding of Occupation Codes for Epidemiological Analyses,” *2014 IEEE 27th International Symposium on Computer-Based Medical Systems* (CBMS), pp. 347-350. (2014) [IEEE]

Daniel E. Russ, Kwan-Yuet Ho, Joanne S. Colt, Karla R. Armenti, Dalsu Baris, Wong-Ho Chow, Faith Davis, Alison Johnson, Mark P. Purdue, Margaret R. Karagas, Kendra Schwartz, Molly Schwenn, Debra T. Silverman, Patricia A. Stewart, Calvin A. Johnson, Melissa C. Friesen, “Computer-based coding of free-text job descriptions to efficiently and reliably incorporate occupational risk factors into large-scale epidemiological studies,” *Occup. Environ. Med.* 73: 417-424 (2016). [BMJ]

Daniel Russ, Kwan-yuet Ho, Melissa Friesen, “It Takes a Village To Solve A Problem in Data Science,” Data Science Maryland, presentation at Applied Physics Laboratory (APL), Johns Hopkins University, on June 19, 2017. (2017) [Slideshare]

David H. Wolpert, “Stacked Generalization,” *Neural Networks* 5: 241-259 (1992).

David M. Blei, “Probabilistic Topic Models,” *Communications of the ACM* 55(4): 77-84 (2012). [ACM]

François Chollet, “A ten-minute introduction to sequence-to-sequence learning in Keras,” *The Keras Blog*. [Keras]

François Chollet, “Building Autoencoders in Keras,” *The Keras Blog*. [Keras]

Hsiang-Fu Yu, Chia-Hua Ho, Yu-Chin Juan, Chih-Jen Lin, “LibShortText: A Library for Short-text Classification.” [NTU]

Ilya Sutskever, James Martens, Geoffrey Hinton, “Generating Text with Recurrent Neural Networks,” *ICML* (2011). [UToronto]

Ilya Sutskever, Oriol Vinyals, Quoc V. Le, “Sequence to Sequence Learning with Neural Networks,” arXiv:1409.3215 (2014). [arXiv]

Jayant Jain, “Implementing Poincaré Embeddings,” RaRe Technologies (2017). [RaRe]

Jeffrey Pennington, Richard Socher, Christopher D. Manning, “GloVe: Global Vectors for Word Representation,” *Empirical Methods in Natural Language Processing (EMNLP)*, pp. 1532-1543 (2014). [PDF]

Keisuke Sakaguchi, Kevin Duh, Matt Post, Benjamin Van Durme, “Robsut Wrod Reocginiton via semi-Character Recurrent Neural Network,” arXiv:1608.02214 (2016). [arXiv]

“Keras 2.0 Release Notes.” (2017) [Github]

Matt J. Kusner, Yu Sun, Nicholas I. Kolkin, Kilian Q. Weinberger, “From Word Embeddings to Document Distances,” *ICML* (2015).

Maximilian Nickel, Douwe Kiela, “Poincaré Embeddings for Learning Hierarchical Representations,” arXiv:1705.08039 (2017). [arXiv]

Michael Czerny, “Modern Methods for Sentiment Analysis,” *District Data Labs* (2015). [DistrictDataLabs]

M. Paz Sesmero, Agapito I. Ledezma, Araceli Sanchis, “Generating ensembles of heterogeneous classifiers using Stacked Generalization,” *WIREs Data Mining and Knowledge Discovery* 5: 21-34 (2015).

Nal Kalchbrenner, Edward Grefenstette, Phil Blunsom, “A Convolutional Neural Network for Modelling Sentences,” *Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics*, pp. 655-665 (2014). [arXiv]

Oriol Vinyals, Quoc Le, “A Neural Conversational Model,” arXiv:1506.05869 (2015). [arXiv]

Peter Norvig, “How to write a spell corrector.” (2016) [Norvig]

Piotr Bojanowski, Edouard Grave, Armand Joulin, Tomas Mikolov, “Enriching Word Vectors with Subword Information,” arXiv:1607.04606 (2016). [arXiv]

Radim Řehůřek, Petr Sojka, “Software Framework for Topic Modelling with Large Corpora,” *Proceedings of the LREC 2010 Workshop on New Challenges for NLP Frameworks* (2010). [ResearchGate]

Sebastian Ruder, “An overview of gradient descent optimization algorithms,” blog of Sebastian Ruder, arXiv:1609.04747 (2016). [Ruder or arXiv]

Tal Perry, “Convolutional Methods for Text,” *Medium* (2017). [Medium]

Thomas W. Jones, “textmineR: Functions for Text Mining and Topic Modeling,” CRAN Project. [CRAN or Github]

Tomas Mikolov, Kai Chen, Greg Corrado, Jeffrey Dean, “Efficient Estimation of Word Representations in Vector Space,” *ICLR* 2013 (2013). [arXiv]

Tom Young, Devamanyu Hazarika, Soujanya Poria, Erik Cambria, “Recent Trends in Deep Learning Based Natural Language Processing,” arXiv:1708.02709 (2017). [arXiv]

Xuan Hieu Phan, Cam-Tu Nguyen, Dieu-Thu Le, Minh Le Nguyen, Susumu Horiguchi, Quang-Thuy Ha, “A Hidden Topic-Based Framework toward Building Applications with Short Web Documents,” *IEEE Trans. Knowl. Data Eng.* 23(7): 961-976 (2011).

Xuan Hieu Phan, Le-Minh Nguyen, Susumu Horiguchi, “Learning to Classify Short and Sparse Text & Web with Hidden Topics from Large-scale Data Collections,” *WWW ’08: Proceedings of the 17th International Conference on World Wide Web* (2008). [ACL]

Yoon Kim, “Convolutional Neural Networks for Sentence Classification,” *EMNLP* 2014, 1746-1751 (arXiv:1408.5882). [arXiv]

Zachary C. Lipton, John Berkowitz, “A Critical Review of Recurrent Neural Networks for Sequence Learning,” arXiv:1506.00019 (2015). [arXiv]
