David Smith
Associate Professor

Research interests
- Efficient inference for machine learning models with complex latent structure
- Modeling natural language structures, such as morphology, syntax, and semantics
- Modeling the mutations in texts as they propagate through social networks and in language across space and time
- Interactive information retrieval and machine learning for expert users
Education
- PhD in Computer Science, Johns Hopkins University
- BA in Classics, Harvard University
Biography
David A. Smith is an associate professor in the Khoury College of Computer Sciences at Northeastern University, based in Boston. He is a founding member of the NULab for Texts, Maps, and Networks, Northeastern University’s center for the digital humanities and computational social sciences.
Prior to joining Northeastern, Smith was a professor at the University of Massachusetts and a contributor to Tufts University's Perseus Digital Library, one of the most widely used linguistic and cultural research systems in the humanities. His research has been funded by the NSF, NEH, DARPA, ONR, AFRL, the Mellon Foundation, and Google, and he has published widely in natural language processing and computational linguistics, information retrieval, digital libraries, digital humanities, and political science.
Recent publications
- Detecting Manuscript Annotations in Historical Print: Negative Evidence and Evaluation Metrics
  Citation: Jacob Murel, David A. Smith. (2024). Detecting Manuscript Annotations in Historical Print: Negative Evidence and Evaluation Metrics. ICPRAM, 745-752. https://doi.org/10.5220/0012365600003654
- Self-training and Active Learning with Pseudo-relevance Feedback for Handwriting Detection in Historical Print
  Citation: Jacob Murel, David A. Smith. (2024). Self-training and Active Learning with Pseudo-relevance Feedback for Handwriting Detection in Historical Print. ICDAR (3), 305-324. https://doi.org/10.1007/978-3-031-70543-4_18
- MONSTERMASH: Multidirectional, Overlapping, Nested, Spiral Text Extraction for Recognition Models of Arabic-Script Handwriting
  Citation: Danlu Chen, Jacob Murel, Taimoor Shahid, Xiang Zhang, Jonathan Parkes Allen, Taylor Berg-Kirkpatrick, David A. Smith. (2024). MONSTERMASH: Multidirectional, Overlapping, Nested, Spiral Text Extraction for Recognition Models of Arabic-Script Handwriting. ICDAR (Workshops 2), 87-101. https://doi.org/10.1007/978-3-031-70642-4_6