Stronger Random Baselines for In-Context Learning
Gregory Yauney and David Mimno
COLM 2024
A Pretrainer’s Guide to Training Data: Measuring the Effects of Data Age, Domain Coverage, Quality, & Toxicity
Shayne Longpre, Gregory Yauney, Emily Reif, Katherine Lee, Adam Roberts, Barret Zoph, Denny Zhou, Jason Wei, Kevin Robinson, David Mimno, and Daphne Ippolito
NAACL 2024
Outstanding Paper Award
The Afterlives of Shakespeare and Company in Online Social Readership
Journal of Cultural Analytics 2024
Data Similarity is Not Enough to Explain Language Model Performance
Gregory Yauney, Emily Reif, and David Mimno
EMNLP 2023
Probing Heterogeneous Pretraining Datasets with Small Curated Datasets
Gregory Yauney, Emily Reif, and David Mimno
Data-Centric Machine Learning Research Workshop at ICML 2023
Comparing Text Representations: A Theory-Driven Approach
Gregory Yauney and David Mimno
EMNLP 2021
Domain-Specific Lexical Grounding in Noisy Visual-Textual Documents
EMNLP 2020
Network Analysis Finds Shifts in the History of Modern Architecture
Gregory Yauney and David Mimno
Poster at Digital Humanities 2020
Combatting the Challenges of Local Privacy for Distributional Semantics with Compression
Privacy in Machine Learning Workshop at NeurIPS 2019
Computational Prediction of Elapsed Narrative Time
Workshop on Narrative Understanding at NAACL 2019