Gregory Yauney
I'm a sixth-year CS PhD candidate at Cornell, advised by David Mimno. He is great.
I am interested in the intersection of NLP and theoretical machine learning, as well as in digital humanities.
I also like photography!
CV  Google Scholar  dblp  GitHub  Instagram  Twitter
A Pretrainer’s Guide to Training Data:
Measuring the Effects of Data Age, Domain Coverage, Quality, & Toxicity
Shayne Longpre, Gregory Yauney, Emily Reif, Katherine Lee, Adam Roberts,
Barret Zoph, Denny Zhou, Jason Wei, Kevin Robinson, David Mimno, and Daphne Ippolito
NAACL 2024
Paper

Data Similarity is Not Enough to Explain Language Model Performance
Gregory Yauney, Emily Reif, and David Mimno
EMNLP 2023
Paper  Code  Poster  Reviews

The Afterlives of Shakespeare and Company in Online Social Readership
Maria Antoniak, David Mimno, Rosamond Thalken, Melanie Walsh, Matthew Wilkens, and Gregory Yauney
arXiv 2024
Paper

Probing Heterogeneous Pretraining Datasets with Small Curated Datasets
Gregory Yauney, Emily Reif, and David Mimno
Data-Centric Machine Learning Research Workshop at ICML 2023
Paper  Poster

Comparing Text Representations: A Theory-Driven Approach
Gregory Yauney and David Mimno
EMNLP 2021
Paper  Code  Poster  Blog

Domain-Specific Lexical Grounding in Noisy Visual-Textual Documents
Gregory Yauney, Jack Hessel, and David Mimno
EMNLP 2020
Paper  Code  Talk

Network Analysis Finds Shifts in the History of Modern Architecture
Gregory Yauney and David Mimno
Poster at Digital Humanities 2020
Abstract  Poster  Code  Data

Combatting the Challenges of Local Privacy for Distributional Semantics with Compression
Alexandra Schofield, Gregory Yauney, and David Mimno
Privacy in Machine Learning Workshop at NeurIPS 2019
Paper  Poster

Computational Prediction of Elapsed Narrative Time
Gregory Yauney, Ted Underwood, and David Mimno
Workshop on Narrative Understanding at NAACL 2019
Paper  Poster