Endangered Data Week at Yale

February 26, 2018

The Department of Linguistics will be holding a panel discussion on the topic of reproducible research this Friday at the Sterling Memorial Library. The event is open to all members of the Yale community as part of Endangered Data Week, a series of workshops organized by the Yale School of Medicine’s Cushing/Whitney Medical Library on datasets that are inaccessible or in danger of being lost or suppressed. The panel discussion will also form part of our regular Friday Lunch Talk series.

Most linguistics research projects involve collecting data and analyzing them. Data in linguistics may exist in a variety of forms. For example, a researcher studying the grammar of a particular language may interview native speakers and compile lists of words and sentences, writing down which ones are considered grammatical in the language and which ones are not. On the other hand, a researcher studying how the mind processes language may ask a participant to perform certain tasks and record their reaction times or the movement of their eyes. Reproducibility is the issue of whether or not the datasets analyzed in a research paper are made publicly available after publication. Research is reproducible when other scholars may obtain the datasets, analyze them independently, and arrive at the same conclusion as the original author. At the event on Friday, Professor Claire Bowern will be moderating a discussion with Associate Professor María Piñango, Assistant Professor Jim Wood, and PhD candidate Rikker Dockum on what the best practices should be for creating reproducible research in linguistics.

The panel discussion will be held on Friday, March 2, from 12:00 PM to 1:30 PM at the Sterling Memorial Library’s Lecture Hall. Below is the full schedule for Endangered Data Week. We hope to see you there!

Hosted by the Cushing/Whitney Medical Library

Biomedical Data Repositories Workshop
Monday, 2/26, 4–5pm

So you want to put your research data into a repository. Maybe you anticipate citations and credit from other researchers; maybe you practice open science; maybe data sharing is required by your journal or funder. In this workshop, Research and Education Librarian Kate Nyhan, Access Services/Clinical Librarian Alyssa Grimshaw, and Collection Development & Scholarly Communication Librarian Lindsay Barnett will go over some key questions to consider as you choose the right repository for your project.

  • What are the advantages of domain-specific repositories and interdisciplinary repositories?
  • Can you maintain some control over access and reuse of your data?
  • What features facilitate the discovery, re-use, and citation of your data?

By the end of the workshop, you’ll be able to discuss the pros and cons of data repositories including OSF, figshare, and NCBI (including PubMed Central’s new data deposit options), and you’ll know how to use re3data.org to find disciplinary repositories.

Register for this event here.

What Happens to Community Health When Data is Compromised? A Discussion Panel on the 2020 Census and Other Survey Data
Tuesday, 2/27, 12–1pm, Medical Historical Library

Public health researchers and policy-makers rely on accurate, representative policy data to make informed decisions. This panel of researchers, experts, and activists will discuss how proposed changes in the 2020 Census could discourage participation, jeopardizing access to comprehensive population data. The panelists will explore the potential impacts to community health when essential data is lost or compromised.


  • Mark Abraham, Executive Director of DataHaven
  • Rachel Leventhal-Weiner, Data Engagement Specialist at Connecticut Data Collaborative
  • Kenya Flash, Pol. Sci., Global Affairs & Gov. Info. Librarian at the Center for Science and Social Science Information, Yale University
  • Miriam Olivares, GIS Librarian at the Center for Science and Social Science Information, Yale University
  • Jim Hadler, Senior Consultant, Infectious Disease and Medical Epidemiology, Connecticut and Yale Emerging Infections Program, New York City Department of Health and Mental Hygiene, Council of State and Territorial Epidemiologists

Moderated by Kyle Peyton, PhD Candidate in Political Science, ISPS Policy Fellow

This event is co-sponsored by The Institution for Social and Policy Studies (ISPS) at Yale University.

Data Discussion: Touring the Cushing Center and the Cushing Tumor Registry
Thursday, 3/1, 11am–12pm

“The brains are so cool!” All our visitors say that, but have you heard the story of how this collection came to be, and how researchers are still using these samples today? For Endangered Data Week, we’re offering this special tour exploring how the Cushing Tumor Registry has survived a century and still supports research.

The Cushing Tumor Registry was endangered when researchers moved institutions, when key staffers retired or died, when funding streams dried up, and when environmental conditions threatened preservation. Could this happen to your project? Join Cushing Center Coordinator Terry Dagradi and Research and Education Librarian Kate Nyhan to discuss the continuing life of this extraordinary (and at one time, endangered) collection.

Register for this tour here.

Working with Census Data
Thursday, 3/1, 4–5pm

The Census Bureau offers rich, longitudinal, geocoded data on health and its social determinants. This workshop will navigate Census.gov to find public-use data releases, technical documentation, and questionnaires for any Census Bureau survey. Join Research and Education Librarian Kate Nyhan and Access Services/Clinical Librarian Alyssa Grimshaw to discuss key concepts for working with census data, including census geographies and the sampling implications of ACS 1-, 3-, and 5-year estimates. You’ll try out American FactFinder to work with tables and maps, and compare it to licensed mapping tools like SimplyMap, PolicyMap, or Social Explorer. When you leave the workshop, you’ll be able to leverage this rich public-use data and make an informed decision about which mapping platform is right for you.

Register for this event here.

Hosted by the Institution for Social and Policy Studies

Why Reproducibility in (Social) Science Matters (and How to Get it Right)
Thursday, 3/1, 10:30am–12pm

ISPS Policy Lab, 77 Prospect St.

Talk by Brian Earp (Yale University). This talk will give an overview of the relevant history and philosophy of science with respect to reproducibility, drawing mostly on examples from psychology, and explain why reproducibility is so important.

Yale co-sponsors: ISPS, Yale Day of Data, Center for Science and Social Science Information, Graduate Writing Lab

Audience: Yale community

Making Research Transparent and Reproducible
Friday, 3/2, 10:30am–12pm

ISPS Policy Lab, 77 Prospect St.

Workshop with Florio Arguillas (Cornell University). This hands-on workshop is intended primarily for postdocs and graduate and undergraduate students in the social sciences. It will focus on practices that help researchers conduct research efficiently and transparently, including how to create replication documentation for research involving statistical data. Such documentation can help keep everything organized, enhance researchers’ ability to reconstruct their data processing and analysis, and be easily shared with others.

Yale co-sponsors: ISPS, Stat Lab, Center for Science and Social Science Information, Yale Center for Research Computing

Audience: Yale postdocs, graduate students, and undergraduate students in social sciences.

Hosted by the Department of Linguistics and Yale University Library

Linguistics Friday Lunch Talk
Friday, 3/2, 12–1:30pm

Sterling Memorial Library Lecture Hall

A panel of Linguistics faculty and graduate students will discuss a position paper on reproducible research in linguistics. The panel will consider the role of reproducibility in increasing verification and accountability; associated implications for how linguistic data are managed, cited, and maintained for long-term access; and mechanisms for evaluating “data work” in academic hiring, tenure, and promotion processes.

Reproducible research in linguistics: A position statement on data citation and attribution in our field


  • María Piñango, Associate Professor of Linguistics, Psychology, Interdepartmental Neuroscience Program
  • Jim Wood, Assistant Professor of Linguistics
  • Rikker Dockum, Graduate Student, Linguistics

Moderated by Claire Bowern, Professor of Linguistics

Sponsors: Department of Linguistics, Yale University Library
