Reproducible research in linguistics: A position statement on data citation and attribution in our field

Authors: Andrea L. Berez-Kroeker, Lauren Gawne, Susan Smythe Kung, Barbara F. Kelly, Tyler Heston, Gary Holton, Peter Pulsifer, David I. Beaver, Shobhana Chelliah, Stanley Dubinsky, Richard P. Meier, Nick Thieberger, Keren Rice and Anthony C. Woodbury

Abstract: This paper is a position statement on reproducible research in linguistics, including data citation and attribution, that represents the collective views of some 41 colleagues. Reproducibility can play a key role in increasing verification and accountability in linguistic research, and is a hallmark of social science research that is currently under-represented in our field. We believe that we need to take time as a discipline to clearly articulate our expectations for how linguistic data are managed, cited, and maintained for long-term access.

Citation: Berez-Kroeker, A., Gawne, L., Kung, S., et al. (2017). Reproducible research in linguistics: A position statement on data citation and attribution in our field. Linguistics, 56(1), pp. 1-18. Retrieved 16 Apr. 2018, from doi:10.1515/ling-2017-0032


Motivation and Strategies for Implementing Digital Object Identifiers (DOIs) at NCAR’s Earth Observing Laboratory – Past Progress and Future Collaborations

Authors: Janine Aquino, John Allison, Robert Rilling, Don Stott, Kathryn Young, Michael Daniels

Abstract: In an effort to lead our community in following modern data citation practices by formally citing data used in published research and implementing standards to facilitate reproducible research results and data, while also producing meaningful metrics that help assess the impact of our services, the National Center for Atmospheric Research (NCAR) Earth Observing Laboratory (EOL) has implemented the use of Digital Object Identifiers (DOIs) (DataCite 2017) for both physical objects (e.g., research platforms and instruments) and datasets. We discuss why this work is important and timely, and review the development of guidelines for the use of DOIs at EOL by focusing on how decisions were made. We discuss progress in assigning DOIs to physical objects and datasets, summarize plans to cite software, describe a current collaboration to develop community tools to display citations on websites, and touch on future plans to cite workflows that document dataset processing and quality control. Finally, we will review the status of efforts to engage our scientific community in the process of using DOIs in their research publications.

Citation: Aquino, J., et al. (2017). Motivation and Strategies for Implementing Digital Object Identifiers (DOIs) at NCAR’s Earth Observing Laboratory – Past Progress and Future Collaborations. Data Science Journal, 16, p. 7. DOI:


Source: Data Science Journal

Stop this waste of people, animals and money

Authors: David Moher et al.

Abstract: Predatory journals are easy to please. They seem to accept papers with little regard for quality, at a fraction of the cost charged by mainstream open-access journals. These supposedly scholarly publishing entities are murky operations, making money by collecting fees while failing to deliver on their claims of being open access and failing to provide services such as peer review and archiving.

Despite abundant evidence that the bar is low, not much is known about who publishes in this shady realm, and what the papers are like. Common wisdom assumes that the hazard of predatory publishing is restricted mainly to the developing world. In one famous sting, a journalist for Science sent a purposely flawed paper to 140 presumed predatory titles (and to a roughly equal number of other open-access titles), pretending to be a biologist based in African capital cities. At least two earlier, smaller surveys found that most authors were in India or elsewhere in Asia. A campaign to warn scholars about predatory journals has concentrated its efforts in Africa, China, India, the Middle East and Russia. Frequent, aggressive solicitations from predatory publishers are generally considered merely a nuisance for scientists from rich countries, not a threat to scholarly integrity.

Our evidence disputes this view. We spent 12 months rigorously characterizing nearly 2,000 biomedical articles from more than 200 journals thought likely to be predatory. More than half of the corresponding authors hailed from high- and upper-middle-income countries as defined by the World Bank.

Citation: Moher, David, et al. “Stop This Waste of People, Animals and Money.” Nature 549, 23–25 (2017).


Reproducibility Librarianship

Author: Vicky Steeves

Abstract: Over the past few years, research reproducibility has been increasingly highlighted as a multifaceted challenge across many disciplines. There are socio-cultural obstacles as well as a constantly changing technical landscape that make replicating and reproducing research extremely difficult. Researchers face challenges in reproducing research across different operating systems and different versions of software, to name just a few of the many technical barriers. The prioritization of citation counts and journal prestige has undermined incentives to make research reproducible.

While libraries have been building support around research data management and digital scholarship, reproducibility is an emerging area that has yet to be systematically addressed. To respond to this, New York University (NYU) created the position of Librarian for Research Data Management and Reproducibility (RDM & R), a dual appointment between the Center for Data Science (CDS) and the Division of Libraries. This report will outline the role of the RDM & R librarian, paying close attention to the collaboration between the CDS and Libraries to bring reproducible research practices into the norm.

Citation: Steeves, Vicky. “Reproducibility Librarianship.” Collaborative Librarianship 9, no. 2 (2017): 80-89.


A manifesto for reproducible science

Authors: Marcus R. Munafò, Brian A. Nosek, Dorothy V. M. Bishop, Katherine S. Button, Christopher D. Chambers, Nathalie Percie du Sert, Uri Simonsohn, Eric-Jan Wagenmakers, Jennifer J. Ware & John P. A. Ioannidis

Abstract: Improving the reliability and efficiency of scientific research will increase the credibility of the published scientific literature and accelerate discovery. Here we argue for the adoption of measures to optimize key elements of the scientific process: methods, reporting and dissemination, reproducibility, evaluation and incentives. There is some evidence from both simulations and empirical studies supporting the likely effectiveness of these measures, but their broad adoption by researchers, institutions, funders and journals will require iterative evaluation and improvement. We discuss the goals of these measures, and how they can be implemented, in the hope that this will facilitate action toward improving the transparency, reproducibility and efficiency of scientific research.

Citation: Munafò, M. R., Nosek, B. A., Bishop, D. V. M., Button, K. S., Chambers, C. D., Percie du Sert, N., Simonsohn, U., Wagenmakers, E.-J., Ware, J. J., & Ioannidis, J. P. A. (2017). A manifesto for reproducible science. Nature Human Behaviour 1.


Plans and Performances: Parallels in the Production of Science and Music

Authors: David De Roure, Graham Klyne, Kevin R. Page, John Pybus, David M. Weigl, Matthew Wilcoxson, Pip Willcox

Abstract: Whether in the science lab or the music studio, we go in with a plan, we perform, and we make a record of that performance for distribution, consumption, and reuse. Both domains are increasingly data-intensive, with the adoption of new technology, and also socially intensive with democratised and growing citizen engagement. The music industry has embraced digital technology throughout the lifecycle from composition to consumption; scientific practice, and scholarly communication, are also undergoing transformation. Is the music industry more digital than science? We suggest that comparing and contrasting these two systems will provide insights of mutual benefit. Our investigation explores the notion of the Digital Music Object, analogous to the Research Object, for rich capture, sharing and reuse of both process and content.

Citation: de Roure, D, Klyne, G, Page, KR et al., (2016). Plans and performances: Parallels in the production of science and music.


Toward the Geoscience Paper of the Future: Best practices for documenting and sharing research from data to software to provenance

Authors: Yolanda Gil, Cedric H. David, Ibrahim Demir, Bakinam T. Essawy, Robinson W. Fulweiler, Jonathan L. Goodall, Leif Karlstrom, Huikyo Lee, Heath J. Mills, Ji-Hyun Oh, Suzanne A. Pierce, Allen Pope, Mimi W. Tzeng, Sandra R. Villamizar, Xuan Yu

Abstract: Geoscientists now live in a world rich with digital data and methods, and their computational research cannot be fully captured in traditional publications. The Geoscience Paper of the Future (GPF) presents an approach to fully document, share, and cite all their research products including data, software, and computational provenance. This article proposes best practices for GPF authors to make data, software, and methods openly accessible, citable, and well documented. The publication of digital objects empowers scientists to manage their research products as valuable scientific assets in an open and transparent way that enables broader access by other scientists, students, decision makers, and the public. Improving documentation and dissemination of research will accelerate the pace of scientific discovery by improving the ability of others to build upon published work.

Citation: Gil, Y., et al. (2016). Toward the Geoscience Paper of the Future: Best practices for documenting and sharing research from data to software to provenance. Earth and Space Science, 3, 388-415.


Structuring supplemental materials in support of reproducibility

Authors: Dov Greenbaum et al.

Abstract: Supplements are increasingly important to the scientific record, particularly in genomics. However, they are often underutilized. Optimally, supplements should make results findable, accessible, interoperable, and reusable (i.e., “FAIR”). Moreover, properly off-loading to them the data and detail in a paper could make the main text more readable. We propose a hierarchical organization for supplements, with some parts paralleling and “shadowing” the main text and other elements branching off from it, and we suggest a specific formatting to make this structure explicit. Furthermore, sections of the supplement could be presented in multiple scientific “dialects”, including machine-readable and lay-friendly formats.

Citation: Greenbaum, Dov, et al. 2017. Structuring supplemental materials in support of reproducibility. Genome Biology 18:64. doi:10.1186/s13059-017-1205-3.


A Brief History of Archiving in Language Documentation, with an Annotated Bibliography

Authors: Ryan Henke and Andrea L. Berez-Kroeker

Abstract: We survey the history of practices, theories, and trends in archiving for the purposes of language documentation and endangered language conservation. We identify four major periods in the history of such archiving. First, a period from before the time of Boas and Sapir until the early 1990s, in which analog materials were collected and deposited into physical repositories that were not easily accessible to many researchers or speaker communities. A second period began in the 1990s, when increased attention to language endangerment and the development of modern documentary linguistics engendered a renewed and redefined focus on archiving and an embrace of digital technology. A third period took shape in the early twenty-first century, where technological advancements and efforts to develop standards of practice met with important critiques. Finally, in the current period, conversations have arisen toward participatory models for archiving, which break traditional boundaries to expand the audiences and uses for archives while involving speaker communities directly in the archival process. Following the article, we provide an annotated bibliography of 85 publications from the literature surrounding archiving in documentary linguistics. This bibliography contains cornerstone contributions to theory and practice, and it also includes pieces that embody conversations representative of particular historical periods.

Citation: Henke, Ryan and Andrea L. Berez-Kroeker. 2016. A Brief History of Archiving in Language Documentation, with an Annotated Bibliography. Language Documentation & Conservation 10. 411-457.


The Digital Archiving of Endangered Language Oral Traditions: Kaipuleohone at the University of Hawai‘i and C’ek’aedi Hwnax in Alaska

Author: Andrea L. Berez

Abstract: This essay compares and contrasts two small-scale digital endangered language archives with regard to their relevance for oral tradition research. The first is a university-based archive curated at the University of Hawai‘i, which is designed to house endangered language materials arising from the fieldwork of university researchers. The second is an indigenously-administered archive in rural Alaska that serves the language maintenance needs of the Ahtna Athabaskan Alaska Native community.

Citation: Berez, Andrea L. “The Digital Archiving of Endangered Language Oral Traditions: Kaipuleohone at the University of Hawai‘i and C’ek’aedi Hwnax in Alaska.” Oral Tradition 28.2 (2013).