The Radical Collaboration of RDA and What It Means for Developing Institutional Data Management Services

Author: Amy Nurnberger

Abstract: The Research Data Alliance (RDA) is an organization dedicated to
reducing barriers to data sharing and exchange. While there are many
technical barriers that must still be surmounted, it is a core principle of
RDA that technical impediments are not the only ones. Often the more
challenging barriers are the less visible social roadblocks and those
blockades constructed at the intersections of the technical and the
social. In my experience in developing and working in institutional data
management services, these services are also dedicated to easing the
way to data sharing and are likewise subject to a similar set of barriers.
The connections between how RDA works, how data management
services develop in institutions, and how radical collaboration happens
may map out a route to more successful service development practices.

Citation: Amy Nurnberger. “The Radical Collaboration of RDA and What It Means for Developing Institutional Data Management Services.” Research Library Issues, no. 296 (2018): 23–32.



Source: Research Library Issues

Motivation and Strategies for Implementing Digital Object Identifiers (DOIs) at NCAR’s Earth Observing Laboratory – Past Progress and Future Collaborations

Authors: Janine Aquino, John Allison, Robert Rilling, Don Stott, Kathryn Young, Michael Daniels

Abstract: In an effort to lead our community in following modern data citation practices by formally citing data used in published research and implementing standards to facilitate reproducible research results and data, while also producing meaningful metrics that help assess the impact of our services, the National Center for Atmospheric Research (NCAR) Earth Observing Laboratory (EOL) has implemented the use of Digital Object Identifiers (DOIs) (DataCite 2017) for both physical objects (e.g., research platforms and instruments) and datasets. We discuss why this work is important and timely, and review the development of guidelines for the use of DOIs at EOL by focusing on how decisions were made. We discuss progress in assigning DOIs to physical objects and datasets, summarize plans to cite software, describe a current collaboration to develop community tools to display citations on websites, and touch on future plans to cite workflows that document dataset processing and quality control. Finally, we will review the status of efforts to engage our scientific community in the process of using DOIs in their research publications.

Citation: Aquino, J., et al. (2017). Motivation and Strategies for Implementing Digital Object Identifiers (DOIs) at NCAR’s Earth Observing Laboratory – Past Progress and Future Collaborations. Data Science Journal. 16, p.7. DOI:


Source: Data Science Journal

Data We Trust—But What Data?

Author: Jennifer Golbeck

Abstract: The Obama administration’s time saw massive amounts of government data shifting online. It can be hard to remember the landscape back in 2008, when very few people had smartphones, and Facebook had fewer than 150 million users—less than 10 percent of its current size. We were just starting to grapple with all the data that was becoming available. The administration embraced the trend. They launched, a project designed to serve as a repository of important data sets from the federal government. Agencies followed suit, uploading their data or creating their own repositories. Databases, websites, and all sorts of content became accessible online. It appeared we were entering a golden age of open data, where citizens would have access to the raw data that their tax dollars funded, that fueled policy decisions, and that affected their lives. The movement of government data to the web improved transparency and fueled research to complement official sources.

Citation: Golbeck, Jennifer. “Data We Trust—But What Data?” Reference & User Services Quarterly 57, no. 3 (March 16, 2018): 196–99.


Source: Reference & User Services Quarterly

Reproducibility Librarianship

Author: Vicky Steeves

Abstract: Over the past few years, research reproducibility has been increasingly highlighted as a multifaceted challenge across many disciplines. There are socio-cultural obstacles as well as a constantly changing technical landscape that make replicating and reproducing research extremely difficult. Researchers face challenges in reproducing research across different operating systems and different versions of software, to name just a few of the many technical barriers. The prioritization of citation counts and journal prestige has undermined incentives to make research reproducible.

While libraries have been building support around research data management and digital scholarship, reproducibility is an emerging area that has yet to be systematically addressed. To respond to this, New York University (NYU) created the position of Librarian for Research Data Management and Reproducibility (RDM & R), a dual appointment between the Center for Data Science (CDS) and the Division of Libraries. This report will outline the role of the RDM & R librarian, paying close attention to the collaboration between the CDS and Libraries to bring reproducible research practices into the norm.

Citation: Steeves, Vicky. “Reproducibility Librarianship.” Collaborative Librarianship 9, no. 2 (2017): 80-89.


Research data management and services: Resources for novice data librarians

Authors: Sarah Barbrow, Denise Brush, and Julie Goldman

Abstract: Research in many academic fields today generates large amounts of data. These data not only must be processed and analyzed by the researchers, but also managed throughout the data life cycle. Recently, some academic libraries have begun to offer research data management (RDM) services to their communities. Often, this service starts with helping faculty write data management plans, now required by many federal granting agencies. Libraries with more developed services may work with researchers as they decide how to archive and share data once the grant work is complete.

Citation: Barbrow, S., Brush, D., & Goldman, J. (2017). Research data management and services: Resources for novice data librarians. College & Research Libraries News, 78(5), 274.


Advancing research data publishing practices for the social sciences: from archive activity to empowering researchers

Authors: Veerle Van den Eynden, Louise Corti

Abstract: Sharing and publishing social science research data have a long history in the UK, through long-standing agreements with government agencies for sharing survey data and the data policy, infrastructure, and data services supported by the Economic and Social Research Council. The UK Data Service and its predecessors developed data management, documentation, and publishing procedures and protocols that stand today as robust templates for data publishing. As the ESRC research data policy requires grant holders to submit their research data to the UK Data Service after a grant ends, setting standards and promoting them has been essential in raising the quality of the resulting research data being published. In the past, received data were all processed, documented, and published for reuse in-house. Recent investments have focused on guiding and training researchers in good data management practices and skills for creating shareable data, as well as a self-publishing repository system, ReShare. ReShare also receives data sets described in published data papers and achieves scientific quality assurance through peer review of submitted data sets before publication. Social science data are reused for research, to inform policy, in teaching and for methods learning. Over a 10-year period, responsive developments in system workflows, access control options, persistent identifiers, templates, and checks, together with targeted guidance for researchers, have helped raise the standard of self-publishing social science data. Lessons learned and developments in shifting publishing social science data from an archivist responsibility to a researcher process are showcased, as inspiration for institutions setting up a data repository.

Citation: Van den Eynden, V. & Corti, L. (2017). Advancing research data publishing practices for the social sciences: from archive activity to empowering researchers. Int J Digit Libr 18: 113. doi:10.1007/s00799-016-0177-3


Toward the Geoscience Paper of the Future: Best practices for documenting and sharing research from data to software to provenance

Authors: Yolanda Gil, Cedric H. David, Ibrahim Demir, Bakinam T. Essawy, Robinson W. Fulweiler, Jonathan L. Goodall, Leif Karlstrom, Huikyo Lee, Heath J. Mills, Ji-Hyun Oh, Suzanne A. Pierce, Allen Pope, Mimi W. Tzeng, Sandra R. Villamizar, Xuan Yu

Abstract: Geoscientists now live in a world rich with digital data and methods, and their computational research cannot be fully captured in traditional publications. The Geoscience Paper of the Future (GPF) presents an approach to fully document, share, and cite all their research products including data, software, and computational provenance. This article proposes best practices for GPF authors to make data, software, and methods openly accessible, citable, and well documented. The publication of digital objects empowers scientists to manage their research products as valuable scientific assets in an open and transparent way that enables broader access by other scientists, students, decision makers, and the public. Improving documentation and dissemination of research will accelerate the pace of scientific discovery by improving the ability of others to build upon published work.

Citation: Gil, Y., et al. (2016). Toward the Geoscience Paper of the Future: Best practices for documenting and sharing research from data to software to provenance. Earth and Space Science, 3, 388-415.


Assessing Research Data Management Practices of Faculty at Carnegie Mellon University

Authors: Steve Van Tuyl, Gabrielle Michalek

Abstract: INTRODUCTION Recent changes to requirements for research data management by federal granting agencies and by other funding institutions have resulted in the emergence of institutional support for these requirements. At CMU, we sought to formalize assessment of research data management practices of researchers at the institution by launching a faculty survey and conducting a number of interviews with researchers. METHODS We submitted a survey on research data management practices to a sample of faculty including questions about data production, documentation, management, and sharing practices. The survey was coupled with in-depth interviews with a subset of faculty. We also make estimates of the amount of research data produced by faculty. RESULTS Survey and interview results suggest a moderate level of awareness of the regulatory environment around research data management. Results also present a clear picture of the types and quantities of data being produced at CMU and how these differ among research domains. Researchers identified a number of services that they would find valuable including assistance with data management planning and backup/storage services. We attempt to estimate the amount of data produced and shared by researchers at CMU. DISCUSSION Results suggest that researchers may need and are amenable to assistance with research data management. Our estimates of the amount of data produced and shared have implications for decisions about data storage and preservation. CONCLUSION Our survey and interview results have offered significant guidance for building a suite of services for our institution.

Citation: Van Tuyl, S. & Michalek, G. (2015). Assessing Research Data Management Practices of Faculty at Carnegie Mellon University. Journal of Librarianship and Scholarly Communication. 3(3), p.eP1258. DOI:


Adopting a Distributed Model for Data Services

Authors: Casey Gibbs, Marcos Hernandez, Pongracz Sennyey

Abstract: This article describes how the Saint Edward’s University Library implemented a distributed model for the Institutional Repository. Based on cloud-based platforms and APIs, the Library has created an Institutional Repository that is scalable and modular, considerably lowering its implementation and maintenance costs, while lowering its technical complexity.

Citation: Casey Gibbs, Marcos Hernandez, Pongracz Sennyey. (2017). Adopting a Distributed Model for Data Services. Code4Lib Journal. Issue 35.


A Data Citation Roadmap for Scholarly Data Repositories

Authors: Martin Fenner, Merce Crosas, Jeffrey S. Grethe, David Kennedy, Henning Hermjakob, Philippe Rocca-Serra, Robin Berjon, Sebastian Karcher, Maryann Martone, Tim Clark


Abstract: This article presents a practical roadmap for scholarly data repositories to implement data citation in accordance with the Joint Declaration of Data Citation Principles (Data Citation Synthesis Group, 2014), a synopsis and harmonization of the recommendations of major science policy bodies. The roadmap was developed by the Repositories Early Adopters Expert Group, part of the Data Citation Implementation Pilot (DCIP) project (FORCE11, 2015), an initiative of and the NIH BioCADDIE (2016) program. The roadmap makes 11 specific recommendations, grouped into three phases of implementation: a) required steps needed to support the Joint Declaration of Data Citation Principles, b) recommended steps that facilitate article/data publication workflows, and c) optional steps that further improve data citation support provided by data repositories.


Citation: Fenner, M., Crosas, M., Grethe, J., Kennedy, D., Hermjakob, H., Rocca-Serra, P., … Clark, T. (2016). A Data Citation Roadmap for Scholarly Data Repositories. bioRxiv.