Reproducible research in descriptive linguistics: integrating archiving and citation into the postgraduate curriculum at the University of Hawai’i at Mānoa

Author: Andrea L. Berez

Abstract: The notion of reproducible research has received considerable attention in recent years from physical scientists, life scientists, social and behavioural scientists, and computational scientists. Some readers will be familiar with the criterion of replicability as a tenet of good execution of the scientific method, in which sound scientific experiments or studies are those that can be recreated elsewhere leading to new data, and in which sound scientific claims are those that are confirmed by the new data in a replicated study.

Citation: Berez, A. (2015). Reproducible research in descriptive linguistics: integrating archiving and citation into the postgraduate curriculum at the University of Hawai’i at Mānoa. In A. Harris, N. Thieberger & L. Barwick (Eds.) ‘Research, records and responsibility: ten years of PARADISEC’ (pp. 39-51). Sydney: Sydney University Press.


Geographic variation in social media metrics: an analysis of Latin American journal articles

Author: Juan Pablo Alperin

Purpose: This study aims to contribute to the understanding of how the potential of altmetrics varies around the world by measuring the percentage of articles with non-zero metrics (coverage) for articles published from a developing region (Latin America).

Design/methodology/approach: This study uses article metadata from a prominent Latin American journal portal, SciELO, and combines it with altmetrics data from and with data collected by author-written scripts. The study is primarily descriptive, focusing on coverage levels disaggregated by year, country, subject area, and language.

Findings: Coverage levels for most of the social media sources studied were zero or negligible. Only three metrics had coverage levels above 2%: Mendeley, Twitter, and Facebook. Of these, Twitter showed the most significant differences with previous studies. Mendeley coverage levels reach those found by previous studies, but it takes up to two years longer for articles to be saved in the reference manager. For the most recent year, coverage was less than half of what was found in previous studies. The coverage levels of Facebook appear similar (around 3%) to those of previous studies.

Research limitations/implications: The data used for some of the analyses was collected over a six-month period. For other analyses, data was only available for a single country (Brazil).

Originality/value: The results of this study have implications for the altmetrics research community and for any stakeholders interested in using altmetrics for evaluation. It suggests the need for careful sample selection when wishing to make generalizable claims about altmetrics.

Citation: Juan Pablo Alperin, (2015) “Geographic variation in social media metrics: an analysis of Latin American journal articles”, Aslib Journal of Information Management, Vol. 67 Issue: 3, pp.289-304, doi: 10.1108/AJIM-12-2014-0176


Adapting sentiment analysis for tweets linking to scientific papers

Authors: Natalie Friedrich, Timothy D. Bowman, Wolfgang G. Stock, Stefanie Haustein

Abstract: In the context of altmetrics, tweets have been discussed as potential indicators of immediate and broader societal impact of scientific documents. However, it is not yet clear to what extent Twitter captures actual research impact. A small case study (Thelwall et al., 2013b) suggests that tweets to journal articles neither comment on nor express any sentiments towards the publication, which suggests that tweets merely disseminate bibliographic information, often even automatically. This study analyses the sentiments of tweets for a large representative set of scientific papers by specifically adapting different methods to academic articles distributed on Twitter. Results will help to improve the understanding of Twitter’s role in scholarly communication and the meaning of tweets as impact metrics.

Citation: Natalie Friedrich, Timothy D. Bowman, Wolfgang G. Stock, Stefanie Haustein. (2015). Adapting sentiment analysis for tweets linking to scientific papers. arXiv


Assessing Research Data Management Practices of Faculty at Carnegie Mellon University

Authors: Steve Van Tuyl, Gabrielle Michalek

Abstract: Introduction: Recent changes to requirements for research data management by federal granting agencies and by other funding institutions have resulted in the emergence of institutional support for these requirements. At CMU, we sought to formalize assessment of research data management practices of researchers at the institution by launching a faculty survey and conducting a number of interviews with researchers.

Methods: We submitted a survey on research data management practices to a sample of faculty including questions about data production, documentation, management, and sharing practices. The survey was coupled with in-depth interviews with a subset of faculty. We also make estimates of the amount of research data produced by faculty.

Results: Survey and interview results suggest a moderate level of awareness of the regulatory environment around research data management. Results also present a clear picture of the types and quantities of data being produced at CMU and how these differ among research domains. Researchers identified a number of services that they would find valuable including assistance with data management planning and backup/storage services. We attempt to estimate the amount of data produced and shared by researchers at CMU.

Discussion: Results suggest that researchers may need and are amenable to assistance with research data management. Our estimates of the amount of data produced and shared have implications for decisions about data storage and preservation.

Conclusion: Our survey and interview results have offered significant guidance for building a suite of services for our institution.

Citation: Tuyl, S.V. & Michalek, G., (2015). Assessing Research Data Management Practices of Faculty at Carnegie Mellon University. Journal of Librarianship and Scholarly Communication. 3(3), p.eP1258. DOI:


“Scholarship is a Conversation”: Discourse, Attribution, and Twitter’s Role in Information Literacy Instruction

Authors: Carroll AJ and Dasler R

Abstract: When addressing scholarly attribution, citation, and plagiarism in one-shot instruction sessions, librarians often fail to present these issues in a manner that has relevance for students. Librarians often focus on intellectual honesty and the potential ramifications of plagiarism, both individual pursuits, rather than explaining that by creating an academic work, students are participating in academic discourse. Within Pluralizing Plagiarism, Anson argues that scholarly attribution instruction that emphasizes “policy, detection, and punishment” is antithetical to the mission of institutions of higher learning – the education of students (Anson, 2008). One of the major deficiencies of this compliance-based instruction is that it presents students with a false dichotomy that does not align with their authentic life experiences; plagiarism is demonstrated as a black-and-white issue, rather than existing in shades of gray. Students who have come of age within a twenty-first century information ecosystem rife with remix and parody culture will likely find teaching that presents the re-use of source material as a non-nuanced issue unconvincing. Because students respond positively to instruction that aligns with their authentic experiences, this suggests that librarians need to develop new methods for teaching attribution and scholarly discourse that not only recognize the nuance inherent to these topics, but also present these concepts within a familiar framework (Klipfel, 2014). As a familiar platform for social interaction with multiple avenues for giving credit and a shorter timescale, Twitter presents an opportunity to place attribution, plagiarism, and integrity into a humanizing, real world context that models how discourse unfolds in an authentic manner for learners. By embedding attribution instruction into a meaningful context, librarians and other educators can make substantial and much needed improvements to traditional compliance-based instruction, which is often built upon the slow, rigid, and unfamiliar patterns of how to cite scholarly works.

Citation: Carroll AJ & Dasler R. (2015). “Scholarship is a Conversation”: Discourse, Attribution, and Twitter’s Role in Information Literacy Instruction. The Journal of Creative Library Practice.


Research Data Services in Academic Libraries: Data Intensive Roles for the Future?

Authors: Carol Tenopir, Dane Hughes, Suzie Allard, Mike Frame, Ben Birch, Lynn Baird, Robert Sandusky, Madison Langseth, and Andrew Lundeen

Abstract: Objectives: The primary objectives of this study are to gauge the levels of Research Data Services (RDS) that academic libraries provide based on demographic factors, to gauge RDS growth since 2011, and to identify obstacles that may prevent expansion or growth of services.

Methods: Survey of academic institutions through a stratified random sample of ACRL library directors across the U.S. and Canada. Frequencies and chi-square analysis were applied, with some responses grouped into broader categories for analysis.

Results: Minimal to no change in which services were offered between survey years; interviews with library directors were conducted to help explain this lack of change.

Conclusion: Further analysis is forthcoming for a librarians study to help explain possible discrepancies in organizational objectives and librarian sentiments of RDS.

Citation: Tenopir, C, Hughes, D, Allard, S, Frame, M, Birch, B, Baird, L, Sandusky, R, Langseth, M, & Lundeen, A (2015) Research Data Services in Academic Libraries: Data Intensive Roles for the Future? Journal of eScience Librarianship 4(2): e1085.


Analyzing data citation practices using the Data Citation Index

Authors: Nicolas Robinson-Garcia, Evaristo Jiménez-Contreras, Daniel Torres-Salinas

Abstract: We present an analysis of data citation practices based on the Data Citation Index from Thomson Reuters. This database, launched in 2012, aims to link data sets and data studies with citations received from the other citation indexes. The DCI harvests citations to research data from papers indexed in the Web of Science. It relies on the information provided by the data repository, as data citation practices are inconsistent or nonexistent in many cases. The findings of this study show that data citation practices are far from common in most research fields. Some differences have been reported in the way researchers cite data: while in the areas of Science and Engineering and Technology data sets were the most cited, in Social Sciences and Arts and Humanities data studies play a greater role. A total of 88.1 percent of the records have received no citation, but some repositories show very low uncitedness rates. Although data citation practices are rare in most fields, they have expanded in disciplines such as crystallography and genomics. We conclude by emphasizing the role that the DCI could play in encouraging the consistent, standardized citation of research data; a role that would enhance their value as a means of following the research process from data collection to publication.

Citation: Nicolas Robinson-Garcia, Evaristo Jiménez-Contreras, Daniel Torres-Salinas. (2015). Analyzing data citation practices using the Data Citation Index. JASIST. doi:


When is an article actually published? An analysis of online availability, publication, and indexation dates

Authors: Stefanie Haustein, Timothy D. Bowman, Rodrigo Costas

Abstract: With the acceleration of scholarly communication in the digital era, the publication year is no longer a sufficient level of time aggregation for bibliometric and social media indicators. Papers are increasingly cited before they have been officially published in a journal issue and mentioned on Twitter within days of online availability. In order to find a suitable proxy for the day of online publication allowing for the computation of more accurate benchmarks and fine-grained citation and social media event windows, various dates are compared for a set of 58,896 papers published by Nature Publishing Group, PLOS, Springer and Wiley-Blackwell in 2012. Dates include the online date provided by the publishers, the month of the journal issue, the Web of Science indexing date, the date of the first tweet mentioning the paper as well as the publication and first-seen dates. Comparing these dates, the analysis reveals that large differences exist between publishers, leading to the conclusion that more transparency and standardization are needed in the reporting of publication dates. The date on which the fixed journal article (Version of Record) is first made available on the publisher’s website is proposed as a consistent definition of the online date.

Citation: Stefanie Haustein, Timothy D. Bowman, Rodrigo Costas. (2015). When is an article actually published? An analysis of online availability, publication, and indexation dates. In Proceedings of the 15th International Society of Scientometrics and Informetrics Conference (pp. 1170–1179). Istanbul, Turkey.


‘Total cost of ownership’ of scholarly communication: managing subscription and APC payments together

Author: Lawson, Stuart

Abstract: Managing subscription journals and open access charges together has created challenges which may in part be dealt with by offsetting the two revenue streams against each other. In order to do this, it is necessary to have reliable financial data about the extent of the two interacting markets. Jisc Collections has been undertaking data collection regarding universities’ article publication charge (APC) expenditure. This process is difficult without a standardized way of recording data, so Jisc Collections has developed a standard data collection template and is helping institutions to release data openly. If available data become more comprehensive and transparent, then all parties (libraries, publishers, research funders, and intermediaries) will have better knowledge of the APC market and can more accurately predict the effects of offsetting.

Citation: Lawson, S. (2015). ‘Total cost of ownership’ of scholarly communication: managing subscription and APC payments together. Learned Publishing, 28(1). doi:


The Oligopoly of Academic Publishers in the Digital Era

Authors: Larivière V, Haustein S, Mongeon P

Abstract: The consolidation of the scientific publishing industry has been the topic of much debate within and outside the scientific community, especially in relation to major publishers’ high profit margins. However, the share of scientific output published in the journals of these major publishers, as well as its evolution over time and across various disciplines, has not yet been analyzed. This paper provides such analysis, based on 45 million documents indexed in the Web of Science over the period 1973-2013. It shows that in both natural and medical sciences (NMS) and social sciences and humanities (SSH), Reed-Elsevier, Wiley-Blackwell, Springer, and Taylor & Francis increased their share of the published output, especially since the advent of the digital era (mid-1990s). Combined, the top five most prolific publishers account for more than 50% of all papers published in 2013. Disciplines of the social sciences have the highest level of concentration (70% of papers from the top five publishers), while the humanities have remained relatively independent (20% from top five publishers). NMS disciplines are in between, mainly because of the strength of their scientific societies, such as the ACS in chemistry or APS in physics. The paper also examines the migration of journals between small and big publishing houses and explores the effect of publisher change on citation impact. It concludes with a discussion on the economics of scholarly publishing.

Citation: Larivière V, Haustein S, Mongeon P (2015) The Oligopoly of Academic Publishers in the Digital Era. PLoS ONE 10(6): e0127502. doi:10.1371/journal.pone.0127502