Discovering Scholarly Orphans Using ORCID

Authors: Martin Klein, Herbert Van de Sompel

Abstract: Archival efforts such as (C)LOCKSS and Portico are in place to ensure the longevity of traditional scholarly resources like journal articles. At the same time, researchers are depositing a broad variety of other scholarly artifacts into emerging online portals that are designed to support web-based scholarship. These web-native scholarly objects are largely neglected by current archival practices and hence they become scholarly orphans. We therefore argue for a novel paradigm that is tailored towards archiving these scholarly orphans. We are investigating the feasibility of using Open Researcher and Contributor ID (ORCID) as a supporting infrastructure for the process of discovery of web identities and scholarly orphans for active researchers. We analyze ORCID in terms of coverage of researchers, subjects, and location and assess the richness of its profiles in terms of web identities and scholarly artifacts. We find that ORCID currently lacks in all considered aspects and hence can only be considered in conjunction with other discovery sources. However, ORCID is growing fast so there is potential that it could achieve a satisfactory level of coverage and richness in the near future.

Citation: Martin Klein and Herbert Van de Sompel. 2016. Discovering Scholarly Orphans Using ORCID. In Proceedings of ACM Conference, Washington, DC, USA, July 2017 (Conference’17), 10 pages.


arXiv e-prints and the journal of record: An analysis of roles and relationships

Authors: Vincent Larivière, Cassidy R. Sugimoto, Benoit Macaluso, Staša Milojević, Blaise Cronin, and Mike Thelwall

Abstract: Since its creation in 1991, arXiv has become central to the diffusion of research in a number of fields. Combining data from the entirety of arXiv and the Web of Science (WoS), this paper investigates (a) the proportion of papers across all disciplines that are on arXiv and the proportion of arXiv papers that are in the WoS, (b) elapsed time between arXiv submission and journal publication, and (c) the aging characteristics and scientific impact of arXiv e-prints and their published version. It shows that the proportion of WoS papers found on arXiv varies across the specialties of physics and mathematics, and that only a few specialties make extensive use of the repository. Elapsed time between arXiv submission and journal publication has shortened but remains longer in mathematics than in physics. In physics, mathematics, as well as in astronomy and astrophysics, arXiv versions are cited more promptly and decay faster than WoS papers. The arXiv versions of papers — both published and unpublished — have lower citation rates than published papers, although there is almost no difference in the impact of the arXiv versions of both published and unpublished papers.

Citation: Larivière, V., Sugimoto, C. R., Macaluso, B., Milojević, S., Cronin, B. and Thelwall, M. (2014), arXiv E-prints and the journal of record: An analysis of roles and relationships. J Assn Inf Sci Tec, 65: 1157–1169. doi:10.1002/asi.23044, arXiv:1306.3261


Adapting sentiment analysis for tweets linking to scientific papers

Authors: Natalie Friedrich, Timothy D. Bowman, Wolfgang G. Stock, Stefanie Haustein

Abstract: In the context of altmetrics, tweets have been discussed as potential indicators of immediate and broader societal impact of scientific documents. However, it is not yet clear to what extent Twitter captures actual research impact. A small case study (Thelwall et al., 2013b) suggests that tweets to journal articles neither comment on nor express any sentiments towards the publication, which suggests that tweets merely disseminate bibliographic information, often even automatically. This study analyses the sentiments of tweets for a large representative set of scientific papers by specifically adapting different methods to academic articles distributed on Twitter. Results will help to improve the understanding of Twitter’s role in scholarly communication and the meaning of tweets as impact metrics.

Citation: Natalie Friedrich, Timothy D. Bowman, Wolfgang G. Stock, Stefanie Haustein. (2015). Adapting sentiment analysis for tweets linking to scientific papers. arXiv


The role of handbooks in knowledge creation and diffusion: A case of science and technology studies

Authors: Staša Milojević, Cassidy R. Sugimoto, Vincent Larivière, Mike Thelwall, Ying Ding

Abstract: Genre is considered to be an important element in scholarly communication and in the practice of scientific disciplines. However, scientometric studies have typically focused on a single genre, the journal article. The goal of this study is to understand the role that handbooks play in knowledge creation and diffusion and their relationship with the genre of journal articles, particularly in highly interdisciplinary and emergent social science and humanities disciplines. To shed light on these questions we focused on handbooks and journal articles published over the last four decades belonging to the research area of Science and Technology Studies (STS), broadly defined. To get a detailed picture we used the full-text of five handbooks (500,000 words) and a well-defined set of 11,700 STS articles. We confirmed the methodological split of STS into qualitative and quantitative (scientometric) approaches. Even when the two traditions explore similar topics (e.g., science and gender) they approach them from different starting points. The change in cognitive foci in both handbooks and articles partially reflects the changing trends in STS research, often driven by technology. Using text similarity measures we found that, in the case of STS, handbooks play no special role in either focusing the research efforts or marking their decline. In general, they do not represent the summaries of research directions that have emerged since the previous edition of the handbook.

Citation: Staša Milojević, Cassidy R. Sugimoto, Vincent Larivière, Mike Thelwall, Ying Ding. (2014). The role of handbooks in knowledge creation and diffusion: A case of science and technology studies. arXiv


Tweets as impact indicators: Examining the implications of automated bot accounts on Twitter

Authors: Stefanie Haustein, Timothy D. Bowman, Kim Holmberg, Andrew Tsou, Cassidy R. Sugimoto, Vincent Larivière

Abstract: This brief communication presents preliminary findings on automated Twitter accounts distributing links to scientific papers deposited on the preprint repository arXiv. It discusses the implication of the presence of such bots from the perspective of social media metrics (altmetrics), where mentions of scholarly documents on Twitter have been suggested as a means of measuring impact that is both broader and timelier than citations. We present preliminary findings that automated Twitter accounts create a considerable amount of tweets to scientific papers and that they behave differently than common social bots, which has critical implications for the use of raw tweet counts in research evaluation and assessment. We discuss some definitions of Twitter cyborgs and bots in scholarly communication and propose differentiating between different levels of engagement from tweeting only bibliographic information to discussing or commenting on the content of a paper.

Citation: Stefanie Haustein, Timothy D. Bowman, Kim Holmberg, Andrew Tsou, Cassidy R. Sugimoto, Vincent Larivière. (2014). Tweets as impact indicators: Examining the implications of automated bot accounts on Twitter. arXiv


Linking Mathematical Software in Web Archives

Authors: Helge Holzmann, Mila Runnwerth, Wolfram Sperber

Abstract: The Web is our primary source of all kinds of information today. This includes information about software as well as associated materials, like source code, documentation, related publications and change logs. Such data is of particular importance in research in order to conduct, comprehend and reconstruct scientific experiments that involve software. swMATH, a mathematical software directory, attempts to identify software mentions in scientific articles and provides additional information as well as links to the Web. However, just like software itself, the Web is dynamic and most likely the information on the Web has changed since it was referenced in a scientific publication. Therefore, it is crucial to preserve the resources of a software on the Web to capture its states over time.

We found that around 40% of the websites in swMATH are already included in an existing Web archive. Out of these, 60% contain some kind of documentation and around 45% even provide downloads of software artifacts. Hence, already today links can be established based on the publication dates of corresponding articles. The contained data enable enriching existing information with a temporal dimension. In the future, specialized infrastructure will improve the coverage of software resources and allow explicit references in scientific publications.

Citation: Helge Holzmann, Mila Runnwerth, Wolfram Sperber. (2017). Linking Mathematical Software in Web Archives. arXiv


Single versus Double Blind Reviewing at WSDM 2017

Authors: Andrew Tomkins, Min Zhang, William D. Heavlin

Abstract: In this paper we study the implications for conference program committees of adopting single-blind reviewing, in which committee members are aware of the names and affiliations of paper authors, versus double-blind reviewing, in which this information is not visible to committee members. WSDM 2017, the 10th ACM International Conference on Web Search and Data Mining, performed a controlled experiment in which each paper was reviewed by four committee members. Two of these four reviewers were chosen from a pool of committee members who had access to author information; the other two were chosen from a disjoint pool who did not have access to this information. This information asymmetry persisted through the process of bidding for papers, reviewing papers, and entering scores. Reviewers in the single-blind condition typically bid for 26% more papers, and bid preferentially for papers from top institutions. Once papers were allocated to reviewers, single-blind reviewers were significantly more likely than their double-blind counterparts to recommend for acceptance papers from famous authors and top institutions. In each case, the estimated odds multiplier is around $1.5\times$, so the result is quite strong. We did not, however, see differences in bidding or reviewing behavior between single-blind and double-blind reviewers for papers with female authors. We describe our findings in detail and offer some recommendations.

Citation: Andrew Tomkins, Min Zhang, William D. Heavlin. (2017). Single versus Double Blind Reviewing at WSDM 2017. arXiv
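
Note on the odds multiplier: the abstract above reports an estimated odds multiplier of about 1.5 but gives no baseline acceptance rates. The sketch below only illustrates the arithmetic of applying such a multiplier to a probability. The function name and the 20% baseline are hypothetical and are not taken from the paper.

```python
def apply_odds_multiplier(p_baseline: float, multiplier: float) -> float:
    """Return the probability obtained by scaling the odds of p_baseline.

    Hypothetical helper for illustration; not the authors' method.
    """
    if not 0.0 < p_baseline < 1.0:
        raise ValueError("p_baseline must lie strictly between 0 and 1")
    odds = p_baseline / (1.0 - p_baseline)   # probability -> odds
    new_odds = odds * multiplier             # scale the odds
    return new_odds / (1.0 + new_odds)       # odds -> probability


if __name__ == "__main__":
    # Assumed baseline of 20%, chosen only to make the arithmetic concrete.
    print(round(apply_odds_multiplier(0.20, 1.5), 4))
```

Because the multiplier acts on odds rather than on probabilities, a factor of 1.5 does not mean a 50% higher probability; the size of the shift depends on the baseline.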



Archiving Software Surrogates on the Web for Future Reference

Authors: Helge Holzmann, Wolfram Sperber, Mila Runnwerth

Abstract: Software has long been established as an essential aspect of the scientific process in mathematics and other disciplines. However, reliably referencing software in scientific publications is still challenging for various reasons. A crucial factor is that software dynamics with temporal versions or states are difficult to capture over time. We propose to archive and reference surrogates instead, which can be found on the Web and reflect the actual software to a remarkable extent. Our study shows that about half of the webpages of software are already archived, with almost all of them including some kind of documentation.

Citation: Holzmann H, Sperber W & Runnwerth M. (2017). Archiving Software Surrogates on the Web for Future Reference. arXiv



How many scientific papers are mentioned in policy-related documents?

Authors: Robin Haunschild, Lutz Bornmann

Abstract: In this short communication, we provide an overview of a relatively newly provided source of altmetrics data which could possibly be used for societal impact measurements in scientometrics. Recently, Altmetric – a start-up providing publication level metrics – started to make data for publications available which have been mentioned in policy-related documents. Using data from Altmetric, we study how many papers indexed in the Web of Science (WoS) are mentioned in policy-related documents. We find that less than 0.5% of the papers published in different subject categories are mentioned at least once in policy-related documents. Based on our results, we recommend that the analysis of (WoS) publications with at least one policy-related mention is repeated regularly (annually). Mentions in policy-related documents should not be used for impact measurement until new policy-related sites are tracked.

Citation: Haunschild R & Bornmann L. (2016). How many scientific papers are mentioned in policy-related documents? An empirical investigation using Web of Science and Altmetric data. Preprint.


The sum of it all: revealing collaboration patterns by combining authorship and acknowledgements

Authors: Adele Paul-Hus, Philippe Mongeon, Maxime Sainte-Marie, Vincent Larivière

Abstract: Acknowledgments are one of many conventions by which researchers publicly bestow recognition towards individuals, organizations and institutions that contributed in some way to the work that led to publication. Combining data on both co-authors and acknowledged individuals, the present study analyses disciplinary differences in researchers' credit attribution practices in collaborative contexts. Our results show that the important differences traditionally observed between disciplines in terms of team size are greatly reduced when acknowledgees are taken into account. Broadening the measurement of collaboration beyond co-authorship by including individuals credited in the acknowledgements allows for an assessment of collaboration practices and team work that might be closer to the reality of contemporary research, especially in the social sciences and humanities.

Citation: Adele Paul-Hus, Philippe Mongeon, Maxime Sainte-Marie, Vincent Larivière. (2016). The sum of it all: revealing collaboration patterns by combining authorship and acknowledgements. Journal of Informetrics. doi: