OpenDORS: A dataset of openly referenced open research software
By: Stephan Druskat, Lars Grunske
Potential Business Impact:
Gives researchers a large dataset for studying how research software is developed, which can help scientists build better tools for discoveries.
In many academic disciplines, software is created during the research process or for a research purpose. The crucial role of software for research is increasingly acknowledged. The application of software engineering to research software has been formalized as research software engineering (RSE), with the aim of creating better software that enables better research. Despite this, large-scale studies of research software and its development are still lacking. To enable such studies, we present a dataset of 134,352 unique open research software projects and 134,154 source code repositories referenced in open access literature. Each dataset record identifies the referencing publication and lists the source code repositories of the software project. For 122,425 source code repositories, the dataset provides metadata on latest versions, license information, programming languages, and descriptive metadata files. We summarize the distributions of these features in the dataset and describe additional software metadata that will extend the dataset in future work. Finally, we suggest examples of research that could use the dataset to develop a better understanding of research software practice in RSE research.
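As a rough illustration of the record structure described in the abstract, the following Python sketch models a project record that points to its referencing publication and its repositories, then summarizes one feature (licenses). Every class name, field name, and sample value here is a hypothetical assumption for illustration; the abstract does not specify the dataset's actual schema or file format.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Repository:
    # Hypothetical fields mirroring the metadata types named in the abstract.
    url: str
    latest_version: Optional[str] = None
    license: Optional[str] = None
    languages: List[str] = field(default_factory=list)
    metadata_files: List[str] = field(default_factory=list)


@dataclass
class ProjectRecord:
    # Identifier of the referencing publication (assumed here to be a DOI-like string).
    publication_id: str
    repositories: List[Repository] = field(default_factory=list)


def license_distribution(records: List[ProjectRecord]) -> Counter:
    """Count license values over all repositories; None means no license metadata."""
    return Counter(repo.license for rec in records for repo in rec.repositories)


def metadata_coverage(with_metadata: int, total: int) -> float:
    """Fraction of repositories for which metadata is available."""
    return with_metadata / total


# Made-up sample records, not taken from the dataset.
records = [
    ProjectRecord(
        publication_id="10.0000/example-1",
        repositories=[
            Repository(url="https://example.org/repo-a", license="MIT", languages=["Python"]),
            Repository(url="https://example.org/repo-b"),
        ],
    ),
]

print(license_distribution(records))
# Share of repositories with metadata, using the counts stated in the abstract.
print(f"{metadata_coverage(122_425, 134_154):.1%}")
```

Using the counts from the abstract, the last line shows that metadata is available for roughly 91% of the 134,154 repositories.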
Similar Papers
Evaluating Software Supply Chain Security in Research Software
Software Engineering
Makes science software safer from hackers.
SQuaD: The Software Quality Dataset
Software Engineering
Helps find software problems faster and earlier.
Scientific Open-Source Software Is Less Likely to Become Abandoned Than One Might Think! Lessons from Curating a Catalog of Maintained Scientific Software
Software Engineering
Helps science software last much longer.