A Curated Dataset of Security Defects in Scientific Software Projects

Justin Murphy, Elias Brady, Shazibul Islam Shamim, and Akond Rahman, in 7th Annual Hot Topics in the Science of Security (HoTSoS) Symposium, 2020. Pre-print.

The cybersecurity research community might benefit from a curated dataset in which commits mined from scientific software projects are labeled as security defects. We constructed such a dataset by mining 7,024 commits from 20 scientific software projects. The dataset can help cybersecurity researchers in two ways: (i) conducting security defect categorization and prediction research; and (ii) finding undiscovered security defects in scientific software projects.