Defect Prediction Metrics for Infrastructure as Code Scripts in DevOps

Akond Rahman, Jonathan Stallings, and Laurie Williams. International Conference on Software Engineering (ICSE), 2018. Pre-print.

Infrastructure as code (IaC) scripts help software teams manage their configuration and infrastructure automatically. Information technology (IT) organizations use IaC scripts to create and manage automated deployment pipelines that deliver services rapidly. IaC scripts can be defective, with dire consequences such as wide-scale service outages for end-users. Predicting which IaC scripts are defective can help teams mitigate defects by prioritizing their inspection efforts. The goal of this paper is to help software practitioners prioritize their inspection efforts for IaC scripts by proposing metrics for defect prediction models. IaC scripts use domain-specific languages (DSLs) that are fundamentally different from object-oriented programming (OOP) languages. Hence, the OOP-based metrics that researchers have used in defect prediction might not be applicable to IaC scripts. We apply Constructivist Grounded Theory (CGT) to defect-related commits mined from version control systems to identify metrics suitable for IaC scripts. By applying CGT, we identify 18 metrics. Of these, 13 are related to IaC, for example, the count of string occurrences in a script. Four of the identified metrics are related to churn, and one metric is lines of code.
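
To make two of the named metrics concrete, the following is a minimal sketch of how lines of code and the count of string occurrences might be computed for a single script. The abstract does not define how either metric is measured, so the details here are assumptions: the function name `script_metrics`, the regular expression for quoted literals, the treatment of `#` lines as comments, and the Puppet-style sample are all hypothetical and are not taken from the paper.

```python
import re

# Assumption: a "string occurrence" is a single- or double-quoted literal,
# with backslash escapes allowed inside the quotes.
STRING_LITERAL = re.compile(r'"(?:\\.|[^"\\])*"' + r"|'(?:\\.|[^'\\])*'")


def script_metrics(source: str) -> dict:
    """Return illustrative size and string-count metrics for one script."""
    # Assumption: blank lines and lines starting with '#' are not code.
    code_lines = [
        line for line in source.splitlines()
        if line.strip() and not line.strip().startswith("#")
    ]
    # Count quoted literals on the remaining code lines only.
    string_count = sum(len(STRING_LITERAL.findall(line)) for line in code_lines)
    return {"lines_of_code": len(code_lines), "string_count": string_count}


# Hypothetical Puppet-style snippet, used only as sample input.
SAMPLE = """\
# Install and start a web server
package { 'nginx':
  ensure => 'installed',
}

service { 'nginx':
  ensure => 'running',
  enable => true,
}
"""

if __name__ == "__main__":
    print(script_metrics(SAMPLE))
```

Churn metrics are not sketched here because they depend on version control history rather than on the contents of a single script.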