SonarCloud Vulnerable Code Prospector for C (SVCP4C) is a tool that collects vulnerable C source code from open-source repositories linked to SonarCloud, using the SonarCloud REST API. Its output is a set of tagged files suitable for extracting features and building training datasets for Machine Learning algorithms.
Vulnerabilities are recorded in each file as comments appended at the end of the file. Each comment has the following format, where the offset is the column:
/// starting_line,starting_offset;ending_line,ending_offset
For example:
/// ###BEGIN_VULNERABLE_LINES###
/// 1126,3;1126,9
/// 1153,9;1153,15
/// 1341,9;1341,15
/// 1734,6;1734,12
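
To show how the tags might be consumed when building a dataset, here is a minimal Python sketch that reads the trailing comments back out of a tagged file. It is not part of SVCP4C: the names `VulnRange` and `parse_vulnerable_ranges` are hypothetical, and the sketch assumes that the range comments follow the last `###BEGIN_VULNERABLE_LINES###` marker and that any other line after the marker can be ignored.

```python
import re
from typing import List, NamedTuple


class VulnRange(NamedTuple):
    """One tagged span: start line/column and end line/column (hypothetical type)."""
    start_line: int
    start_offset: int
    end_line: int
    end_offset: int


# Marker line as it appears in the example above.
BEGIN_MARKER = "/// ###BEGIN_VULNERABLE_LINES###"

# Matches "/// starting_line,starting_offset;ending_line,ending_offset".
RANGE_RE = re.compile(r"^///\s*(\d+),(\d+);(\d+),(\d+)$")


def parse_vulnerable_ranges(source_text: str) -> List[VulnRange]:
    """Return the ranges listed after the last BEGIN marker, or [] if there is none."""
    lines = [line.strip() for line in source_text.splitlines()]

    # Assumption: the tags sit at the end of the file, so use the last marker.
    marker_index = -1
    for index, line in enumerate(lines):
        if line == BEGIN_MARKER:
            marker_index = index
    if marker_index == -1:
        return []

    ranges = []
    for line in lines[marker_index + 1:]:
        match = RANGE_RE.match(line)
        if match is None:
            continue  # Assumption: anything that is not a range comment is skipped.
        ranges.append(VulnRange(*(int(group) for group in match.groups())))
    return ranges
```

Calling `parse_vulnerable_ranges(open(path).read())` on a tagged file would return one `VulnRange` per comment line, which can then be used to label the corresponding lines of the C source.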
References
- Raducu, R., Esteban, G., Rodríguez Lera, F. J., & Fernández, C. (2020). Collecting Vulnerable Source Code from Open-Source Repositories for Dataset Generation. Applied Sciences, 10(4), 1270. DOI: https://doi.org/10.3390/app10041270