SVCP4C

SonarCloud Vulnerable Code Prospector for C (SVCP4C) is a tool that collects vulnerable C source code from open-source repositories linked to SonarCloud, using the SonarCloud REST API. Its output is a set of tagged files suitable for feature extraction and for building training datasets for machine learning algorithms.

[Figure: SVCP4C overview]

Each file lists its vulnerabilities in comments appended at the end of the file. Each comment has the format /// starting_line,starting_offset;ending_line,ending_offset, where an offset is a column number. For example:

/// ###BEGIN_VULNERABLE_LINES###
/// 1126,3;1126,9
/// 1153,9;1153,15
/// 1341,9;1341,15
/// 1734,6;1734,12
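
When building a dataset from the tagged files, the trailing comments have to be read back into structured spans. The following is a minimal Python sketch of that step, not part of SVCP4C itself. It assumes the marker line and span format shown in the example above. The names parse_vulnerable_spans and VulnerableSpan are hypothetical, and whether lines and offsets are 0- or 1-based is left to the consumer because the page does not state it.

```python
import re
from typing import List, NamedTuple

# Marker text taken from the example above; everything after it is
# treated as the vulnerability listing.
BEGIN_MARKER = "###BEGIN_VULNERABLE_LINES###"

# Matches "/// start_line,start_offset;end_line,end_offset".
SPAN_RE = re.compile(r"^///\s*(\d+),(\d+);(\d+),(\d+)\s*$")


class VulnerableSpan(NamedTuple):
    """Hypothetical container for one tagged vulnerability span."""
    start_line: int
    start_offset: int
    end_line: int
    end_offset: int


def parse_vulnerable_spans(source: str) -> List[VulnerableSpan]:
    """Return the spans listed after the BEGIN marker in a tagged file."""
    spans: List[VulnerableSpan] = []
    in_block = False
    for line in source.splitlines():
        stripped = line.strip()
        if not in_block:
            # Skip ordinary source lines until the marker comment appears.
            in_block = stripped.startswith("///") and BEGIN_MARKER in stripped
            continue
        match = SPAN_RE.match(stripped)
        if match:  # blank or unrelated lines inside the block are ignored
            spans.append(VulnerableSpan(*map(int, match.groups())))
    return spans


example = """int main(void) { return 0; }
/// ###BEGIN_VULNERABLE_LINES###
/// 1126,3;1126,9
/// 1153,9;1153,15
"""

for span in parse_vulnerable_spans(example):
    print(span)
```

Each returned span can then be used to label the corresponding lines of the C file as vulnerable when extracting features.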

Built With

Reference

To cite this work, please use the following BibTeX entry:

@ARTICLE{Raducu2020,
  Title     = {Collecting Vulnerable Source Code from Open-Source Repositories for Dataset Generation},
  Author    = {Raducu, Razvan and Esteban, Gonzalo and Rodr{\'i}guez Lera, Francisco Javier and Fern{\'a}ndez, Camino},
  Journal   = {Applied Sciences},
  Volume    = {10},
  Number    = {4},
  Pages     = {1270},
  Year      = {2020},
  Publisher = {Multidisciplinary Digital Publishing Institute},
  Doi       = {10.3390/app10041270},
}

License

This project is licensed under GNU GPLv3.

External links

  1. SonarCloud Vulnerable Code Prospector for C (SVCP4C), (2020), GitHub repository, https://github.com/uleroboticsgroup/SVCP4C
  2. Vulnerable Source Code Collected from Open Source Repositories for Dataset Generation, (2020), GitHub repository, https://github.com/uleroboticsgroup/SVCP4CDataset