Back to the Roots: Assessing Mining Techniques for Java Vulnerability-Contributing Commits

Publication Type
Journal Article
Date Issued
2025-09-24
Language
English
Author(s)
Hinrichs, Torge  
Software Security E-22  
Iannone, Emanuele 
Software Security E-22  
Aladics, Tamás
Hegedűs, Péter
De Lucia, Andrea  
Palomba, Fabio  
Scandariato, Riccardo  
Software Security E-22  
TORE-URI
https://hdl.handle.net/11420/58696
Journal
ACM Transactions on Software Engineering and Methodology
Citation
ACM Transactions on Software Engineering and Methodology (in press), 2025
Publisher DOI
10.1145/3769105
Publisher
Association for Computing Machinery (ACM)
Abstract
Context: Vulnerability-contributing commits (VCCs) are code changes that introduce vulnerabilities. Mining historical VCCs relies on SZZ-based algorithms that trace from known vulnerability-fixing commits.
Objective: Although these techniques have been used, e.g., to train just-in-time vulnerability predictors, they lack systematic benchmarking to evaluate their precision, recall, and error sources.
Method: We empirically assessed 12 VCC mining techniques in Java repositories using two benchmark datasets (one from the literature and one newly curated). We also explored combinations of techniques, through intersections, voting schemes, and machine learning, to improve performance.
Results: Individual techniques achieved at most 0.60 precision but up to 0.89 recall. The precision rose to 0.75 when the outputs were combined with the logical AND, at the expense of recall. Machine learning ensembles reached 0.80 precision with a better precision–recall balance. Performance varied significantly by dataset. Analyzing “fixing commits” showed that certain fix types (e.g., filtering or sanitization) affect retrieval accuracy, and failure patterns highlighted weaknesses when fixes involve external data handling.
Conclusion: Such results help software security researchers select the most suitable mining technique for their studies and understand new ways to design more accurate solutions.
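The combination strategies mentioned in the abstract (intersection via logical AND, and voting) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the commit identifiers, and the toy ground truth are all hypothetical, and each technique's output is assumed to be a plain set of candidate commit hashes.

```python
from collections import Counter


def intersect(outputs):
    """Logical AND: keep only commits flagged by every technique."""
    sets = [set(o) for o in outputs]
    return set.intersection(*sets) if sets else set()


def vote(outputs, min_votes):
    """Voting: keep commits flagged by at least `min_votes` techniques."""
    counts = Counter(c for o in outputs for c in set(o))
    return {c for c, n in counts.items() if n >= min_votes}


def precision_recall(predicted, truth):
    """Precision and recall of a predicted VCC set against a ground truth."""
    tp = len(predicted & truth)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical outputs of three mining techniques (commit ids are made up).
outputs = [
    {"a1", "b2", "c3"},
    {"a1", "b2", "d4"},
    {"a1", "e5"},
]
# Hypothetical ground-truth VCCs for the same fixing commit.
truth = {"a1", "b2"}

union_pr = precision_recall(set().union(*outputs), truth)
and_pr = precision_recall(intersect(outputs), truth)
vote_pr = precision_recall(vote(outputs, min_votes=2), truth)
```

On this toy input, the intersection gains precision but loses recall relative to the union, which mirrors the trade-off the abstract reports for the logical AND. A majority vote sits between the two.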
DDC Class
600: Technology