
Retrieve, Refine, or Both? Using Task-Specific Guidelines for Secure Python Code Generation

Publication Type
Conference Paper
Date Issued
2025-09-07
Language
English
Author(s)
Tony, Catherine
Software Security E-22  
Iannone, Emanuele 
Software Security E-22  
Scandariato, Riccardo  
Software Security E-22  
TORE-URI
https://hdl.handle.net/11420/58691
Citation
IEEE International Conference on Software Maintenance and Evolution, ICSME 2025
Contribution to Conference
IEEE International Conference on Software Maintenance and Evolution, ICSME 2025
Publisher DOI
10.1109/icsme64153.2025.00041
Scopus ID
2-s2.0-105022473566
Publisher
IEEE
Large Language Models (LLMs) are increasingly used for code generation, but they often produce code with security vulnerabilities. While techniques like fine-tuning and instruction tuning can improve security, they are computationally expensive and require large amounts of secure code data. Recent studies have explored prompting techniques to enhance code security without additional training. Among these, Recursive Criticism and Improvement (RCI) has demonstrated strong improvements by iteratively refining the generated code using LLMs' self-critiquing capabilities. However, RCI relies on the model's ability to identify security flaws, which is constrained by its training data and susceptibility to hallucinations. This paper investigates the impact of incorporating task-specific secure coding guidelines extracted from MITRE's CWE and CodeQL recommendations into LLM prompts. For this, we employ Retrieval-Augmented Generation (RAG) to dynamically retrieve the relevant guidelines that help the LLM avoid generating insecure code. We compare RAG with RCI, observing that both deliver comparable performance in terms of code security, with RAG consuming considerably less time and fewer tokens. Additionally, combining both approaches further reduces the amount of insecure code generated while requiring only slightly more resources than RCI alone, highlighting the benefit of adding relevant guidelines for improving the security of LLM-generated code.
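To make the three configurations in the abstract concrete (RAG only, RCI only, and both), the following is a minimal sketch and not the authors' implementation. Everything in it is an assumption for illustration: the guideline texts are short paraphrases, retrieval is a simple keyword overlap standing in for a real retriever, the prompt wording is invented, and `llm` is any hypothetical callable that maps a prompt string to a completion string.

```python
import re
from typing import Callable, List

# Illustrative guideline store. The CWE identifiers are real; the keyword sets
# and guideline wording are assumptions, not the dataset used in the paper.
GUIDELINES = [
    {"cwe": "CWE-89", "keywords": {"sql", "query", "database"},
     "text": "Use parameterized queries instead of string concatenation."},
    {"cwe": "CWE-78", "keywords": {"command", "shell", "subprocess"},
     "text": "Pass arguments as a list and avoid shell=True."},
    {"cwe": "CWE-327", "keywords": {"hash", "password", "encrypt"},
     "text": "Prefer vetted modern algorithms over MD5 or SHA-1."},
]


def retrieve(task: str, k: int = 1) -> List[str]:
    """Return up to k guidelines whose keywords overlap the task description."""
    task_tokens = set(re.findall(r"[a-z0-9]+", task.lower()))
    scored = [(len(task_tokens & g["keywords"]), g) for g in GUIDELINES]
    matches = [(score, g) for score, g in scored if score > 0]
    matches.sort(key=lambda pair: pair[0], reverse=True)
    return [f"{g['cwe']}: {g['text']}" for _, g in matches[:k]]


def build_prompt(task: str, guidelines: List[str]) -> str:
    """Prepend the retrieved guidelines (if any) to the coding task."""
    if not guidelines:
        return task
    bullet_list = "\n".join(f"- {g}" for g in guidelines)
    return f"Follow these secure coding guidelines:\n{bullet_list}\n\nTask: {task}"


def generate(task: str, llm: Callable[[str], str],
             use_rag: bool = True, rci_rounds: int = 0) -> str:
    """Generate code with optional guideline retrieval and optional RCI rounds."""
    guidelines = retrieve(task) if use_rag else []
    code = llm(build_prompt(task, guidelines))
    for _ in range(rci_rounds):
        # Each RCI round costs two extra model calls: critique, then improve.
        critique = llm(f"Review the following code for security problems:\n{code}")
        code = llm(f"Improve the code based on this critique:\n{critique}\n\nCode:\n{code}")
    return code
```

In this sketch, `generate(task, llm)` corresponds to RAG alone (one model call), `generate(task, llm, use_rag=False, rci_rounds=1)` to RCI alone, and `generate(task, llm, rci_rounds=1)` to the combination. The call counts illustrate why a single retrieval-augmented prompt can be cheaper in tokens and time than a critique-and-improve loop.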
DDC Class
005.8: Computer Security