TUHH Open Research

Distilling expert surgical knowledge: how to train local surgical VLMs for anatomy explanation in complete mesocolic excision

Publication Type
Conference Paper
Date Issued
2026-06-05
Language
English
Author(s)
Maack, Lennart  
Medizintechnische und Intelligente Systeme E-1  
Graß, Julia Kristin
Toscha, Lisa Marie
Melling, Nathaniel
Schlaefer, Alexander  
Medizintechnische und Intelligente Systeme E-1  
TORE-URI
https://hdl.handle.net/11420/63774
Volume
2026-April
Citation
23rd IEEE International Symposium on Biomedical Imaging, ISBI 2026
Contribution to Conference
23rd IEEE International Symposium on Biomedical Imaging, ISBI 2026  
Publisher DOI
10.1109/ISBI61048.2026.11515407
Scopus ID
2-s2.0-105041612014
Publisher
IEEE
ISBN of container
979-833157763-6
Recently, Vision Large Language Models (VLMs) have demonstrated high potential in computer-aided diagnosis and decision support. However, current VLMs show deficits in domain-specific surgical scene understanding, such as identifying and explaining anatomical landmarks during Complete Mesocolic Excision. Additionally, there is a need for locally deployable models to avoid leaking patient data to large VLMs hosted outside the clinic. We propose a privacy-preserving framework to distill knowledge from large, general-purpose LLMs into an efficient, local VLM. We generate an expert-supervised dataset by prompting a teacher LLM without sensitive images, using only textual context and binary segmentation masks for spatial information. This dataset is used for Supervised Fine-Tuning (SFT) and subsequent Direct Preference Optimization (DPO) of the locally deployable VLM. Our evaluation confirms that fine-tuning VLMs with our generated datasets increases surgical domain knowledge by a large margin compared to the respective base VLMs. Overall, this work validates a data-efficient and privacy-conforming way to train a locally deployable VLM optimized for surgical scene understanding.
DDC Class
600: Technology