TUHH Open Research
Data-efficient vision transformers for multi-label disease classification on chest radiographs

Citation Link: https://doi.org/10.15480/882.4777
Publication Type
Journal Article
Date Issued
2022-07
Language
English
Author(s)
Behrendt, Finn  
Bhattacharya, Debayan 
Krüger, Julia  
Opfer, Roland  
Schlaefer, Alexander  
Institute
Medizintechnische und Intelligente Systeme E-1  
TORE-DOI
10.15480/882.4777
TORE-URI
http://hdl.handle.net/11420/13482
Journal
Current Directions in Biomedical Engineering
Volume
8
Issue
1
Start Page
34
End Page
37
Citation
Current Directions in Biomedical Engineering 8 (1): 34-37 (2022-07)
Publisher DOI
10.1515/cdbme-2022-0009
Scopus ID
2-s2.0-85135571480
Publisher
De Gruyter
Abstract
Radiographs are a versatile diagnostic tool for the detection and assessment of pathologies, for treatment planning, or for navigation and localization purposes in clinical interventions. However, their interpretation and assessment by radiologists can be tedious and error-prone. Thus, a wide variety of deep learning methods have been proposed to support radiologists in interpreting radiographs. Mostly, these approaches rely on convolutional neural networks (CNNs) to extract features from images. Especially for the multi-label classification of pathologies on chest radiographs (chest X-rays, CXR), CNNs have proven to be well suited. In contrast, Vision Transformers (ViTs) have not been applied to this task despite their high classification performance on generic images and interpretable local saliency maps, which could add value to clinical interventions. ViTs do not rely on convolutions but on patch-based self-attention, and in contrast to CNNs, they encode no prior knowledge of local connectivity. While this leads to increased capacity, ViTs typically require large amounts of training data, which represents a hurdle in the medical domain, as collecting large medical data sets is costly. In this work, we systematically compare the classification performance of ViTs and CNNs for different data set sizes and evaluate more data-efficient ViT variants (DeiT). Our results show that while the performance of ViTs and CNNs is on par, with a small benefit for ViTs, DeiTs outperform plain ViTs if a reasonably large data set is available for training.
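To illustrate the setup described in the abstract, the following is a minimal sketch of multi-label chest-radiograph classification with a DeiT backbone, assuming PyTorch and the timm library. The model name, the 14-label output (as in the CheXpert label set), and the training-loop details are illustrative assumptions, not the authors' exact configuration.

import torch
import timm

NUM_LABELS = 14  # e.g. the CheXpert pathology labels (assumption)

# DeiT backbone; timm replaces the classification head so the model
# emits NUM_LABELS logits (pretrained=True downloads ImageNet weights).
model = timm.create_model("deit_small_patch16_224", pretrained=True,
                          num_classes=NUM_LABELS)

# Multi-label setup: an independent binary decision per pathology,
# so binary cross-entropy on raw logits rather than a softmax.
criterion = torch.nn.BCEWithLogitsLoss()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

def train_step(images, targets):
    # images: (batch, 3, 224, 224); targets: float 0/1, (batch, NUM_LABELS)
    optimizer.zero_grad()
    logits = model(images)              # (batch, NUM_LABELS)
    loss = criterion(logits, targets)
    loss.backward()
    optimizer.step()
    return loss.item()

# Dummy batch showing the expected tensor shapes.
images = torch.randn(2, 3, 224, 224)
targets = torch.randint(0, 2, (2, NUM_LABELS)).float()
print(train_step(images, targets))

At inference time, torch.sigmoid(model(images)) yields one probability per pathology, which can be thresholded independently for each label.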
Subjects
Chest Radiograph
CheXpert
Convolutional Neural Network
Deep Learning
Vision Transformer
MLE@TUHH
DDC Class
600: Technology
610: Medicine
Funding(s)
Vollautomatische, strukturierte Befundung von Röntgen-Thorax-Aufnahmen für die Routineanwendung in der Patientenversorgung (Fully automated, structured reporting of chest X-ray images for routine use in patient care)
Publication version
publishedVersion
License
https://creativecommons.org/licenses/by/4.0/
Name
10.1515_cdbme-2022-0009.pdf
Size
986.8 KB
Format
Adobe PDF