CVPR 2021 Tutorial on

When Image Analysis Meets Natural Language Processing: A Case Study in Radiology

Slides and recorded videos will be provided on this webpage.
Time: TBD



Recently, deep learning techniques have been widely and successfully applied to many different computer vision and text-mining tasks. However, when adopted in a specific domain, such as radiology, these techniques should be combined with extensive domain knowledge to improve efficiency and accuracy. There is, therefore, a critical need to take advantage of medical image analysis, clinical text-mining, and deep learning to better understand the radiological world, which promises to enhance clinical communication and patient-centric care.

The tutorial aims to bridge the gap between medical imaging and medical informatics research, facilitate collaborations between the communities, and introduce new paradigms of machine learning exploiting the latest innovations across domains. It will cover the basics of medical image analysis and clinical text-mining with concrete examples, as well as deep learning algorithms. The audience will also have the opportunity to see cutting-edge examples of recent deep learning architectures adapted to the medical domain. Specifically, the tutorial will cover the following themes.

Tentative Schedule

30 mins. Clinical context for medical imaging deep learning models: model fusion techniques to combine medical imaging with structured clinical data. Matthew Lungren.

Advancements in computer vision and deep learning techniques carry the potential to make significant contributions to healthcare, particularly in fields that utilize medical imaging for diagnosis, prognosis, and treatment decisions. The current state-of-the-art deep learning models for automated diagnosis and outcome prediction using medical imaging tend not to consider patient electronic medical records and clinical variables. Pertinent and accurate information regarding the current symptoms and past medical history enables physicians to interpret imaging findings in the appropriate clinical context, leading to higher diagnostic accuracy, better clinical predictions, and ideally a better outcome for the patient. Deep learning models that use images without clinical context will ultimately be limited, just as physicians cannot be effective without knowledge of the patient's medical records. In this session, we describe different fusion techniques applied to combine medical imaging with other clinical data, and systematically review the literature in this field. We will also present current knowledge and summarize the key insights that researchers can use to make advancements in deep learning for medicine. [slides] [recorded video]
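As a rough illustration of what "fusion" can mean here, the sketch below contrasts two common strategies: early fusion, which concatenates image-derived features with clinical variables before a single model, and late fusion, which scores each modality separately and then averages the predictions. It is a toy under stated assumptions, not the speaker's method: the feature values, weights, and function names are all hypothetical, and a simple logistic scorer stands in for a trained network.

```python
import math

def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def linear_score(features, weights, bias=0.0):
    """Weighted sum of features plus a bias (a stand-in for a trained model)."""
    return sum(f * w for f, w in zip(features, weights)) + bias

def early_fusion(image_features, clinical_features, weights, bias=0.0):
    """Concatenate both modalities into one vector, then apply one model."""
    joint = list(image_features) + list(clinical_features)
    return sigmoid(linear_score(joint, weights, bias))

def late_fusion(image_features, clinical_features, image_weights, clinical_weights):
    """Score each modality with its own model, then average the probabilities."""
    p_image = sigmoid(linear_score(image_features, image_weights))
    p_clinical = sigmoid(linear_score(clinical_features, clinical_weights))
    return 0.5 * (p_image + p_clinical)

# Hypothetical inputs: two image-derived features and two clinical variables.
image_features = [0.8, -0.2]
clinical_features = [1.0, 0.5]

p_early = early_fusion(image_features, clinical_features,
                       weights=[1.0, 0.5, 0.3, -0.4])
p_late = late_fusion(image_features, clinical_features,
                     image_weights=[1.0, 0.5], clinical_weights=[0.3, -0.4])
print(f"early fusion: {p_early:.3f}, late fusion: {p_late:.3f}")
```

Even with identical weights, the two strategies give different probabilities, because early fusion combines evidence before the nonlinearity and late fusion combines it afterwards.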

30 mins. Deep learning for cardiovascular imaging applications. Subhi Al'Aref.

Clinical practice has been revolutionized through advancements in non-invasive imaging modalities. In particular, the field of cardiovascular medicine has witnessed widespread adoption of computed tomography (CT), magnetic resonance imaging (MRI), echocardiography, and nuclear perfusion imaging for day-to-day clinical practice. The incorporation of an imaging approach in the diagnostic and prognostic process has allowed for the institution of precise, patient-oriented and disease-targeted cardiovascular therapies. Nevertheless, manual review and interpretation of increasing volumes of cardiovascular imaging studies have been associated with an inefficient workflow, significant time requirements as well as significant intra- and inter-reader variability. The aim of this session is to share the general methods by which deep learning models have been used for segmentation and classification tasks within the realm of cardiovascular imaging, with an emphasis on the advantages of incorporating artificial intelligence and deep learning within a clinical workflow. [slides] [recorded video]

30 mins. Building high performance chest x-ray classification models and understanding why they are all wrong. Alistair Johnson.

The high performance of modern computer vision methods has resulted in considerable interest in applications to radiology. To galvanize research in this area, a number of research groups have released large publicly available datasets, particularly for chest radiographs, which benefit from large resources such as NIH ChestX-ray14, CheXpert, PadChest, and MIMIC-CXR. These images particularly benefit from a free-text interpretation provided by a practicing domain expert, which provides a human-interpretable label of the image. However, caution must be taken when developing models using data acquired during routine clinical practice. A number of implicit biases exist: the acquisition of the image is based on clinical need, the interpretation of the image is a response to a specific clinical question, and the structuring of the data is not intended for retrospective research. In this tutorial, we will build high-performance computer vision models using large publicly available datasets. We will evaluate the performance of these classifiers on data from distinct institutions, and highlight generalization issues. We will further use class-dependent model interpretation methods to inspect our classifier and highlight the source of its biases. We will end with suggestions for researchers who aim to build machine learning models on retrospectively collected clinical data. [slides] [recorded video]
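One simple way to surface the generalization issues mentioned above is to report a metric separately for each institution rather than pooling all predictions. The sketch below is a minimal illustration, not the speaker's evaluation code: it computes the area under the ROC curve (AUC) per site using a plain pairwise comparison. The site names, labels, and scores are made up for the example.

```python
from collections import defaultdict

def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

def auc_by_site(records):
    """Group (site, label, score) records by site and compute AUC for each."""
    grouped = defaultdict(lambda: ([], []))
    for site, label, score in records:
        grouped[site][0].append(label)
        grouped[site][1].append(score)
    return {site: auc(labels, scores) for site, (labels, scores) in grouped.items()}

# Made-up predictions: (site, true label, model score).
records = [
    ("internal", 1, 0.9), ("internal", 1, 0.8),
    ("internal", 0, 0.3), ("internal", 0, 0.1),
    ("external", 1, 0.7), ("external", 1, 0.4),
    ("external", 0, 0.6), ("external", 0, 0.2),
]
per_site = auc_by_site(records)
for site, value in per_site.items():
    print(f"{site}: AUC = {value:.2f}")
```

In this toy data the model separates the internal cases perfectly but ranks one external positive below an external negative, so the external AUC is lower. A pooled metric would hide that gap.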

30 mins. Break.

30 mins. Natural language processing on radiology reports to generate large labeled datasets. Imon Banerjee.

Healthcare institutions have millions of imaging studies associated with unstructured free-text radiology reports that describe imaging findings in the language of the radiologist who read the study. However, there are no reliable methods for leveraging these reports to create structured labels for training deep learning models. We will present semantic word embedding methods that are trained on a large cohort of narrative reports and can mine information from free-text radiology reports. They have been successfully applied to hemorrhage and pulmonary embolism risk assessment and to liver, breast, and brain tumor categorization with minimal task-specific tuning. The methods are also robust when adapted to new domains. Besides radiology reports, they have been applied to other types of clinical notes (e.g., progress reports, discharge summaries) for extracting information about patient-centered outcomes and distant cancer recurrence status. In every domain that we tested, our proposed methods outperform domain-specific rule-based systems, which need a tremendous amount of hand-engineering. [slides] [recorded video]

30 mins. Interpretable deep learning model for multiple modal and cross-domain medical images. Yingying Zhu.

With the rapid growth in the number of imaging procedures, digitization of images, internet explosion, and healthcare's inexorable digital migration, we face the challenge of developing machine learning algorithms that efficiently analyze these multimodal, cross-domain medical images and improve the current healthcare workflow. In this session, we will first present our current work on improving the generalization ability of deep learning models on medical images from different domains (non-contrast CT, contrast-enhanced CT, MRI, and chest X-ray) using an unsupervised image translation model for domain adaptation and data augmentation. Second, we will present recent work on combining multimodal data (labeled and unlabeled imaging data, clinical records, and clinical notes) to improve current computer-aided diagnosis systems, using semi-supervised graph models to reduce image labeling expense and improve diagnostic performance. Finally, since the interpretation of deep learning models is very important in medical imaging, we will present our work on developing interpretable chest X-ray diagnosis systems with an adversarial image decomposition and synthesis framework. [slides] [recorded video]

30 mins. Clinical NLP-powered data extraction on CXR and CT reports. Yifan Peng.

Medical imaging has been a common examination in daily clinical routine for screening and diagnosing a variety of diseases. Although hospitals have accumulated many image exams and associated reports, it is challenging to effectively build high-precision computer-aided diagnosis systems. In this session, we will present an overview of cutting-edge techniques for mining existing free-text report data to assist medical image analysis via natural language processing (NLP) and deep learning. Specifically, we will present (1) a method to text-mine disease image labels (where each image can have multiple labels) from the associated radiological reports using NLP, and (2) a deep learning model to extract attributes (e.g., type, location, size) of lesions of interest from the clinical text. Taken together, we expect our approach to advance the understanding of the radiological world and enhance clinical decision-making. [slides] [recorded video]
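To make item (1) concrete, the sketch below shows one very simplified way multi-label image labels could be mined from a report: match finding names sentence by sentence and mark a mention as negative when a negation cue precedes it. This is a regular-expression stand-in written for illustration, not the method presented in the session. The finding vocabulary, the negation cues, and the function name `label_report` are all assumptions, and a practical system would also need to handle uncertainty, synonyms, and sentence structure.

```python
import re

# Hypothetical finding vocabulary: label name -> pattern to look for.
FINDING_PATTERNS = {
    "pneumonia": r"\bpneumonia\b",
    "pleural effusion": r"\bpleural effusions?\b",
    "pneumothorax": r"\bpneumothorax\b",
}
# Hypothetical negation cues, checked only in the text before the mention.
NEGATION_CUES = r"\b(no|without|negative for|free of)\b"

def label_report(report):
    """Return {finding: True/False} for every finding mentioned in the report.

    True means at least one non-negated mention; False means every mention
    was preceded by a negation cue in its sentence. Unmentioned findings
    are left out of the result.
    """
    labels = {}
    for sentence in re.split(r"(?<=[.!?])\s+", report.lower()):
        for finding, pattern in FINDING_PATTERNS.items():
            match = re.search(pattern, sentence)
            if match is None:
                continue
            prefix = sentence[:match.start()]
            negated = re.search(NEGATION_CUES, prefix) is not None
            # A positive mention anywhere in the report wins over a negated one.
            labels[finding] = labels.get(finding, False) or not negated
    return labels

report = "Right lower lobe pneumonia. Small left pleural effusion. No pneumothorax."
labels = label_report(report)
print(labels)
```

A single report can yield several positive labels at once, which is why the result is a dictionary rather than one class.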

About the speakers

Subhi Al'Aref, M.D., Assistant Professor in the Division of Cardiology in the Department of Internal Medicine at the University of Arkansas for Medical Sciences College of Medicine. His main research interests include the investigation of the diagnostic and prognostic utility of noninvasive cardiovascular imaging modalities.

Imon Banerjee, Ph.D., Assistant Professor in the Department of Biomedical Informatics and Radiology at Emory University School of Medicine. Her core expertise is unstructured data analysis with deep learning and machine representation of image semantics. She has published several manuscripts proposing novel methods for radiology text mining in top-tier journals and conferences.

Alistair Johnson, Ph.D., Scientist at the Hospital for Sick Children in Toronto, Canada. Alistair has extensive experience and expertise in working with clinical data, having published the MIMIC-III, MIMIC-IV, and eICU-CRD datasets, and most recently MIMIC-CXR, a large publicly available dataset of chest x-rays.

Matthew Lungren, M.D., MPH, Co-Director of the Stanford Center for Artificial Intelligence in Medicine and Imaging and Associate Professor Clinician Scientist at Stanford University Medical Center. His NIH- and NSF-funded research is in the field of AI and deep learning in medical imaging, precision medicine, and predictive health outcomes.

Yifan Peng, Ph.D., Assistant Professor at the Department of Population Health Sciences at Weill Cornell Medicine. His main research interests include biomedical and clinical natural language processing and medical image analysis.

Yingying Zhu, Ph.D., Assistant Professor at the Department of Computer Science and Engineering at the University of Texas at Arlington. Her research lies in the intersection of machine learning, computer vision, medical imaging analysis, and bioinformatics. She has published many papers in top computer vision and medical imaging conferences and journals.

Please contact Yifan Peng if you have questions. The webpage template is by courtesy of the awesome Georgia.