Detecting Lung Disease and Disorders Better using Anatomy-Aware AI

By Maureen Stanton

A BU-led interdisciplinary research team has developed the first anatomy-aware generative artificial intelligence (GenAI) model, advancing the power of, and trust in, AI-powered medical imaging.

In 2021, OpenAI launched a GenAI model called DALL-E that captivated the world with its ability to generate an image instantly with a simple text caption. Now GenAI models are poised to transform virtually every industry, particularly healthcare.

AI-powered image generators are typically built on foundational models. Using machine learning algorithms, these models train on large data sets to learn patterns in images and then reconstruct them. In medical imaging, AI-powered imaging models can be used to detect complex abnormalities in X-rays, helping physicians efficiently diagnose disorders and track diseases.

Despite the promise of GenAI, the technology can produce so-called “hallucinations,” a term coined to describe false, biased, or misleading information generated by AI. In AI-driven medical imaging, hallucinations can lead to incorrect diagnoses, delayed treatment, and worse patient outcomes. Nevertheless, the FDA has already approved over 400 AI algorithms that can scan for various diseases with 80% to 90% accuracy. But is an error rate as high as 20% acceptable for medical diagnostics with potentially life-altering consequences?

A Boston University-led interdisciplinary team of researchers doesn’t think so.

In a recent paper published in IEEE Transactions on Medical Imaging, the researchers present a unique approach to AI-powered medical imaging: the first anatomy-aware GenAI model shown to efficiently produce highly accurate volumetric (3D) chest CT images from text prompts.

The model, called MedSyn, holds promise as a tool for diverse clinical and research applications. In clinical settings, MedSyn can be used to interpret complex abnormalities and detect diseases with greater accuracy and efficiency. The synthetic images MedSyn generates can serve as an invaluable data augmentation resource for clinical researchers who need large, hard-to-acquire sample sizes. Importantly, MedSyn can support “explainable AI,” helping clinicians and researchers better understand machine learning, audit deep learning models conditioned on pathology (disease), and overcome trust barriers in AI-driven healthcare.
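
To make the data augmentation idea concrete, here is a minimal PyTorch sketch of mixing real and synthetic samples into one training set. It is purely illustrative: the random tensors merely stand in for real and MedSyn-generated CT volumes, and none of this is the study’s actual training code.

```python
# Minimal sketch: augmenting a scarce real dataset with synthetic volumes.
# The tensors below are placeholders, not real or MedSyn-generated scans.
import torch
from torch.utils.data import TensorDataset, ConcatDataset, DataLoader

# 8 "real" volumes and 32 "synthetic" ones, each (channels, D, H, W),
# with dummy integer labels.
real = TensorDataset(torch.rand(8, 1, 16, 16, 16),
                     torch.zeros(8, dtype=torch.long))
synthetic = TensorDataset(torch.rand(32, 1, 16, 16, 16),
                          torch.zeros(32, dtype=torch.long))

train_set = ConcatDataset([real, synthetic])  # 8 real + 32 synthetic samples
loader = DataLoader(train_set, batch_size=4, shuffle=True)

volumes, labels = next(iter(loader))
print(volumes.shape)  # torch.Size([4, 1, 16, 16, 16])
```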

“While the trend for foundational models has taken a one-size-fits-all approach, we don’t believe that is the best practice for medical imaging,” says corresponding author Kayhan Batmanghelich, an assistant professor of engineering and junior faculty fellow of the Hariri Institute for Computing at Boston University. “Pathological changes to anatomy can occur years before clinical evidence of disease. For interventions to have the most impact, we need sophisticated AI models that can detect abnormalities that are often elusive in the earliest stages of disease progression.”

Unlike general-purpose foundational models, MedSyn takes an organ-specific, multimodal approach. The model’s algorithms are trained on a large data set of over 9,000 3D chest CT scans paired with over 200,000 de-identified radiology reports. By learning the detailed information reported by radiologists, including the specific pathology, its extent, and its anatomical location, the model gains more refined control over the generation process.
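
To illustrate what text conditioning means at the most basic level, the toy PyTorch sketch below maps a report snippet to a small 3D volume. Everything here is invented for illustration, including the vocabulary, dimensions, and architecture; it bears no resemblance to MedSyn’s actual model.

```python
# Toy sketch of text-conditioned 3D volume generation (PyTorch).
# NOT the MedSyn architecture -- just the bare idea of a text embedding
# steering a 3D image decoder. Vocabulary and sizes are hypothetical.
import torch
import torch.nn as nn

VOCAB = {"<unk>": 0, "nodule": 1, "right": 2, "upper": 3, "lobe": 4}

def encode_report(text: str) -> torch.Tensor:
    """Map a report snippet to a tensor of token ids."""
    return torch.tensor([VOCAB.get(tok, 0) for tok in text.lower().split()])

class ToyTextToVolume(nn.Module):
    def __init__(self, vocab_size=5, embed_dim=32, vol_size=16):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # Project the pooled text embedding to a small flattened volume.
        self.decode = nn.Linear(embed_dim, vol_size ** 3)
        self.vol_size = vol_size

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        text_feat = self.embed(token_ids).mean(dim=0)  # pool token embeddings
        s = self.vol_size
        return self.decode(text_feat).view(s, s, s)    # (D, H, W) volume

model = ToyTextToVolume()
volume = model(encode_report("nodule right upper lobe"))
print(volume.shape)  # torch.Size([16, 16, 16])
```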

“Lung CT scans exhibit more challenging details compared to other organs,” says Batmanghelich. “By specifically fine-tuning the AI language model to the language used in radiology reports and incorporating constraints that enforce standard anatomical positions consistent with the human body, our model can synthesize images with a high level of granularity, even for small, hard-to-see anatomical details.”
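
As a rough picture of what an anatomical constraint can look like in code, the hedged sketch below penalizes a generated volume for drifting from a canonical template within a body mask. This is an invented stand-in, not MedSyn’s actual constraint; the template, mask, and weight are all hypothetical.

```python
# Hedged sketch of an "anatomical prior" penalty (not MedSyn's loss):
# keep generated voxel intensities close to a canonical template
# wherever the body mask says anatomy should be enforced.
import torch

def anatomy_prior_loss(generated: torch.Tensor,
                       template: torch.Tensor,
                       body_mask: torch.Tensor,
                       weight: float = 0.1) -> torch.Tensor:
    """Penalize squared deviation from a canonical anatomy template.

    generated, template: (D, H, W) volumes in the same canonical frame.
    body_mask: (D, H, W) bool mask of voxels where anatomy is enforced.
    """
    diff = (generated - template) ** 2
    return weight * diff[body_mask].mean()

gen = torch.rand(16, 16, 16, requires_grad=True)   # stand-in generator output
tpl = torch.zeros(16, 16, 16)                      # hypothetical template
mask = torch.ones(16, 16, 16, dtype=torch.bool)    # enforce everywhere (toy)

loss = anatomy_prior_loss(gen, tpl, mask)
loss.backward()  # gradients flow back toward the generator output
print(loss.item())
```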

While artificial intelligence has shown its potential to revolutionize healthcare, many clinicians are hesitant to embrace AI. AI-based healthcare systems typically have limited training data and a lack of transparency, making AI-generated diagnoses or recommendations difficult to trust. Explainable AI helps overcome these trust concerns by giving clinicians the ability to understand the reasoning behind AI-generated decisions.

“Since the data set for MedSyn is derived from CT scans and radiology reports, the output is fully transparent and as interpretable as a radiology report,” says Batmanghelich. “The tool not only empowers clinicians; researchers can also use this method as a building block in their research.

“Our research takes a significant step toward demonstrating the capability and potential impact of building anatomically specific foundational models, models that know about anatomy and about the macro- and micro-scale structural changes to that anatomy. These next-generation models deliver the accuracy, reliability, and interpretability that clinicians and researchers require, and that the healthcare sector requires, for AI to realize its transformative potential.”

This computationally intensive research was made possible through Bridges-2, a supercomputer at the Pittsburgh Supercomputing Center, a joint center of the University of Pittsburgh and Carnegie Mellon University designed for converged high-performance computing (HPC), high-performance artificial intelligence (HPAI), and large-scale data management.

The researchers have made the code and pre-trained model publicly available at https://github.com/batmanlab/MedSyn.

This work was funded in part by the NSF (Grant Number: 1839332, TRIPODS+X), the NIH (Grant Number: 1R01HL141813-01), and the Pittsburgh Supercomputing Center (Grant Number: TG-ASC170024).

Study researchers

  • Yanwu Xu, PhD student, Department of Electrical and Computer Engineering, Boston University
  • Li Sun, PhD candidate, Department of Electrical and Computer Engineering, Boston University
  • Wei Peng, Postdoctoral Researcher, Department of Psychiatry and Behavioral Sciences, Stanford University
  • Shuyue Jia, PhD student, Department of Electrical and Computer Engineering, Boston University
  • Katelyn Morrison, PhD student, Human-Computer Interaction Institute, Carnegie Mellon University
  • Adam Perer, Assistant Professor, Human-Computer Interaction Institute, Carnegie Mellon University
  • Afrooz Zandifar, Radiologist, University of Pittsburgh Medical Center
  • Shyam Visweswaran, Professor, Department of Biomedical Informatics, University of Pittsburgh
  • Motahhare Eslami, Assistant Professor, Human-Computer Interaction Institute, Carnegie Mellon University
  • Kayhan Batmanghelich, Assistant Professor, Department of Electrical and Computer Engineering, Boston University