This is the official code for the paper: Efficient medical vision-language alignment through adapting masked vision models (TMI 2025).
OS: Ubuntu 20.04 LTS.
Language: Python 3.10.8
If you use conda, the environment can be set up as follows:
conda env create -f environment.yaml
pip install -r requirements.txt
- We use MIMIC-CXR-JPG for pre-training. For more information about this dataset, see Johnson et al., MIMIC-CXR-JPG.
- The dataset directory specified in run.sh must contain the MIMIC-CXR-JPG dataset. You also need to prepare the file "train.csv" according to the paper and put it into the dataset directory MIMIC-CXR_dataset.
- Each line of "train.csv" contains several columns, including image_path, auxview_image_path, last_image_path, last_auxview_image_path, and report. These hold the paths of the current frontal image, current lateral image, prior frontal image, and prior lateral image, and the content of the report, respectively.
- The validation set of the RSNA Pneumonia dataset is used for validation. Download the dataset from https://www.kaggle.com/competitions/rsna-pneumonia-detection-challenge and put it into the directory RSNA_dataset.
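Before training, it can help to confirm that a prepared "train.csv" actually exposes the columns listed above. The following is a minimal sketch, not part of the repository: it assumes the file is comma-separated with a header row that uses exactly these column names, and the helper name `missing_columns` is hypothetical.

```python
import csv
import io

# Column names taken from the description of "train.csv" above.
REQUIRED_COLUMNS = [
    "image_path",
    "auxview_image_path",
    "last_image_path",
    "last_auxview_image_path",
    "report",
]


def missing_columns(csv_file):
    """Return the required columns that are absent from the CSV header.

    Assumption: the file is comma-separated and its first row is a header.
    """
    reader = csv.DictReader(csv_file)
    header = reader.fieldnames or []
    return [name for name in REQUIRED_COLUMNS if name not in header]


# Tiny in-memory example; the row values are made up for illustration.
example = io.StringIO(
    "image_path,auxview_image_path,last_image_path,last_auxview_image_path,report\n"
    "a.jpg,b.jpg,c.jpg,d.jpg,No acute findings.\n"
)
print(missing_columns(example))
```

To check a real file, pass an open handle instead, for example `missing_columns(open("MIMIC-CXR_dataset/train.csv", newline=""))`, adjusting the path to whatever is configured in run.sh.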
- Get the pre-trained weights of MRM and put the file into vision_encoder_weights.
- Get the pre-trained language model from BiomedVLP-CXR-BERT-specialized and put the files into the current directory.
- Set the data path, GPU IDs, batch size, output directory, and other parameters in run.sh.
- Start training by running
chmod a+x run.sh
./run.sh
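Since training depends on several directories being populated first, a quick pre-flight check can catch a missing one before a long run starts. This is a hedged sketch rather than repository code: the directory names come from the steps above, but placing them directly under the repository root is an assumption (the real locations are whatever run.sh is configured with), and `missing_paths` is a hypothetical helper.

```python
from pathlib import Path

# Names taken from the steps above. Locating them under one root directory
# is an assumption; the actual paths are whatever run.sh points to.
EXPECTED_DIRS = ["MIMIC-CXR_dataset", "RSNA_dataset", "vision_encoder_weights"]
EXPECTED_FILES = ["run.sh", "MIMIC-CXR_dataset/train.csv"]


def missing_paths(root="."):
    """Return the expected directories and files that do not exist under root."""
    root = Path(root)
    missing = [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
    missing += [f for f in EXPECTED_FILES if not (root / f).is_file()]
    return missing


if __name__ == "__main__":
    for path in missing_paths():
        print(f"missing: {path}")
```

An empty result only means the paths exist; it does not verify that the datasets or weights inside them are complete.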
Here we provide the trained weights of ALTA. You can download them from Google Drive and put them into the directory ALTA_weights.
- Prepare the dataset following ConVIRT and put the directories "image-retrieval" and "text-retrieval" into CheXpert8X200_dataset.
- Run
python CheXpert8X200_img2img.py
- The dataset has been prepared in 5.1.
- Run
python CheXpert8X200_img2img.py
- We have generated chexpert_5x200.csv using the GLoRIA codebase.
- Run
python CheXpert5X200_retrieval.py
- The dataset has been prepared in 5.3.
- Run
python CheXpert5X200_zeroshot.py
- The dataset has been prepared in Section 2 (Data preparation).
- Run
python RSNA_zeroshot.py
Some code in this repository is borrowed from MAE, MRM, AIM, GLoRIA, and Hugging Face.
This project is under the CC-BY-NC 4.0 license. See LICENSE for details.
