🔑 CrossKEY
A framework for learning a 3D cross-modal keypoint descriptor for MR-US matching and registration
Daniil Morozov1,2 · Reuben Dorent3,4 · Nazim Haouchine2
1 Technical University of Munich (TUM), 2 Harvard Medical School, 3 Inria Saclay, 4 Sorbonne Université, Paris Brain Institute (ICM)
CrossKEY enables robust 3D keypoint matching between MRI and iUS, achieving state-of-the-art performance in both image matching and registration tasks.
- Essential Scripts: Add training and testing scripts with test data example
- Interactive Demo: Create Colab notebook for easy experimentation
- Visualization Functions: Add utilities for keypoint and matching visualization
Intraoperative registration of real-time ultrasound (iUS) to preoperative Magnetic Resonance Imaging (MRI) remains an unsolved problem due to severe modality-specific differences in appearance, resolution, and field-of-view. To address this, we propose a novel 3D cross-modal keypoint descriptor for MRI–iUS matching and registration. Our method adopts a patient-specific matching-by-synthesis strategy, generating synthetic iUS volumes from preoperative MRI. This enables supervised contrastive training to learn a shared descriptor space. A probabilistic keypoint detection strategy is then employed to identify anatomically salient and modality-consistent locations. During training, a curriculum-based triplet loss with dynamic hard negative mining is used to learn descriptors that are i) robust to iUS artifacts such as speckle noise and limited coverage, and ii) rotation-invariant. At inference, the method detects keypoints in MR and real iUS images and identifies sparse matches, which are then used to perform rigid registration. Our approach is evaluated using 3D MRI-iUS pairs from the ReMIND dataset. Experiments show that it outperforms state-of-the-art keypoint matching methods across 11 patients, with an average precision of 69.8%. For image registration, our method achieves a competitive mean Target Registration Error of 2.39 mm on the ReMIND2Reg benchmark.
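The curriculum-based triplet loss with dynamic hard negative mining can be sketched in plain Python. This is a minimal illustration only: the helper names and the `hard_frac` curriculum knob are assumptions for the sketch, not the repository's actual implementation.

```python
import math
import random

def l2(a, b):
    """Euclidean distance between two descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def triplet_loss(anchor, positive, negatives, margin=0.2, hard_frac=1.0):
    """Triplet margin loss with hard negative mining.

    `hard_frac` is a hypothetical curriculum knob: early in training it is
    left at 1.0 so the negative is drawn from the full (easier) pool; as
    training progresses it is shrunk toward 0 so only the hardest
    negatives near the anchor are sampled.
    """
    # Rank candidate negatives by distance to the anchor (hardest first).
    ranked = sorted(negatives, key=lambda n: l2(anchor, n))
    pool = ranked[: max(1, int(len(ranked) * hard_frac))]
    hard_negative = random.choice(pool)
    d_pos = l2(anchor, positive)
    d_neg = l2(anchor, hard_negative)
    # Standard triplet margin loss: pull positives in, push negatives out.
    return max(0.0, d_pos - d_neg + margin)
```

With `hard_frac` small, only the negative closest to the anchor survives the cut, which is the hardest case for the descriptor to separate.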
Overview of our CrossKEY framework
- Python ≥ 3.12
- Poetry for dependency management
- Ubuntu/Linux (for SIFT3D compilation)
- Clone the repository:
```bash
git clone https://github.com/morozovdd/CrossKEY.git
cd CrossKEY
```
- Run the setup script:
```bash
./setup.sh
```
This will:
- Set up Python environment with Poetry
- Install dependencies
- Compile external libraries (SIFT3D)
- Create necessary directories
- Start training:
```bash
poetry shell
python example_train.py
```
or, without activating a shell:
```bash
poetry run python example_train.py
```
The training script will:
- Automatically generate SIFT descriptors if missing
- Create keypoint heatmaps if missing
- Train the CrossKEY descriptor model
- Save checkpoints to `logs/`
```bash
poetry run python example_test.py
```
Requires a trained model checkpoint. Update the checkpoint path in `configs/test_config.yaml`:
```yaml
model:
  checkpoint_path: "path/to/your/checkpoint.ckpt"
```
Modify training parameters in `configs/train_config.yaml`:
- Model architecture settings
- Loss function parameters
- Training hyperparameters
- Data augmentation options
The repository includes test data from Case059:
- MR images: T2-weighted brain MRI
- US images: Real intraoperative ultrasound
- Synthetic US: Generated from MR using synthesis pipeline
- SIFT descriptors: 3D keypoint features for training
- Heatmaps: Probabilistic keypoint detection maps
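To illustrate how a probabilistic heatmap can drive keypoint selection, here is a simplified stdlib-only sketch that treats heatmap values as unnormalized sampling weights. The function name is hypothetical and this is not the repository's actual detector.

```python
import random

def sample_keypoints(heatmap, n, seed=None):
    """Sample `n` voxel coordinates from a 3D heatmap (nested lists),
    treating non-negative heatmap values as unnormalized probabilities."""
    rng = random.Random(seed)
    coords, weights = [], []
    for z, plane in enumerate(heatmap):
        for y, row in enumerate(plane):
            for x, v in enumerate(row):
                if v > 0:
                    coords.append((z, y, x))
                    weights.append(v)
    # random.choices normalizes the weights internally.
    return rng.choices(coords, weights=weights, k=n)
```

Voxels with higher heatmap values (more salient, more modality-consistent locations) are drawn proportionally more often.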
To train CrossKEY with your own medical imaging data:
- Prepare your data structure:
```
data/img/
├── mr/            # Place your MR images here (.nii.gz)
├── us/            # Place real US images here (.nii.gz)
└── synthetic_us/  # Place synthetic US images here (.nii.gz)
```
- Data requirements:
- MR images: 3D T1/T2 weighted brain MRI in NIfTI format (.nii.gz)
- Synthetic US: Generated from MR using US image synthesizer (required for training)
- Real US: Optional for testing; 3D intraoperative ultrasound volume
- Start training:
```bash
poetry run python example_train.py
```
The system will automatically generate SIFT descriptors and heatmaps for your data.
Note: For optimal results, ensure synthetic US images are generated using a realistic ultrasound synthesis pipeline that preserves anatomical correspondences with the source MR images.
- Automatic preprocessing: SIFT extraction and heatmap generation
- Cross-modal learning: MR-US descriptor matching
- Curriculum training: Progressive hard negative mining
- Rotation invariance: Robust to orientation changes
- Patient-specific: Synthesis-based training approach
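At inference, sparse MR–iUS matches can be obtained by keeping only mutual nearest neighbors in descriptor space. The following is a minimal stdlib sketch of that test; the repository's matcher may differ.

```python
import math

def mutual_nn_matches(desc_a, desc_b):
    """Return index pairs (i, j) where descriptor i in `desc_a` and
    descriptor j in `desc_b` are each other's nearest neighbor."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    # Nearest neighbor of each A-descriptor in B, and vice versa.
    nn_ab = [min(range(len(desc_b)), key=lambda j: dist(a, desc_b[j])) for a in desc_a]
    nn_ba = [min(range(len(desc_a)), key=lambda i: dist(b, desc_a[i])) for b in desc_b]
    # Keep only pairs that agree in both directions.
    return [(i, j) for i, j in enumerate(nn_ab) if nn_ba[j] == i]
```

The mutual check discards one-sided matches, which is a common way to raise precision before estimating a rigid transform from the surviving keypoint pairs.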
This project is licensed under the MIT License - see the LICENSE file for details.
If you find this work useful for your research, please consider citing:
```bibtex
@article{morozov20253dcrossmodalkeypointdescriptor,
  title={A 3D Cross-modal Keypoint Descriptor for MR-US Matching and Registration},
  author={Daniil Morozov and Reuben Dorent and Nazim Haouchine},
  year={2025},
  eprint={2507.18551},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2507.18551},
}
```