GitHub - olivesgatech/R3D: Countering Multi-modal Representation Collapse through Rank-targeted Fusion (WACV 2026)

Countering Multi-modal Representation Collapse through Rank-targeted Fusion (WACV 2026)

Links

Paper (arXiv): https://arxiv.org/abs/2511.06450

This repository provides the official code and experiments for “Countering Multi-modal Representation Collapse through Rank-targeted Fusion.”

Multi-modal fusion often suffers from two coupled failure modes:

- Feature collapse: representation diversity shrinks as variation concentrates in only a few directions.
- Modality collapse: one dominant modality overwhelms the other, reducing balanced multi-modal reasoning.

TL;DR

We propose effective rank as a unified measure to quantify and counter both collapses. We introduce the Rank-enhancing Token Fuser, a theoretically grounded fusion method that selectively blends less-informative features from one modality with complementary features from another, increasing the effective rank of the fused representation. To further address modality collapse, we analyze modality pairings and show that depth helps preserve representational balance when fused with RGB. We validate the approach on human action anticipation and action segmentation, demonstrating improvements across diverse datasets.
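To make the effective rank measure concrete, here is a minimal NumPy sketch. It assumes the common entropy-based definition (the exponential of the Shannon entropy of the normalized singular values); the exact formulation used in this repository may differ. The function name `effective_rank` and the toy matrices are illustrative and are not taken from the codebase.

```python
import numpy as np


def effective_rank(features: np.ndarray, eps: float = 1e-12) -> float:
    """Entropy-based effective rank of a (num_samples, dim) feature matrix.

    Assumed definition: exp(H(p)), where p is the singular value
    distribution normalized to sum to one.
    """
    singular_values = np.linalg.svd(features, compute_uv=False)
    total = singular_values.sum()
    if total <= eps:
        return 0.0  # all-zero features carry no directions at all
    p = singular_values / total
    p = p[p > eps]  # drop numerically zero entries before taking the log
    entropy = -np.sum(p * np.log(p))
    return float(np.exp(entropy))


rng = np.random.default_rng(0)

# Collapsed features: every row lies along a single direction (rank 1).
direction = rng.normal(size=(1, 16))
collapsed = rng.normal(size=(64, 1)) @ direction

# Diverse features: variation is spread across all 16 dimensions.
diverse = rng.normal(size=(64, 16))

print("collapsed:", effective_rank(collapsed))  # close to 1
print("diverse:  ", effective_rank(diverse))    # much higher, at most 16
```

Under this definition the value ranges from 1 (all variation in one direction) up to the feature dimension (variation spread evenly), so a falling effective rank is one way to observe feature collapse during training.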

Training R3D

To run experiments for each dataset, execute the corresponding main script below:

DARai
```
python3 main_darai.py
```
UTKinects
```
python3 main_utkinects.py
```
NTURGBD
```
python3 main_nturgbd.py
```
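
The three commands above can also be launched one after another from a small driver. This is a hypothetical convenience sketch, not a script shipped with the repository: the helper names `build_command` and `run_all` are made up for illustration, and it assumes it is run from the repository root with the same interpreter that has the dependencies installed. It defaults to a dry run that only prints the commands.

```python
import subprocess
import sys

# Script names come from the commands above; the keys are only labels.
DATASET_SCRIPTS = {
    "DARai": "main_darai.py",
    "UTKinects": "main_utkinects.py",
    "NTURGBD": "main_nturgbd.py",
}


def build_command(dataset: str) -> list[str]:
    """Return the command that launches the main script for one dataset."""
    return [sys.executable, DATASET_SCRIPTS[dataset]]


def run_all(dry_run: bool = True) -> list[list[str]]:
    """Print each command and, unless dry_run is set, run them sequentially."""
    commands = [build_command(name) for name in DATASET_SCRIPTS]
    for command in commands:
        print(" ".join(command))
        if not dry_run:
            # check=True stops the sequence if one experiment fails.
            subprocess.run(command, check=True)
    return commands


if __name__ == "__main__":
    run_all(dry_run=True)
```

Call `run_all(dry_run=False)` to actually start training.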

Citation

@inproceedings{kim2026ranktargetedfusion,
  title     = {Countering Multi-modal Representation Collapse through Rank-targeted Fusion},
  author    = {Kim, Seulgi and Kokilepersaud, Kiran and Prabhushankar, Mohit and AlRegib, Ghassan},
  booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
  year      = {2026}
}

Folders and files

data
evaluation
loss
model
scripts
train
.gitignore
README.md
Untitled.ipynb
futr.yaml
main.py
main_baseline.py
main_darai.py
main_nturgbd.py
main_proposed.py
main_proposed_50salads.py
main_utkinects.py
opts.py
pipeline.png
predict.py
run.sh
utils.py
