Héctor Carrión* · Yutong Bai* · Víctor A. Hernández Castro* · Kishan Panaganti · Matthew Trang · Ayush Zenith · Tony Zhang · Pietro Perona · Jitendra Malik
(* equal contribution)
Check your system's CUDA version with nvcc:
nvcc --version
Create and activate a virtual environment with the required Python dependencies:
conda env create -f gpu_environment.yml -n tardis
conda activate tardis
Another approach is to build from our Dockerfile:
docker build -f Dockerfile --platform=linux/amd64 -t tardis .
The full tokenized dataset is available as two downloadable files in a public GCS bucket:
gsutil -m cp gs://tera-tardis/STRIDE-1/training.jsonl . # ~327GB
gsutil -m cp gs://tera-tardis/STRIDE-1/testing.jsonl . # ~9GB
The checkpoint/state used to evaluate the model was saved in MessagePack format and is available as this downloadable file:
gsutil -m cp gs://tera-tardis/STRIDE-1/checkpoint.msgpack . # ~10GB
To train on a single VM, you may use this script:
EasyLM/scripts/train.sh
To train using Kubernetes, submit the Kubernetes Job as described in .kubernetes/setup-cluster.sh.
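Before launching a training run, it can help to look at a few records of the downloaded JSONL files without loading them into memory (training.jsonl is roughly 327GB). The following is a minimal sketch, not part of this repository: the helper names `iter_jsonl` and `peek` are hypothetical, and it assumes only that each line of the file is one JSON value. It makes no assumption about the field names inside each record.

```python
import itertools
import json
from pathlib import Path
from typing import Any, Iterator, List


def iter_jsonl(path: Path) -> Iterator[Any]:
    """Yield one parsed JSON value per non-empty line, reading lazily."""
    with path.open("r", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


def peek(path: Path, n: int = 3) -> List[Any]:
    """Return the first n records; the rest of the file is never read."""
    return list(itertools.islice(iter_jsonl(path), n))


if __name__ == "__main__":
    sample_path = Path("testing.jsonl")  # the smaller (~9GB) file from the bucket
    if sample_path.exists():
        for record in peek(sample_path):
            # Show only top-level keys, since values may be long token lists.
            if isinstance(record, dict):
                print(sorted(record.keys()))
            else:
                print(type(record).__name__)
```

Because `iter_jsonl` is a generator, the same function can be reused to stream the full file one record at a time.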
We only provide evaluation code for a single-VM configuration, as opposed to distributed setups.
gsutil -m cp -r gs://tera-tardis/STRIDE-1/checkpoint.msgpack .
python -m EasyLM.models.llama.convert_easylm_to_hf \
--load_checkpoint='trainstate_params::checkpoint.msgpack' \
--model_size='vqlm_1b' \
--output_dir='.'
For a more detailed breakdown of the evaluation, please see this notebook.
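Since the conversion step reads checkpoint.msgpack from the working directory, a quick pre-flight check can catch a missing or truncated download before any long job starts. This is a minimal sketch under stated assumptions: the function `check_download` and the `min_fraction` threshold are hypothetical, and the expected sizes are only the approximate figures quoted in this README, so the check flags files that are absent or far too small rather than verifying integrity.

```python
from pathlib import Path

# Approximate sizes quoted in this README, in GB. These are rough figures,
# so the check only flags files that are missing or far too small.
EXPECTED_GB = {
    "training.jsonl": 327,
    "testing.jsonl": 9,
    "checkpoint.msgpack": 10,
}


def check_download(path: Path, expected_gb: float, min_fraction: float = 0.5) -> str:
    """Classify a file as 'missing', 'too_small', or 'ok' by its size on disk."""
    if not path.is_file():
        return "missing"
    size_gb = path.stat().st_size / 1e9
    # min_fraction is an arbitrary illustrative threshold, not a project setting.
    if size_gb < expected_gb * min_fraction:
        return "too_small"
    return "ok"


if __name__ == "__main__":
    for name, gb in EXPECTED_GB.items():
        print(f"{name}: {check_download(Path(name), gb)}")
```

A size check like this does not replace a checksum comparison; it is only a cheap first signal.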
The dataset consists of Google Street View data that has been thoroughly cleansed and blurred to protect people's privacy, and it is free of ill intent, nudity, and sensitive information. For more information, refer to Google's policy.
- Héctor Carrión
- Yutong Bai
- Víctor A. Hernández Castro
- Kishan Panaganti
- Ayush Zenith
- Matthew Trang
- Tony Zhang
If you find this code or work useful in your own research, please consider citing it as follows:
@article{carrion2025_tardis_stride,
title={{TARDIS STRIDE}: A Spatio-Temporal Road Image Dataset for Exploration and Autonomy},
author={Carrión, Héctor and Bai, Yutong and Hernández Castro, Víctor A. and Panaganti, Kishan and Zenith, Ayush and Trang, Matthew and Zhang, Tony and Perona, Pietro and Malik, Jitendra},
journal={arXiv preprint},
year={2025},
}