Authors: xxx
Efficient 3D perception is critical for autonomous systems like self-driving vehicles and drones to operate safely in dynamic environments. Accurate 3D object detection from LiDAR data faces challenges due to the irregularity and high volume of point clouds, inference latency variability from contention and content dependence, and embedded hardware constraints. Balancing accuracy and latency under dynamic conditions is crucial, yet existing frameworks like Chanakya [NeurIPS ’23], LiteReconfig [EuroSys ’22], and AdaScale [MLSys ’19] struggle with the unique demands of 3D detection. We present Agile3D, the first adaptive 3D system to integrate a cross-model Multi-branch Execution Framework (MEF) and a Contention- and Content-Aware RL-based controller (CARL). CARL dynamically selects the optimal execution branch using five novel MEF control knobs: partitioning format, spatial resolution, spatial encoding, 3D feature extractors, and detection heads. CARL employs a dual-stage optimization strategy: supervised pretraining for robust initial learning, followed by Direct Preference Optimization (DPO) for fine-tuning without manually tuned rewards, inspired by techniques for training large language models. Comprehensive evaluations show that Agile3D achieves state-of-the-art performance, maintaining high accuracy across varying hardware contention levels and latency budgets of 100-500 ms. On NVIDIA Orin and Xavier GPUs, it consistently leads the Pareto frontier, outperforming existing methods for robust, efficient 3D object detection.
We provide both an embedded GPU and a desktop GPU for evaluation. Access details are below.
Due to university IT security restrictions, we use ZeroTier to provide external access. Install ZeroTier on your local computer: https://www.zerotier.com/download/
Join our ZeroTier network with ID: a09acf02337ca32e
We will authorize your device within 24 hours.
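On Linux or macOS, the network can also be joined from a terminal with `zerotier-cli`, the command-line client installed alongside ZeroTier. The following is a minimal sketch rather than part of the artifact: the helper names `is_valid_network_id` and `join_agile3d_network` are hypothetical, and the sketch assumes `zerotier-cli` is on your `PATH` and that you have `sudo` rights.

```shell
# Hypothetical helpers (not part of the artifact).
NETWORK_ID="a09acf02337ca32e"

# ZeroTier network IDs are 16 lowercase hexadecimal characters;
# reject anything else before calling the client.
is_valid_network_id() {
  printf '%s\n' "$1" | grep -Eq '^[0-9a-f]{16}$'
}

# Join the network, then list joined networks so the membership
# status can be inspected.
join_agile3d_network() {
  is_valid_network_id "$NETWORK_ID" || {
    echo "bad network ID: $NETWORK_ID" >&2
    return 1
  }
  sudo zerotier-cli join "$NETWORK_ID" &&
    sudo zerotier-cli listnetworks
}

# Usage (run manually once ZeroTier is installed):
#   join_agile3d_network
```

Until the device has been authorized, the network entry printed by `listnetworks` is expected to show an access-denied status; the SSH commands will only work after that status changes.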
# SSH into the NVIDIA Jetson Orin (embedded GPU)
ssh -i mobisys2025.pem agile3d@172.30.53.226
# SSH into the desktop GPU for evaluation
ssh -i mobisys2025.pem agile3d@172.30.166.233
# Access the Docker environment (already running on both GPUs)
docker exec -it mobisys2025 /bin/bash
If you use our provided systems, you can use the pre-configured conda environment:
conda activate agile3d
For installation on your own systems, please refer to our installation guides:
[40 human-minutes + 12 compute-hours]
This experiment evaluates Agile3D's accuracy and latency performance on the NVIDIA Jetson Orin under different contention levels. The results correspond to the values presented in Fig. 7 [Left] CARL + MEF and Sec. 5.2.
Expected performance:
- Contention level 1: 71.72% accuracy, 362 ms latency
- Contention level 2: 70.98% accuracy, 415 ms latency
- Contention level 3: 70.03% accuracy, 468 ms latency
- Contention level 4: 68.72% accuracy, 476 ms latency
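To compare your own numbers against the expected values above, a small check along the following lines may help. It is a sketch under stated assumptions: the helper names are hypothetical, and the tolerances (0.5 percentage points for accuracy, 25 ms for latency) are illustrative choices, not thresholds defined by the artifact.

```shell
# Hypothetical helper: succeeds when |measured - expected| <= tolerance.
within_tolerance() {
  # $1 = measured, $2 = expected, $3 = allowed absolute difference
  awk -v m="$1" -v e="$2" -v t="$3" \
    'BEGIN { d = m - e; if (d < 0) d = -d; exit !(d <= t) }'
}

# Expected accuracy (%) and latency (ms) per contention level,
# copied from the list above.
expected_accuracy() {
  case "$1" in
    1) echo 71.72 ;; 2) echo 70.98 ;; 3) echo 70.03 ;; 4) echo 68.72 ;;
    *) return 1 ;;
  esac
}
expected_latency() {
  case "$1" in
    1) echo 362 ;; 2) echo 415 ;; 3) echo 468 ;; 4) echo 476 ;;
    *) return 1 ;;
  esac
}

# $1 = contention level, $2 = measured accuracy (%), $3 = measured latency (ms)
# The tolerances 0.5 and 25 are assumptions chosen for illustration.
check_level() {
  within_tolerance "$2" "$(expected_accuracy "$1")" 0.5 &&
    within_tolerance "$3" "$(expected_latency "$1")" 25
}

# Usage: check_level 2 70.85 420 && echo "level 2 matches"
```

Latency in particular can vary between runs on shared hardware, so treat a near miss as a prompt to rerun rather than as a failure.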
# We've already placed a copy of results on the server for your convenience
# If you want to run the experiment yourself on the GPU server:
# SSH into the GPU server and activate environment
ssh -i mobisys2025.pem agile3d@172.30.166.233
docker exec -it mobisys2025 /bin/bash
conda activate agile3d
cd /home/data/agile3d
# Run experiment_1
bash experiment_1.sh
# Evaluate experiment_1 results
bash eval_experiment_1.sh
# ### Recommended starting point ###
# The experiments above take a long time; all results are already on the server
# For fast evaluation
bash eval_experiment_1_short.sh
[20 human-minutes + 3 compute-hours]
This experiment measures the switching overhead when transitioning between different Agile3D branches.
Expected results: Mean switching overhead < 2 ms. The results correspond to the values presented in Fig. 10 and Sec. 5.7.
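The threshold above can be checked mechanically once the per-switch overheads are available as plain numbers. The sketch below assumes one overhead value in milliseconds per line on standard input; that input format and the helper name `mean_below` are assumptions, since the output format of `experiment_2.sh` is not described here.

```shell
# Hypothetical helper: reads one overhead value (ms) per line on stdin,
# prints the mean with three decimals, and succeeds only if the mean is
# strictly below the threshold given as $1. Blank lines are ignored.
mean_below() {
  LC_ALL=C awk -v limit="$1" '
    NF { sum += $1; n++ }
    END {
      if (n == 0) exit 2          # no samples: report an error
      mean = sum / n
      printf "%.3f\n", mean
      exit !(mean < limit)
    }'
}

# Usage with made-up sample values:
#   printf "1.2\n0.8\n1.9\n" | mean_below 2
```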
# SSH into Orin and activate environment
ssh -i mobisys2025.pem agile3d@172.30.53.226
docker exec -it mobisys2025 /bin/bash
conda activate agile3d
cd /home/data/agile3d
# Run experiment_2
bash experiment_2.sh
[60 human-minutes + 24 compute-hours]
This experiment compares Agile3D against static state-of-the-art models in terms of latency and accuracy. The results correspond to the values presented in Fig. 11 and Sec. 5.4.
Expected latency and accuracy (L2 mAP):
- Agile3D: 85-360 ms, 63.71-71.73%
- PV-RCNN: 850 ms, 64.4%
- DSVT-Voxel: 450 ms, 71.7%
- DSVT-Pillar: 310 ms, 70.91%
- PartA2: 390 ms, 64.82%
- SECOND: 105 ms, 58.62%
- PointPillar: 130 ms, 59.15%
- CenterPoint-Voxel: 115 ms, 64.5%
- CenterPoint-Pillar: 170 ms, 63.12%
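To see which static baselines are Pareto-optimal among the numbers listed above (lower latency and higher L2 mAP are both better), the following sketch may be useful. It covers only the static models, because Agile3D is reported as a range rather than a single point. The helper names `baselines` and `pareto_frontier` are hypothetical and not part of the artifact.

```shell
# Static baselines from the list above: name,latency_ms,l2_map_percent
baselines() {
  cat <<'EOF'
PV-RCNN,850,64.4
DSVT-Voxel,450,71.7
DSVT-Pillar,310,70.91
PartA2,390,64.82
SECOND,105,58.62
PointPillar,130,59.15
CenterPoint-Voxel,115,64.5
CenterPoint-Pillar,170,63.12
EOF
}

# Sort by latency (ascending; ties broken by higher accuracy first), then
# keep a model only if it is more accurate than every faster model.
pareto_frontier() {
  LC_ALL=C sort -t, -k2,2n -k3,3nr |
    awk -F, -v best=-1 '($3 + 0) > best { print $1; best = $3 + 0 }'
}

# Usage:
#   baselines | pareto_frontier
```

With the values listed above, this prints SECOND, CenterPoint-Voxel, DSVT-Pillar, and DSVT-Voxel; the other four baselines are each slower and less accurate than at least one of these.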
# We've already placed a copy of results on the server for your convenience
# If you want to run the experiment yourself on Orin:
# SSH into Orin and activate environment
ssh -i mobisys2025.pem agile3d@172.30.53.226
docker exec -it mobisys2025 /bin/bash
conda activate agile3d
cd /home/data/agile3d
# Run experiment_3
bash experiment_3.sh
# For evaluation:
# SSH into GPU server and activate environment
ssh -i mobisys2025.pem agile3d@172.30.166.233
docker exec -it mobisys2025 /bin/bash
conda activate agile3d
cd /home/data/agile3d
# Evaluate experiment_3 results
bash eval_experiment_3.sh
# ### Recommended starting point ###
# The experiments above take a long time; all results are already on the server
# For fast evaluation
bash eval_experiment_3_short.sh
Agile3D is released under the CC BY-NC-ND 4.0 license.
If you find this project useful in your research, please consider citing:
@misc{agil3d,
title={Agile3D: Adaptive Contention- and Content-Aware 3D Object Detection for Embedded GPUs},
author={Wang, Pengcheng and Liu, Zhuoming and Bagchi, Shayok and Xu, Ran and Bagchi, Saurabh and Li, Yin and Chaterji, Somali},
howpublished = {The 23rd ACM International Conference on Mobile Systems, Applications, and Services},
year={2025}
}