We present AnyThermal, a task-agnostic thermal feature extraction backbone designed to provide strong generalization across diverse robotic perception tasks and environments.
AnyThermal is distilled from large-scale RGB foundation models (DINOv2) using synchronized RGB–thermal data collected from urban, indoor, aerial, and off-road domains.
To support this distillation, we introduce the TartanRGBT Platform, an open-source hardware system for synchronized RGB–thermal data collection, and the TartanRGBT Dataset, a diverse multi-environment RGB–thermal dataset.
AnyThermal achieves state-of-the-art performance on cross-modal place recognition, thermal segmentation, and monocular thermal depth estimation, with improvements of up to 36% over existing baselines.
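
As an illustration of the cross-modal distillation idea summarized above, the following is a minimal sketch of one possible feature-matching objective: a thermal student's patch features are pulled toward the features a frozen RGB teacher produces for the synchronized RGB frame. The cosine-distance loss, the function names, and the toy feature vectors are all assumptions made for illustration; the abstract does not specify the actual training objective.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Small epsilon avoids division by zero for all-zero vectors.
    return dot / (norm_a * norm_b + 1e-12)


def distillation_loss(student_feats, teacher_feats):
    """Mean (1 - cosine similarity) over paired patch features.

    student_feats: features from the thermal student (hypothetical).
    teacher_feats: features from the frozen RGB teacher for the
                   synchronized RGB frame (hypothetical).
    Assumed objective for illustration only.
    """
    if not student_feats or len(student_feats) != len(teacher_feats):
        raise ValueError("expected two non-empty, equally sized feature lists")
    per_patch = [
        1.0 - cosine_similarity(s, t)
        for s, t in zip(student_feats, teacher_feats)
    ]
    return sum(per_patch) / len(per_patch)


# Toy example: two patches with 3-dimensional features each.
teacher = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]]
student = [[0.5, 0.0, 0.0], [0.0, 0.0, 1.0]]
loss = distillation_loss(student, teacher)
```

In this toy case the first patch is perfectly aligned with the teacher (its loss term is close to 0) and the second is orthogonal (its loss term is close to 1), so the mean is close to 0.5. A cosine objective ignores feature magnitude, which is one common design choice when the student and teacher see inputs with very different intensity statistics.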