Heterogeneous LiDAR Early Fusion and Learned Re-Ranking Strategy for Robust Long-Term Place Recognition in Unstructured Environments
Judith Vilella-Cantos, Juan José Cabrera, Mónica Ballesta, David Valiente, Luis Payá
- Published
- Jun 11, 2026 — 15:52 UTC
Problem
The paper addresses the challenge of robust localization in unstructured environments, specifically in agricultural settings like vineyards, where traditional place recognition methods struggle due to repetitive features and varying environmental conditions. The authors highlight a gap in existing literature regarding the effective integration of heterogeneous LiDAR data from multiple sensors for improved place recognition. This work is presented as a preprint, indicating it has not yet undergone peer review.
Method
The core technical contribution is the MinkUNeXt-VINE++ architecture, which employs early fusion of heterogeneous LiDAR data from two distinct sensors: Livox Mid-360 and Velodyne VLP-16. This approach capitalizes on the complementary strengths of each sensor, enhancing the environmental representation. The architecture integrates a learned re-ranking strategy during inference, which is crucial for improving recognition accuracy in repetitive environments. The training process utilizes the TEMPO-VINE dataset, which consists of heterogeneous LiDAR data collected from vineyards at various phenological stages. The authors do not disclose specific training compute resources or hyperparameters, focusing instead on the architectural innovations.
Results
MinkUNeXt-VINE++ demonstrates significant performance improvements over baseline methods. Specifically, it achieves a 20% enhancement in the Recall@1 metric compared to single-sensor approaches. When incorporating the learned re-ranking strategy, the performance improves by an additional 30%. These results are benchmarked against state-of-the-art methods on the TEMPO-VINE dataset, underscoring the effectiveness of the proposed approach in challenging agricultural environments.
Limitations
The authors acknowledge that their method may be sensitive to the quality and calibration of the LiDAR sensors used, which could affect the robustness of the place recognition in highly variable environments. Additionally, the reliance on a specific dataset (TEMPO-VINE) may limit the generalizability of the findings to other unstructured environments. The paper does not address potential computational overhead introduced by the re-ranking strategy, which could impact real-time applications.
Why it matters
This research has significant implications for the development of autonomous systems operating in complex agricultural settings, where accurate localization is critical for tasks such as precision farming and automated harvesting. The integration of heterogeneous LiDAR data and a learned re-ranking strategy could pave the way for more robust and reliable place recognition systems. This work contributes to the ongoing discourse in the field of computer vision and robotics, particularly in enhancing the capabilities of autonomous systems in unstructured environments, as published in arXiv.
By Turing Wire editorial staff · Jun 11, 2026 · Editorial standards →
Source: arXiv cs.CV