ODAL

Scalable Object Detection in the Car Interior With Vision Foundation Models

🚘 IEEE Intelligent Vehicles Symposium (IV) 2026

Overview

Can your car’s personal assistant identify what you left on the back seat? Interior object detection is critical for next-gen vehicle intelligence, but on-board hardware is too constrained to run modern foundation models.

We propose ODAL (Object Detection and Localization), a framework that leverages vision foundation models through a distributed on-board/cloud architecture — bringing the power of large-scale models to the car interior without exceeding hardware limits.

Key Results

  • 🏆 Fine-tuned ODAL-LLaVA achieves an ODAL score of 89%
  • 📈 71% improvement over the LLaVA 1.5 7B baseline
  • 💪 Outperforms GPT-4o by nearly 20%
  • 🔇 3× higher signal-to-noise ratio than GPT-4o — significantly fewer hallucinations

Key Contributions

  • 🏗️ Distributed architecture — splits computation between on-board and cloud to overcome resource constraints
  • 📏 ODALbench — a new comprehensive metric for evaluating detection and localization quality
  • 🔬 Foundation model comparison — systematic evaluation of GPT-4o vs. lightweight LLaVA models
  • 🎯 Fine-tuning wins — demonstrates that a small, fine-tuned model can decisively beat a much larger general-purpose one
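To make the on-board/cloud split concrete, here is a minimal sketch of how such a distributed pipeline can be wired: the vehicle side only captures and downscales frames, while the foundation model runs remotely and its detections are mapped back to camera coordinates. Every name, the preprocessing step, and the stubbed result format below are illustrative assumptions, not the interface from the paper.

```python
from dataclasses import dataclass

# Hypothetical sketch of a distributed on-board/cloud detection pipeline.
# All class names and the result format are illustrative assumptions.

@dataclass
class Detection:
    label: str
    bbox: tuple  # (x, y, w, h) in original camera coordinates

class OnboardClient:
    """Runs on constrained vehicle hardware: downscaling only, no inference."""
    def __init__(self, target_size=(336, 336)):
        self.target_size = target_size

    def scale_factors(self, frame_shape):
        # Scale factors needed to map cloud detections back to the full
        # camera resolution; the heavy model never sees full-size input.
        h, w = frame_shape
        return (w / self.target_size[0], h / self.target_size[1])

class CloudDetector:
    """Stand-in for the remote vision foundation model endpoint."""
    def detect(self):
        # Fixed stub response in place of real model inference.
        return [Detection("backpack", (10, 20, 50, 40))]

def detect_interior_objects(frame_shape):
    client = OnboardClient()
    sx, sy = client.scale_factors(frame_shape)
    # Rescale the cloud model's detections to the original frame.
    return [
        Detection(d.label, (d.bbox[0] * sx, d.bbox[1] * sy,
                            d.bbox[2] * sx, d.bbox[3] * sy))
        for d in CloudDetector().detect()
    ]
```

The point of the split is that only lightweight preprocessing and coordinate bookkeeping stay on the vehicle, so on-board compute stays within budget regardless of the size of the remote model.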

Why It Matters

This work shows that you don’t need the biggest model to get the best results. With smart fine-tuning and a distributed architecture, lightweight models can outperform GPT-4o for targeted automotive applications.

(Schmidt et al., 2026)

References

2026

  1. Scalable Object Detection in the Car Interior With Vision Foundation Models
    Sebastian Schmidt, Bálint Mészáros, Ahmet Firintepe, et al.
    In Proceedings of the IEEE Intelligent Vehicles Symposium (IV), Oct 2026