Joda

Joint Out-of-Distribution Filtering and Data Discovery Active Learning

🎯 CVPR 2025

Overview

Real-world data is messy: it contains out-of-distribution (OOD) noise and novel categories waiting to be discovered. Previous active learning methods tackle these challenges separately; Joda is the first to solve both at the same time.

Joint Out-of-distribution filtering and data Discovery Active learning (Joda) first filters out OOD data, then selects the most valuable candidates for labeling. Both steps run within a tightly coupled training pipeline that builds a common feature space, aligning known and novel categories while separating OOD noise.
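The filter-then-select order described above can be illustrated with a minimal sketch. This is not Joda's actual scoring or training procedure: the function name `filter_then_select`, the precomputed `ood_scores`, the fixed `ood_threshold`, and the use of predictive entropy as the acquisition score are all assumptions made for illustration, and the sketch leaves out novel category discovery entirely.

```python
import numpy as np

def filter_then_select(probs, ood_scores, ood_threshold, budget):
    """Hypothetical helper: drop likely-OOD pool samples, then pick the most uncertain.

    probs:         (n_samples, n_classes) predicted class probabilities (assumed given)
    ood_scores:    (n_samples,) higher means more likely OOD (assumed given)
    ood_threshold: samples with a score at or above this value are discarded
    budget:        maximum number of samples to send for labeling
    """
    probs = np.asarray(probs, dtype=float)
    ood_scores = np.asarray(ood_scores, dtype=float)

    # Step 1: OOD filtering. Keep only pool indices that look in-distribution.
    keep = np.flatnonzero(ood_scores < ood_threshold)

    # Step 2: candidate selection. Predictive entropy is a generic stand-in
    # acquisition score here, not the criterion used in the paper.
    p = np.clip(probs[keep], 1e-12, 1.0)
    entropy = -(p * np.log(p)).sum(axis=1)

    # Highest entropy first; a stable sort keeps ties in pool order.
    order = np.argsort(-entropy, kind="stable")
    return keep[order[:budget]]
```

For example, with four pool samples where the third has a high OOD score, `filter_then_select([[0.5, 0.5], [0.9, 0.1], [0.5, 0.5], [0.6, 0.4]], [0.1, 0.2, 0.9, 0.3], 0.5, 2)` returns indices 0 and 3: sample 2 is just as uncertain as sample 0 but is filtered out before selection.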

Key Contributions

  • 🔗 Joint framework — first method to combine OOD filtering and novel category discovery in active learning
  • ⚡ Highly efficient — no auxiliary models, no training access to the unlabeled pool needed
  • 📊 Extensive evaluation — tested across 18 configurations and 3 metrics, consistently achieving the best accuracy and the strongest class discovery-to-OOD filtering balance
  • 🥇 Outperforms all compared state-of-the-art approaches

Why It Matters

Labeling data is expensive. Joda makes every label count by intelligently filtering noise and discovering new categories — bringing active learning one step closer to practical, real-world deployment.

(Schmidt et al., 2025)

References

2025

  1. Joint Out-of-Distribution Filtering and Data Discovery Active Learning
    Sebastian Schmidt, Leonard Schenk, Leo Schwinn, and 1 more author
    In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Jun 2025