CV

Basics

Name	Sebastian Schmidt
Label	Scientist
Email	sebastian95 [at] tum [dot] de
Url	https://basti-schmidt.eu/
Summary	I am a researcher and developer focused on computer vision, AI, and autonomous driving, with several top-tier publications, 8 years of industry and autonomous driving experience, and 2 years of startup experience. Across various roles, I have gained expertise in data engineering, perception, machine learning, generative AI, and project management.

Education

2022.04 - 2025.09

Adelaide, Australia
PhD

University of Adelaide

Computer Science
2022.04 - 2025.09

Munich, Germany
PhD

Technical University of Munich

Machine Learning Computer Science
2017.04 - 2019.11

Munich, Germany
Master of Science

Technical University of Munich

Mechatronics and Information Technology

Publications

2026.08.01

Scalable Object Detection in the Car Interior With Vision Foundation Models

IEEE Intelligent Vehicles Symposium (IV)

We propose the ODAL framework for car interior scene understanding using vision foundation models via a distributed on-board/cloud architecture. Our fine-tuned ODAL-LLaVA achieves 89% ODAL score, outperforming GPT-4o by nearly 20%.
2026.01.01

Amplified Patch-Level Differential Privacy for Free via Random Cropping

Transactions on Machine Learning Research (TMLR)

We show that random cropping, a standard data augmentation, naturally amplifies patch-level differential privacy — providing stronger privacy guarantees for free without additional computation or architectural changes.
2025.11.01

Unexplored Flaws in Multiple-Choice VQA Evaluations

ArXiv

A large-scale study across 7 MLLMs, 5 VQA datasets, and 48 prompt format variations revealing that multiple-choice VQA benchmarks are highly sensitive to semantically neutral prompt formatting changes — biases that existing mitigations fail to address.
2025.10.01

A Machine Learning Perspective on Automated Driving Corner Cases

ArXiv

A distributional perspective on corner cases that unifies existing taxonomies, achieves strong OOD detection benchmark performance, and enables analysis of combined corner cases via a new fog-augmented Lost & Found dataset.
2025.10.01

GeoDiffusion: A Training-Free Framework for Accurate 3D Geometric Conditioning in Image Generation

IEEE/CVF International Conference on Computer Vision (ICCV)

A training-free framework that enables precise 3D geometric conditioning in diffusion-based image generation without any fine-tuning, maintaining high image quality while enforcing geometric constraints.
2025.10.01

Prior2Former - Evidential Modeling of Mask Transformers for Assumption-Free Open-World Panoptic Segmentation

IEEE/CVF International Conference on Computer Vision (ICCV) — Highlight

The first evidential learning approach for segmentation vision transformers. P2F incorporates a Beta prior for pixel-wise uncertainty, enabling state-of-the-art anomaly instance and open-world panoptic segmentation without access to OOD data.
2025.06.01

Effective Data Pruning through Score Extrapolation

ArXiv

A score extrapolation framework using kNN and GNNs to predict sample importance for the entire dataset from a small training subset, enabling effective data pruning across supervised, unsupervised, and adversarial paradigms.
2025.06.01

Joint Out-of-Distribution Filtering and Data Discovery Active Learning

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Joda is the first method to jointly address OOD filtering and novel category discovery in active learning. Highly efficient with no auxiliary models needed, it consistently achieves the best accuracy across 18 configurations.
2025.01.01

A Unified Approach Towards Active Learning and Out-of-Distribution Detection

Transactions on Machine Learning Research (TMLR)

A unified framework that bridges active learning and OOD detection, enabling principled sample selection that simultaneously identifies informative in-distribution samples and detects distributional shifts.
2024.10.01

Deep Sensor Fusion with Constraint Safety Bounds for High Precision Localization

IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)

A deep sensor fusion approach embedding constraint safety bounds directly into the learning pipeline, achieving high-precision localization with formal safety guarantees for autonomous driving.
2024.10.01

Generalized Synchronized Active Learning for Multi-Agent-Based Data Selection on Mobile Robotic Systems

IEEE Robotics and Automation Letters (RA-L)

A synchronized active learning framework that coordinates data selection across multiple mobile robotic agents, exploiting spatial and temporal correlations to enable diverse, non-redundant data collection for robot fleets.
2023.11.01

Stream-based Active Learning by Exploiting Temporal Properties in Perception with Temporal Predicted Loss

British Machine Vision Conference (BMVC)

A stream-based active learning method leveraging temporal properties of perception data to make real-time labeling decisions on live data streams without requiring a stored data pool.
2020.06.01

Advanced Active Learning Strategies for Object Detection

IEEE Intelligent Vehicles Symposium (IV)

Novel ensemble-based uncertainty estimation for 2D and 3D object detection with active learning, achieving ~55% time savings, ~30% data savings, and 35% labeling effort reduction for automotive use cases.

Skills

	Machine Learning & AI
	Deep Learning
	Active Learning
	Out-of-Distribution Detection
	Uncertainty Estimation
	Computer Vision
	Generative AI

	Programming & Frameworks
	Python
	PyTorch
	TensorFlow
	C#
	Scala
	PySpark

	Infrastructure & DevOps
	Docker
	Kubernetes
	Azure
	AWS
	Airflow
	GitHub Actions

Languages

	German
	Native speaker

	English
	Fluent

Projects

2018.01 - 2019.01
TUM Phoenix Robotics

Student robotics team at TUM developing autonomous systems for international robotics competitions, contributing to the perception and navigation pipeline.
- Autonomous Navigation
- Perception Pipeline
