# Simplifying Traffic Anomaly Detection with Video Foundation Models

Svetlana Orlova, Tommie Kerssies, Brunó B. Englert, Gijs Dubbelman

Eindhoven University of Technology
Recent methods for ego-centric Traffic Anomaly Detection (TAD) often rely on complex multi-stage or multi-representation fusion architectures, yet it remains unclear whether such complexity is necessary. Recent findings in visual perception suggest that foundation models, enabled by advanced pre-training, allow simple yet flexible architectures to outperform specialized designs. Therefore, in this work, we investigate an architecturally simple encoder-only approach using plain Video Vision Transformers (Video ViTs) and study how pre-training enables strong TAD performance. We find that: (i) advanced pre-training enables simple encoder-only models to match or even surpass the performance of specialized state-of-the-art TAD methods, while also being significantly more efficient; (ii) although weakly- and fully-supervised pre-training are advantageous on standard benchmarks, they are less effective for TAD, where self-supervised Masked Video Modeling (MVM) provides the strongest signal; and (iii) Domain-Adaptive Pre-Training (DAPT) on unlabeled driving videos further improves downstream performance, without requiring anomalous examples. Our findings highlight the importance of pre-training and show that effective, efficient, and scalable TAD models can be built with minimal architectural complexity.
Figure: Video ViT-based encoder-only models set a new state of the art on both datasets, while being significantly more efficient than top-performing specialized methods. FPS measured using an NVIDIA A100 MIG GPU instance. † From prior work. ‡ Optimistic estimates using publicly available components of the model. “A→B”: trained on A, tested on B; D2K: DADA-2000.
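For intuition, the sketch below shows the kind of encoder-only setup studied here: a plain Video ViT encoder followed by a single linear head that classifies a clip as normal or anomalous. The class/module names, token pooling, and tensor shapes are illustrative assumptions rather than the exact training configuration; see TRAIN.md for the actual recipes.

```python
# Minimal sketch of an encoder-only TAD model: a plain Video ViT encoder
# followed by a linear classification head. The encoder is a stand-in here;
# any Video ViT that maps a clip to a sequence of patch tokens can be used.
import torch
import torch.nn as nn


class EncoderOnlyTAD(nn.Module):
    def __init__(self, encoder: nn.Module, embed_dim: int = 768, num_classes: int = 2):
        super().__init__()
        self.encoder = encoder                         # plain Video ViT backbone
        self.norm = nn.LayerNorm(embed_dim)
        self.head = nn.Linear(embed_dim, num_classes)  # normal vs. anomalous

    def forward(self, clip: torch.Tensor) -> torch.Tensor:
        # clip: (B, C, T, H, W) -> tokens: (B, N, D)
        tokens = self.encoder(clip)
        pooled = tokens.mean(dim=1)                    # mean-pool space-time tokens
        return self.head(self.norm(pooled))            # per-clip logits


if __name__ == "__main__":
    # Dummy encoder producing random tokens, for a shape check only.
    class DummyEncoder(nn.Module):
        def forward(self, x: torch.Tensor) -> torch.Tensor:
            return torch.randn(x.shape[0], 8 * 14 * 14, 768)

    model = EncoderOnlyTAD(DummyEncoder())
    logits = model(torch.randn(2, 3, 16, 224, 224))    # 16-frame 224x224 clip
    print(logits.shape)                                # torch.Size([2, 2])
```

The comparisons in the paper then come primarily from which pre-trained weights initialize the encoder, rather than from architectural changes.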
## Model Zoo

We provide pre-trained and fine-tuned models in MODEL_ZOO.md.

## Installation

Please follow the instructions in INSTALL.md.

## Data Preparation

Please follow the instructions in DATASET.md.

## Training

Please follow the instructions in TRAIN.md.

## Running

Please follow the instructions in RUN.md.
## Contact

Svetlana Orlova: [email protected], [email protected]
## Acknowledgements

Our code is mainly based on the VideoMAE codebase. For models whose Video ViT architecture is identical to ours, we used only their released weights: ViViT, VideoMAE2, SMILE, SIGMA, MME, and MGMAE (see the loading sketch below). We used fragments of the original implementations of MVD, InternVideo2, and UMT to integrate these models with our codebase.
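Because these backbones share the plain ViT architecture, switching between pre-training methods amounts to loading a different checkpoint into the same model. Below is a minimal sketch of such loading; the nesting keys and prefixes handled here are common conventions, not the exact format of every release.

```python
# Sketch: load released Video ViT weights into an identical architecture.
# Checkpoints from different projects nest their state dicts differently
# ("model", "module", "state_dict", ...) and may prefix parameter names;
# the keys and prefixes below are common conventions, not an exhaustive list.
import torch


def load_video_vit_weights(model: torch.nn.Module, ckpt_path: str) -> None:
    ckpt = torch.load(ckpt_path, map_location="cpu")

    # Unwrap the state dict if it is nested under a common key.
    for key in ("model", "module", "state_dict"):
        if isinstance(ckpt, dict) and key in ckpt:
            ckpt = ckpt[key]
            break

    # Strip common wrapper prefixes (e.g. from DDP or encoder-decoder wrappers).
    cleaned = {}
    for name, tensor in ckpt.items():
        for prefix in ("module.", "encoder.", "backbone."):
            if name.startswith(prefix):
                name = name[len(prefix):]
        cleaned[name] = tensor

    # strict=False: pre-training decoder/head weights are simply ignored.
    missing, unexpected = model.load_state_dict(cleaned, strict=False)
    print(f"missing keys: {len(missing)}, unexpected keys: {len(unexpected)}")
```

In practice, individual checkpoints may need additional key remapping beyond this.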
## License

The majority of this project is released under the CC-BY-NC 4.0 license, as found in the LICENSE file. Portions of the project are available under separate license terms: ViViT, InternVideo2, SlowFast, and pytorch-image-models are licensed under the Apache 2.0 license. VideoMAE2, SMILE, MGMAE, UMT, and BEiT are licensed under the MIT license. SIGMA is licensed under the BSD 3-Clause Clear license.
## Citation

If you find this project helpful, please feel free to leave a star ⭐️ and cite our paper:

@inproceedings{orlova2025simplifying,
  title={Simplifying Traffic Anomaly Detection with Video Foundation Models},
  author={Orlova, Svetlana and Kerssies, Tommie and Englert, Brun{\'o} B and Dubbelman, Gijs},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  year={2025}
}

@article{orlova2025simplifying,
  title={Simplifying Traffic Anomaly Detection with Video Foundation Models},
  author={Orlova, Svetlana and Kerssies, Tommie and Englert, Brun{\'o} B and Dubbelman, Gijs},
  journal={arXiv preprint arXiv:2507.09338},
  year={2025}
}

