Ego-view Accident Video Diffusion


This is our benchmark for ego-view accident video generation driven by the text descriptions annotated in MM-AU. Performance is measured by CLIP score (CLIPs), Fréchet Video Distance (FVD), and frames per second (FPS); higher is better for CLIPs and FPS, and lower is better for FVD. We aim to explore the cause-effect evolution of accident videos conditioned on descriptions of accident reasons or prevention advice.
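As a rough illustration of two of these metrics, the sketch below computes a CLIP-style score and a generation speed in plain Python. It assumes the text and frame embeddings have already been produced by a CLIP model, and that the score is the mean frame-to-text cosine similarity scaled by 100; the benchmark's exact evaluation protocol is not specified here, so the function names and the scaling are illustrative assumptions. FVD is omitted because it requires features from a pretrained video network.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def clip_score(text_embedding, frame_embeddings, scale=100.0):
    """Mean text-to-frame similarity, clipped at zero and scaled.

    Assumption: embeddings are precomputed by a CLIP model; the
    scale of 100 is a common convention, not confirmed by the benchmark.
    """
    sims = [max(cosine_similarity(text_embedding, f), 0.0)
            for f in frame_embeddings]
    return scale * sum(sims) / len(sims)


def frames_per_second(num_frames, elapsed_seconds):
    """Generation speed: frames produced per second of wall-clock time."""
    return num_frames / elapsed_seconds


# Toy example with hypothetical 2-D embeddings (real CLIP vectors are much larger).
text = [1.0, 0.0]
frames = [[1.0, 0.0], [0.0, 1.0]]
score = clip_score(text, frames)
fps = frames_per_second(16, 8.0)
```

In the toy example one frame matches the text exactly and the other is orthogonal to it, so the score falls halfway along the scale.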

Leaderboard

ID  Method           Year  Code  CLIPs  FVD      FPS  Environment
1   Tune-A-Video     2023  Link  21.77  9545.6   1.7  RTX 3090
2   ControlVideo     2023  Link  22.51  12275.2  0.5  RTX 3090
3   OVAD             2023  Link  27.24  5238.1   1.2  RTX 3090
4   ModelScope T2V   2023  Link  27.15  5088.7   1.3  RTX 3090
5   Text2Video-Zero  2023  Link  27.89  12547.0  1.1  RTX 3090
6   Causal-VidSyn    2025  Link  28.70  5352.9   ...  RTX 3090

Submit Your Results

You can submit your metric values via the provided form. We would also appreciate clear links to the relevant paper and code to support more in-depth analysis.
