**Deep Video Discovery (DVD)** is a deep-research style question answering agent designed for understanding extra-long videos.
https://github.com/microsoft/DeepVideoDiscovery.git
This repository contains the official implementation of the paper *Deep Video Discovery: Agentic Search with Tool Use for Long-form Video Understanding*, which achieves state-of-the-art performance by a large margin on multiple long-video benchmarks, including the challenging LVBench.
Use `lite_mode` to enable a lightweight version of the agent that relies only on subtitles. Good for YouTube podcast analysis!

Leveraging the powerful capabilities of large language models (LLMs), DVD effectively interprets and processes extensive video content to answer complex user queries.
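To illustrate the subtitles-only idea behind `lite_mode`, here is a minimal sketch (not the repository's actual code) of collapsing an SRT subtitle file into plain text that could be fed to an LLM:

```python
def srt_to_text(srt: str) -> str:
    """Collapse an SRT subtitle file into plain text.

    A sketch of the kind of subtitles-only preprocessing a lite mode
    could use; this is illustrative, not the repo's implementation.
    """
    kept = []
    for block in srt.strip().split("\n\n"):
        for line in block.splitlines():
            # Skip cue indices ("1") and timing lines ("00:00:01,000 --> ...").
            if line.strip().isdigit() or "-->" in line:
                continue
            kept.append(line)
    return " ".join(kept)

sample = """1
00:00:01,000 --> 00:00:04,000
Welcome to the podcast.

2
00:00:04,500 --> 00:00:08,000
Today we discuss long videos."""

print(srt_to_text(sample))
# Welcome to the podcast. Today we discuss long videos.
```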
https://github.com/user-attachments/assets/26d4d524-bdf0-48a5-9d33-ce19fa7779fb
DVD achieves state-of-the-art performance by a large margin on multiple long-video benchmarks using OpenAI o3.
The core design of DVD includes:
```shell
git clone https://github.com/microsoft/deepvideodiscovery.git
cd DeepVideoDiscovery
pip install -r requirements.txt
pip install gradio
```
Note: Set up your configuration by updating the variables in config.py.
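The exact contents of `config.py` are not reproduced here; a hypothetical sketch of the kind of variables you might need to set (all names below are assumptions, not the actual ones) could look like:

```python
# config.py -- hypothetical sketch; the real variable names in this
# repository's config.py may differ. Check the file before running.
OPENAI_API_KEY = "sk-..."   # credential for the LLM backend (assumption)
MODEL_NAME = "o3"           # reasoning model used by the agent (assumption)
LITE_MODE = False           # subtitles-only mode described above (assumption)
```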
The `local_run.py` script provides an example of how to run the Deep Video Discovery agent by providing a YouTube URL and a question about the video.
```shell
python local_run.py https://www.youtube.com/watch?v=PQFQ-3d2J-8 "what did the main speaker talk about in the last part of video?"
```
Compared to the original implementation, we have made the following changes:

- `global_browse_tool`: we leverage the textual descriptions (rather than the original video pixels) of multiple video clips to provide a global overview of the video content, improving efficiency.

If you find our work useful, please consider citing:
```bibtex
@article{zhang2025deep,
  title={Deep Video Discovery: Agentic Search with Tool Use for Long-form Video Understanding},
  author={Zhang, Xiaoyi and Jia, Zhaoyang and Guo, Zongyu and Li, Jiahao and Li, Bin and Li, Houqiang and Lu, Yan},
  journal={arXiv preprint arXiv:2505.18079},
  year={2025}
}
```
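The `global_browse_tool` change described above, browsing textual clip descriptions instead of raw pixels, can be sketched as follows (a minimal illustration with assumed names, not the repository's actual tool):

```python
def global_browse(clip_descriptions: dict[int, str]) -> str:
    """Build a global overview of a video from per-clip textual
    descriptions instead of raw video pixels.

    A sketch of the idea only; the real tool's interface and
    description format are assumptions here.
    """
    # Order clips by index so the overview follows the video timeline.
    parts = [f"[clip {i}] {desc}" for i, desc in sorted(clip_descriptions.items())]
    return "\n".join(parts)

overview = global_browse({
    0: "Host introduces the guests.",
    1: "Discussion of long-video benchmarks.",
})
print(overview)
```

Feeding this compact text overview to the LLM avoids re-processing pixels for every global query, which is where the efficiency gain comes from.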