Academic Journal

LMVD: A large-scale multimodal vlog dataset for depression detection in the wild

Bibliographic Details
Title: LMVD: A large-scale multimodal vlog dataset for depression detection in the wild
Authors: He, Lang; Chen, Kai; Zhao, Junnan; Wang, Yimeng; Pei, Ercheng; Chen, Haifeng; Jiang, Jiewei; Zhang, Shiqing; Zhang, Jie; Wang, Zhongmin; He, Tao; Tiwari, Prayag, 1991
Source: Information Fusion. 126(B):1-11
Subject Terms: Deep Learning, Depression Detection, Multimodal, Transformer, Vlog, Behavioral Research, Data Privacy, Human Computer Interaction, Interactive Computer Systems, Large Datasets, Learning Systems, Multimedia Systems, Academic Achievements, Large-scales, Multi-modal, Multiple Dimensions, Overall Quality, Quality Of Life
Description: Depression profoundly impacts multiple dimensions of an individual's life, including personal and social functioning, academic achievement, occupational productivity, and overall quality of life. With recent advancements in affective computing, deep learning technologies have been increasingly adopted to identify patterns indicative of depression. However, due to concerns over participant privacy, data in this domain remain scarce, posing significant challenges for the development of robust discriminative models for depression detection. To address this limitation, we build a Large-scale Multimodal Vlog Dataset (LMVD) for depression recognition in real-world settings. The LMVD dataset comprises 1,823 video samples, totaling approximately 214 h of content, collected from 1,475 participants across four major multimedia platforms: Sina Weibo, Bilibili, TikTok, and YouTube. In addition, we introduce a novel architecture, MDDformer, specifically designed to capture non-verbal behavioral cues associated with depressive states. Extensive experimental evaluations conducted on LMVD demonstrate the superior performance of MDDformer in depression detection tasks. We anticipate that LMVD will become a valuable benchmark resource for the research community, facilitating progress in multimodal, real-world depression recognition. The dataset and source code will be made publicly available at: https://github.com/helang818/LMVD. © 2025 Elsevier B.V., All rights reserved.
File Description: print
Access URL: https://urn.kb.se/resolve?urn=urn:nbn:se:hh:diva-57350
https://doi.org/10.1016/j.inffus.2025.103632
Database: SwePub
ISSN: 1566-2535
1872-6305
DOI: 10.1016/j.inffus.2025.103632