PDMR 2025: Privacy-Preserved Multimodal Depression Recognition Challenge
Depression has become a major global public health concern, underscoring the urgent need for effective and reliable automatic recognition methods. However, in real-world scenarios, patients are often reluctant to disclose sensitive personal information, making privacy-preserving depression recognition particularly important.
The Privacy-Preserved Multimodal Depression Recognition Challenge (PDMR 2025) invites researchers and developers worldwide to advance multimodal learning techniques for depression recognition. Participants will tackle the task using audio and video data. To protect privacy, facial regions in the video recordings are deliberately masked, requiring teams to move beyond traditional facial expression analysis and instead explore innovative strategies that leverage voice characteristics, body movements, and temporal dynamics.
This competition emphasizes challenges such as privacy protection, missing modalities, cross-modal fusion, and robustness, encouraging participants to propose solutions that are not only technically effective but also ethically sound. Ultimately, PDMR 2025 aims to foster the development of depression recognition systems that are more generalizable, interpretable, and practically applicable.
Organizing Institutions
Artificial Intelligence Research Institute
Shenzhen MSU-BIT University
Guangdong-Hong Kong-Macao Joint Laboratory of Emotional Intelligence and Pervasive Computing
Important Dates
Registration Deadline
Complete registration before this date
Dataset Release
Competition dataset will be available
Results Submission
Submit your competition results
Results Announcement
Winners announced at CloudCom 2025, Shenzhen, China
Awards & Prizes
Dataset Information
Video Data
The raw video data is processed using OpenFace to extract N-dimensional features. The data is stored in .csv files with a shape of [T, N], where:
- T denotes the number of frames
- N denotes the number of features
Audio Data
The raw audio is first processed with log-Mel filters to extract features of shape [T, 48], and then compressed into a fixed-length vector of [1, 768] using NetVLAD with 16 cluster centers. The data is stored in .csv files with a shape of [1, 768].
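The dimensions above are consistent: NetVLAD with 16 cluster centers over 48-dimensional log-Mel frames yields 16 × 48 = 768 values. The exact NetVLAD variant used to produce the released features is not specified; the following is a minimal numpy sketch of the aggregation step under common assumptions (soft assignment via exponentiated negative squared distances, intra-cluster and global L2 normalization; the cluster centers and `alpha` here are illustrative placeholders, not the challenge's actual parameters):

```python
import numpy as np

def netvlad_pool(features, centers, alpha=1.0):
    """Aggregate [T, 48] log-Mel frames into a [1, 768] vector (NetVLAD sketch).

    features: [T, D] frame features (D = 48)
    centers:  [K, D] cluster centers (K = 16), so K * D = 768
    alpha:    softness of the cluster assignment (illustrative value)
    """
    # Soft-assignment of each frame to each cluster (T x K),
    # shifted by the row minimum for numerical stability.
    dists = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    assign = np.exp(-alpha * (dists - dists.min(axis=1, keepdims=True)))
    assign /= assign.sum(axis=1, keepdims=True)

    # Accumulate assignment-weighted residuals per cluster, then flatten.
    residuals = features[:, None, :] - centers[None, :, :]       # T x K x D
    vlad = (assign[:, :, None] * residuals).sum(axis=0)          # K x D
    vlad /= np.linalg.norm(vlad, axis=1, keepdims=True) + 1e-12  # intra-norm
    vlad = vlad.reshape(1, -1)                                   # 1 x (K*D)
    return vlad / (np.linalg.norm(vlad) + 1e-12)                 # global L2

# Toy example with random data (T = 500 frames).
rng = np.random.default_rng(0)
feats = rng.normal(size=(500, 48))
centers = rng.normal(size=(16, 48))
print(netvlad_pool(feats, centers).shape)  # (1, 768)
```

In practice participants consume the precomputed [1, 768] vectors from the .csv files directly; the sketch only illustrates how a variable-length sequence collapses to that fixed length.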
Data Structure
train/
├── video/
│ ├── 001.csv
│ ├── 002.csv
│ └── ...
└── audio/
├── 001.csv
├── 002.csv
└── ...
test/
├── video/
└── audio/
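Given this layout, each subject's two modalities share an ID and can be loaded as a pair. A minimal loader sketch (the `root` path and the assumption that the .csv files carry no header row are hypothetical; adjust `header=` if the released files include column names such as OpenFace feature headers):

```python
import os

import pandas as pd

def load_subject(root, subject_id):
    """Load one subject's paired modalities from a 'train' or 'test' split.

    root:       path to the split directory, e.g. 'train' (placeholder)
    subject_id: zero-padded ID such as '001'
    Returns (video, audio): video is [T, N], audio is [1, 768].
    """
    video = pd.read_csv(os.path.join(root, "video", f"{subject_id}.csv"),
                        header=None).to_numpy()
    audio = pd.read_csv(os.path.join(root, "audio", f"{subject_id}.csv"),
                        header=None).to_numpy()
    return video, audio
```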
Competition Task
Task Overview
The goal of this binary classification task is to use artificial intelligence algorithms to distinguish healthy subjects from patients with depression based on the provided multimodal data.
Input
A multimodal data fragment for one subject (a .csv file containing a 128×500 matrix)
Output
The predicted depression status ("Depression" or "Normal") of the subject to whom the data segment belongs
Output Format
{"data_id": "001", "status": "Depression"}
{"data_id": "002", "status": "Normal"}
{"data_id": "003", "status": "Normal"}
{"data_id": "004", "status": "Depression"}
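A submission file in this format can be produced as follows; this is a minimal sketch, and the output filename `results.jsonl` and the `pred_by_id` mapping are illustrative placeholders:

```python
import json

def write_predictions(pred_by_id, path):
    """Write predictions as JSON Lines: one record per data segment.

    pred_by_id: dict mapping data_id (e.g. '001') to 'Depression' or 'Normal'
    """
    with open(path, "w", encoding="utf-8") as f:
        for data_id in sorted(pred_by_id):
            record = {"data_id": data_id, "status": pred_by_id[data_id]}
            f.write(json.dumps(record) + "\n")

# Illustrative predictions only.
write_predictions({"001": "Depression", "002": "Normal"}, "results.jsonl")
```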
Evaluation
Participants are required to complete all the tasks and submit a JSONL file of test results for each task. Submissions will be evaluated by F1 score.
The F1 score combines precision and recall:
- Precision is the ratio of true positives (TP) to all predicted positives (TP + FP)
- Recall is the ratio of true positives to all actual positives (TP + FN)
- F1 is their harmonic mean: F1 = 2 × Precision × Recall / (Precision + Recall)
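The metric above can be checked locally with a short sketch (treating "Depression" as the positive class; whether the organizers macro-average or score the positive class only is an assumption here, so verify against the official scoring script):

```python
def f1_score(y_true, y_pred, positive="Depression"):
    """F1 for the positive class: 2 * P * R / (P + R).

    Assumes 'Depression' is the positive label (an assumption,
    not confirmed by the challenge description).
    """
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    return 2 * precision * recall / denom if denom else 0.0

# Toy example: 1 TP, 1 FP, 1 FN -> P = R = 0.5 -> F1 = 0.5
truth = ["Depression", "Normal", "Depression", "Normal"]
preds = ["Depression", "Depression", "Normal", "Normal"]
print(f1_score(truth, preds))  # 0.5
```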