Emotional Intelligence Challenge for LLMs in Long-Context Interaction

Registration is open. Registration Deadline: September 28, 2025

Large language models (LLMs) have made significant progress in both Emotional Intelligence (EI) and long-context understanding. To further realize the potential of these models, we are organizing the Emotional Intelligence Challenge for LLMs in Long-Context Interaction. The competition is based on the LongEmotion benchmark and aims to rigorously evaluate models' emotional intelligence across multiple dimensions within extended textual contexts.

Organizing Institutions

Artificial Intelligence Research Institute

Shenzhen MSU-BIT University

Guangdong-Hong Kong-Macao Joint Laboratory of Emotional Intelligence and Pervasive Computing

Special Session Chair

Mohamed Jamal Deen

Important Dates

September 28, 2025

Registration Deadline

Complete registration before this date

September 28, 2025

Test Set Release

Competition test set will be available

October 31, 2025

Results Submission

Submit your competition results

November 14-16, 2025

Results Announcement

Winners announced at CloudCom 2025, Shenzhen, China

Awards & Prizes

1st Place: ¥20,000 + Gold Medal
2nd Place: ¥10,000 + Silver Medal
3rd Place: ¥10,000 + Bronze Medal

Competition Tasks

Emotion Classification (EC)

This task requires the model to identify the emotional category of a target entity within long-context texts that contain lengthy spans of context-independent noise. Model performance is evaluated based on the consistency between the predicted label and the ground truth.

Weight: 20%

Emotion Detection (ED)

The model is given N+1 emotional segments. Among them, N segments express the same emotion, while one segment expresses a unique emotion. The model is required to identify the single distinctive emotional segment. During evaluation, the model’s score depends on whether the predicted index matches the ground-truth index.

Weight: 20%

Emotion QA (QA)

In this task, the model is required to answer questions grounded in long-context psychological literature. Model performance is evaluated using the F1 score between its responses and the ground truth answers.

Weight: 20%
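For reference, QA-style F1 is commonly computed as token-level overlap between the predicted and reference answers. The sketch below shows one such computation; the official scorer's tokenization and normalization may differ.

```python
from collections import Counter

def token_f1(prediction: str, reference: str) -> float:
    """Token-level F1 between a predicted answer and a reference answer.

    Illustrative sketch only: lowercases and splits on whitespace;
    the competition's official evaluation script may normalize differently.
    """
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    # If either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)
    # Count overlapping tokens (multiset intersection).
    common = Counter(pred_tokens) & Counter(ref_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

An exact match yields 1.0, a disjoint answer yields 0.0, and partial overlap falls in between.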

Emotion Conversation (MC)

The model acts as a psychological counselor, offering empathetic and context-aware emotional support. Its responses are evaluated using stage-specific criteria based on professional counseling standards, with GPT-4o ensuring consistent and scalable scoring.

Weight: 20%

Emotion Summary (ES)

In this task, the model is required to summarize the following aspects from long-context psychological pathology reports: (1) causes, (2) symptoms, (3) treatment process, (4) illness characteristics, and (5) treatment effects. After the model generates its response, GPT-4o evaluates the response's factual consistency, completeness, and clarity with respect to the reference answer.

Weight: 20%

Scoring System

Overall Score = (EC × 0.2) + (ED × 0.2) + (QA × 0.2) + (MC × 20 × 0.2) + (ES × 20 × 0.2)
Note: MC and ES are displayed as 5-point scores but converted to 100-point scale in the calculation.
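The weighted formula above can be sketched as follows (function and variable names are ours, not part of the official scoring code):

```python
def overall_score(ec: float, ed: float, qa: float, mc: float, es: float) -> float:
    """Combine per-task scores into the overall competition score.

    ec, ed, qa are on a 100-point scale; mc and es are on a 5-point scale
    and are converted to the 100-point scale by the x20 factor, per the
    note above. Each task carries a 20% weight.
    """
    return 0.2 * ec + 0.2 * ed + 0.2 * qa + 0.2 * (mc * 20) + 0.2 * (es * 20)
```

For example, a team scoring 80/70/60 on EC/ED/QA and 4.0/3.5 on MC/ES would receive 0.2·80 + 0.2·70 + 0.2·60 + 0.2·80 + 0.2·70 = 72 overall.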

Participation Requirements

It is strictly prohibited to use the test set or its data sources directly during training.
Participants are required to submit a technical report that includes detailed information about the training and inference processes.
Participants must submit the GitHub link to their code (including both training and evaluation), the Hugging Face link to their final inference model, and the test result JSONL file.
Participants are required to complete all the tasks and submit the test result JSONL files for each task.
Registration Deadline: September 28, 2025. Please complete the registration form via the link above.