flight-info-change-notice Case #12

Hard | Domain: Calendar & Task Mgmt | airline, email, todolist

User Instruction

Please check my company email (http://localhost:5174/, open it in browser) to see if I have received any recent emails from GKD Airlines. If there are any notifications regarding flight changes, please check if the changes affect my schedule (you can view my schedule in http://localhost:3000/). If a conflict arises, you can send an email to remind my fellow travelers. Thank you.

Task Description

EN: Check the inbox for flight status change notifications from the airline; if found, check the calendar for affected plans and notify co-travelers

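ZH (translated): Have OpenClaw check the mailbox for any airline notification about a status change to a booked flight; if there is one, check the calendar app for affected plans and remind the fellow travelers.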
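The "check the calendar for affected plans" step comes down to an interval-overlap test between the rescheduled flight and each calendar entry. The following is a minimal sketch under stated assumptions, not the benchmark's own logic: the `Event` type, the `find_conflicts` helper, and the sample dates are all hypothetical, and the real calendar and email services may expose their data differently.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Event:
    # Hypothetical calendar entry; the real calendar app may use other fields.
    title: str
    start: datetime
    end: datetime


def find_conflicts(flight_start, flight_end, events):
    """Return the events whose time span overlaps the rescheduled flight."""
    # Two intervals overlap when each one starts before the other ends.
    return [e for e in events if e.start < flight_end and flight_start < e.end]


# Made-up sample data: the flight now occupies 09:30 to 12:00 on 10 January.
events = [
    Event("Client meeting", datetime(2025, 1, 10, 9, 0), datetime(2025, 1, 10, 10, 0)),
    Event("Team dinner", datetime(2025, 1, 10, 19, 0), datetime(2025, 1, 10, 20, 0)),
]
conflicts = find_conflicts(
    datetime(2025, 1, 10, 9, 30), datetime(2025, 1, 10, 12, 0), events
)
```

Only when `conflicts` is non-empty would the final step, emailing the co-travelers, apply.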

Complexity Factors

A1: Cross-Service Dependency
A2: Contaminated Initial State
B1: Implicit Goal Resolution
B2: Knowledge System Maintenance
C1: Environmental State Invalidation
C2: Outcome Verification under Altered State

Evaluation

Verifier Type: verify.py
Partial Credit: Yes
Reward Range: 0 – 1
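
One common way to get partial credit in the 0 to 1 range is to score a run as the fraction of independent checks it passes. The sketch below illustrates that idea only; it is not the contents of verify.py, and the function name and check names are hypothetical.

```python
def partial_credit(checks):
    """Score a run as the fraction of passed checks, a value in [0, 1]."""
    if not checks:
        return 0.0
    passed = sum(1 for ok in checks.values() if ok)
    return passed / len(checks)


# Hypothetical check names; the real verifier may test different conditions.
score = partial_credit({
    "found_airline_email": True,
    "detected_calendar_conflict": False,
    "notified_co_travelers": False,
})
```

Under this scheme a run that completes one of three checks scores one third, and a run that passes none scores 0.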

Results for This Task

Model                Avg Score   Attempts   All Passed
deepseek-v4-flash    0.333       3
deepseek-v4-pro      0.333       3
gpt-5.5              0.333       3
qwen3.6-flash        0.333       3
qwen3.5-27b          0           3
qwen3.5-flash        0           3
qwen3.5-397b-a17b    0           3
qwen3.6-27b          0           3
qwen3.6-plus         0           3

Public Trajectories

Run trajectories for this task live on HuggingFace.