User Instruction
Please check my company email (http://localhost:5174/, open it in browser) to see if I have received any recent emails from GKD Airlines. If there are any notifications regarding flight changes, please check if the changes affect my schedule (you can view my schedule in http://localhost:3000/). If a conflict arises, you can send an email to remind my fellow travelers. Thank you.
Task Description
Have OpenClaw check the inbox for airline notifications about status changes to booked flights; if any are found, check the calendar app for affected plans and notify co-travelers.
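The conflict check at the heart of the task described above can be sketched as an interval-overlap test. This is a minimal illustration only, not the benchmark's verifier or any agent's actual method: the `Interval` type, the helper names, and all dates and event labels below are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Interval:
    """A labelled time span (hypothetical structure, for illustration only)."""
    label: str
    start: datetime
    end: datetime


def overlaps(a: Interval, b: Interval) -> bool:
    # Treat spans as half-open [start, end): they overlap when each one
    # starts before the other ends. Spans that merely touch do not overlap.
    return a.start < b.end and b.start < a.end


def find_conflicts(flight: Interval, events: list[Interval]) -> list[Interval]:
    # Return every calendar event that intersects the rescheduled flight.
    return [event for event in events if overlaps(flight, event)]


# Hypothetical data: a rescheduled flight and two calendar entries.
new_flight = Interval(
    "rescheduled flight",
    datetime(2025, 1, 10, 14, 0),
    datetime(2025, 1, 10, 17, 30),
)
calendar = [
    Interval("client meeting", datetime(2025, 1, 10, 16, 0), datetime(2025, 1, 10, 17, 0)),
    Interval("team dinner", datetime(2025, 1, 10, 19, 0), datetime(2025, 1, 10, 21, 0)),
]

conflicts = find_conflicts(new_flight, calendar)
print([c.label for c in conflicts])
```

In this sketch, a non-empty `conflicts` list would be the trigger for sending the reminder email to fellow travelers.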
Complexity Factors
| Code | Applies | Factor |
|---|---|---|
| A1 | ✓ | Cross-Service Dependency |
| A2 | ✗ | Contaminated Initial State |
| B1 | ✓ | Implicit Goal Resolution |
| B2 | ✗ | Knowledge System Maintenance |
| C1 | ✗ | Environmental State Invalidation |
| C2 | ✗ | Outcome Verification under Altered State |
Evaluation
Verifier Type: verify.py
Partial Credit: Yes
Reward Range: 0 – 1
Results for This Task
| Model | Avg Score | Attempts | All Passed |
|---|---|---|---|
| deepseek-v4-flash | 0.333 | 3 | ✗ |
| deepseek-v4-pro | 0.333 | 3 | ✗ |
| gpt-5.5 | 0.333 | 3 | ✗ |
| qwen3.6-flash | 0.333 | 3 | ✗ |
| qwen3.5-27b | 0 | 3 | ✗ |
| qwen3.5-flash | 0 | 3 | ✗ |
| qwen3.5-397b-a17b | 0 | 3 | ✗ |
| qwen3.6-27b | 0 | 3 | ✗ |
| qwen3.6-plus | 0 | 3 | ✗ |
Public Trajectories
Run trajectories for this task live on HuggingFace.