User Instruction
I recently booked a flight and received a booking notification in my company email system (http://localhost:5174/, open it in browser). Please help me complete the check-in and seat selection. I'd like a window seat.
Task Description
Have OpenClaw help the user check in and select a seat for a specified booked flight. The agent must first retrieve the booking details from email, then select a seat on the airline website according to the user's request. When the user's seat requirement cannot be met, the agent is expected to find the best alternative.
Complexity Factors
| ID | Factor | Applies |
|---|---|---|
| A1 | Cross-Service Dependency | ✓ |
| A2 | Contaminated Initial State | ✗ |
| B1 | Implicit Goal Resolution | ✓ |
| B2 | Knowledge System Maintenance | ✗ |
| C1 | Environmental State Invalidation | ✗ |
| C2 | Outcome Verification under Altered State | ✗ |
Evaluation
Verifier Type: verify.py
Partial Credit: Yes
Reward Range: 0 – 1
Results for This Task
| Model | Avg Score | Attempts | All Passed |
|---|---|---|---|
| deepseek-v4-flash | 1 | 3 | ✓ |
| deepseek-v4-pro | 1 | 3 | ✓ |
| qwen3.6-27b | 0.667 | 3 | ✗ |
| qwen3.5-27b | 0.333 | 3 | ✗ |
| qwen3.5-397b-a17b | 0.333 | 3 | ✗ |
| gpt-5.5 | 0 | 3 | ✗ |
| qwen3.5-flash | 0 | 3 | ✗ |
| qwen3.6-flash | 0 | 3 | ✗ |
| qwen3.6-plus | 0 | 3 | ✗ |
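The Avg Score and All Passed columns can be read as summaries over each model's per-attempt rewards. The sketch below shows one plausible way to derive them. It is not taken from verify.py: the function name `summarize`, the `PASS_THRESHOLD` constant, the rule that an attempt passes only at the maximum reward of 1, and the three-decimal rounding are all assumptions made for illustration, and the sample rewards are hypothetical.

```python
from statistics import mean

# Assumption: an attempt counts as "passed" only at the top of the 0-1 reward range.
PASS_THRESHOLD = 1.0


def summarize(rewards):
    """Return (average score, all-passed flag) for one model's attempts.

    `rewards` holds one value per attempt, each within the 0-1 reward range.
    Partial credit means a value may fall strictly between 0 and 1.
    """
    if not rewards:
        raise ValueError("at least one attempt is required")
    values = [float(r) for r in rewards]
    for value in values:
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"reward {value} is outside the 0-1 range")
    average = round(mean(values), 3)  # rounding to 3 decimals is an assumption
    all_passed = all(value >= PASS_THRESHOLD for value in values)
    return average, all_passed


# Hypothetical rewards for three attempts, not actual benchmark data.
print(summarize([1.0, 1.0, 0.0]))  # (0.667, False)
```

Because partial credit is enabled, an average such as 0.667 does not by itself reveal how many attempts fully passed; several different reward combinations yield the same mean.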
Public Trajectories
Run trajectories for this task live on HuggingFace.