flight-seat-selection Case #9

Difficulty: Medium
Domain: E-commerce & Daily Svcs
Tags: airline, email

User Instruction

I recently booked a flight and received a booking notification in my company email system (http://localhost:5174/, open it in browser). Please help me complete the check-in and seat selection. I'd like a window seat.

Task Description

Help the user check in and select a seat for a specified booked flight. The agent (OpenClaw) must first retrieve the booking details from the email, then select a seat on the airline website according to the user's requirements.

Complexity Factors

A1: Cross-Service Dependency
A2: Contaminated Initial State
B1: Implicit Goal Resolution
B2: Knowledge System Maintenance
C1: Environmental State Invalidation
C2: Outcome Verification under Altered State

Evaluation

Verifier Type: verify.py
Partial Credit: Yes
Reward Range: 0 – 1
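
The actual contents of verify.py are not shown here. As a rough illustration of how a verifier with partial credit could map individual checks onto a reward in the 0 – 1 range, here is a minimal sketch. The check names and weights below are hypothetical and are not taken from the real verifier.

```python
from dataclasses import dataclass


@dataclass
class Check:
    """One verification step. Names and weights are illustrative assumptions."""
    name: str
    weight: int
    passed: bool


def partial_credit(checks):
    """Return the weighted fraction of passed checks, clamped to [0, 1]."""
    total = sum(c.weight for c in checks)
    if total == 0:
        return 0.0
    earned = sum(c.weight for c in checks if c.passed)
    return min(1.0, max(0.0, earned / total))


# Hypothetical checks for this task: booking found, check-in done, window seat chosen.
example_checks = [
    Check("booking_details_retrieved", 3, True),
    Check("check_in_completed", 3, True),
    Check("window_seat_selected", 4, False),
]
example_reward = partial_credit(example_checks)
```

Under these assumed weights, an attempt that finds the booking and checks in but picks a non-window seat would earn a fractional reward rather than 0.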

Results for This Task

Model                Avg Score   Attempts   All Passed
deepseek-v4-pro      1           3
qwen3.6-plus         1           3
qwen3.6-27b          0.9         3
deepseek-v4-flash    0.667       3
qwen3.5-flash        0.667       3
qwen3.5-397b-a17b    0.667       3
qwen3.6-flash        0.667       3
qwen3.5-27b          0.333       3
gpt-5.5              0           3
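
The average scores are consistent with a mean over three attempts, rounded to three decimals. The sketch below assumes that reading, and the per-attempt scores in it are hypothetical because the individual rewards are not listed here. With all-or-nothing rewards, three attempts can only average 0, 0.333, 0.667, or 1, so an average of 0.9 implies at least one fractional reward, which fits the partial-credit setting.

```python
def average_score(attempt_scores):
    """Mean of per-attempt rewards, rounded to three decimals (assumed convention)."""
    if not attempt_scores:
        raise ValueError("at least one attempt is required")
    return round(sum(attempt_scores) / len(attempt_scores), 3)


# Hypothetical per-attempt rewards that reproduce averages seen in the table.
two_of_three = average_score([1, 1, 0])
one_of_three = average_score([1, 0, 0])
```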

Public Trajectories

Run trajectories for this task are available on Hugging Face.
