flight-booking Case #8

Medium | Domain: E-commerce & Daily Services | airline

User Instruction

Please help me book an economy class flight from JFK to LAX next Monday on the GKD airline website (http://localhost:5173/, open it in browser). My date of birth is 1985-12-11. You're already signed in to my account on that site — go straight to the search/booking flow rather than registering a new account. Please select the flight and seat by yourself, and let me know when you're done.
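The instruction gives the travel date only as "next Monday", so the agent has to turn it into a concrete calendar date before searching. A minimal sketch of that step follows. It assumes "next Monday" means the first Monday strictly after the current day; the function name next_monday is hypothetical and is not part of the task or its verifier.

```python
from datetime import date, timedelta


def next_monday(today: date) -> date:
    """Return the first Monday strictly after `today` (assumed reading of "next Monday")."""
    # date.weekday() returns 0 for Monday through 6 for Sunday.
    days_ahead = (7 - today.weekday()) % 7
    if days_ahead == 0:
        # Today is already a Monday, so move a full week ahead.
        days_ahead = 7
    return today + timedelta(days=days_ahead)


if __name__ == "__main__":
    # ISO format (YYYY-MM-DD) matches the date-of-birth format used in the instruction.
    print(next_monday(date.today()).isoformat())
```

If the site treats a search made on a Monday as referring to that same day, the days_ahead == 0 branch would have to change; the instruction does not settle this.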

Task Description

Have OpenClaw help the user book a flight for a specified date and route.

Complexity Factors

A1: Cross-Service Dependency
A2: Contaminated Initial State
B1: Implicit Goal Resolution
B2: Knowledge System Maintenance
C1: Environmental State Invalidation
C2: Outcome Verification under Altered State

Evaluation

Verifier Type: verify.py
Partial Credit: Yes
Reward Range: 0 – 1
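
One way a verifier can award partial credit in the 0 to 1 range is to score the fraction of independent checks that pass. The sketch below illustrates only that idea. The check names and the equal weighting are assumptions made for illustration; the actual contents of verify.py are not shown here.

```python
from typing import Dict


def partial_credit(checks: Dict[str, bool]) -> float:
    """Return the fraction of passed checks, a value in [0, 1]."""
    if not checks:
        return 0.0
    # Each True counts as 1, so the sum is the number of passed checks.
    return sum(checks.values()) / len(checks)


# Hypothetical check names, for illustration only (not taken from verify.py).
example_checks = {
    "booking_exists": True,
    "route_is_jfk_to_lax": True,
    "cabin_is_economy": True,
    "date_is_next_monday": False,
    "seat_selected": True,
}
score = partial_credit(example_checks)
```

Under this assumed scheme, an attempt that books the right route and cabin but picks the wrong date would still earn most of the reward rather than zero.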

Results for This Task

Model              Avg Score  Attempts  All Passed
qwen3.6-27b        1          3
deepseek-v4-pro    0.867      3
deepseek-v4-flash  0.8        3
qwen3.5-397b-a17b  0.8        3
qwen3.6-plus       0.8        3
qwen3.5-27b        0.667      3
gpt-5.5            0.533      3
qwen3.6-flash      0.533      3
qwen3.5-flash      0.267      3

Public Trajectories

Run trajectories for this task are available on HuggingFace.
