User Instruction
Are there any operations that can be automated recently, according to potential patterns within the interaction history in /workspace/environment/history.json? Please update the skills according to the pattern you discovered, and save the updated skills in /workspace/output.
Task Description
Supplement and update an existing SKILL based on interaction history.
Complexity Factors
| Code | Factor | Applies |
|---|---|---|
| A1 | Cross-Service Dependency | ✗ |
| A2 | Contaminated Initial State | ✗ |
| B1 | Implicit Goal Resolution | ✗ |
| B2 | Knowledge System Maintenance | ✓ |
| C1 | Environmental State Invalidation | ✗ |
| C2 | Outcome Verification under Altered State | ✗ |
Evaluation
Verifier Type: evaluate.py
Partial Credit: Yes
Reward Range: 0 – 1
Results for This Task
| Model | Avg Score | Attempts | All Passed |
|---|---|---|---|
| gpt-5.5 | 1 | 3 | ✓ |
| deepseek-v4-pro | 0.889 | 3 | ✗ |
| qwen3.6-27b | 0.889 | 3 | ✗ |
| qwen3.6-flash | 0.556 | 3 | ✗ |
| qwen3.6-plus | 0.444 | 3 | ✗ |
| deepseek-v4-flash | 0.222 | 3 | ✗ |
| qwen3.5-27b | 0.222 | 3 | ✗ |
| qwen3.5-flash | 0.222 | 3 | ✗ |
| qwen3.5-397b-a17b | 0 | 3 | ✗ |
Public Trajectories
Run trajectories for this task live on HuggingFace.
Source Files
task.toml: tasks/skill-supplementation/task.toml
instruction: tasks/skill-supplementation/instruction.md
environment: tasks/skill-supplementation/environment/Dockerfile