User Instruction
The bottom-level skills csv-parser, column-calculator, and stats-aggregator have been updated. Please review their changes and update all parent SKILL.md files (data-loader, data-transformer, report-renderer, and the top-level report-generator-pipeline) to reflect the new capabilities, removed features, and changed output schemas.
Task Description
After a user modifies a lower-level skill, the agent must accurately identify which higher-level skills call it and update those higher-level skills so that their calls remain consistent with the modified skill.
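The dependency-identification step can be sketched as an upward walk over a call graph. This is a minimal illustration, not the benchmark's own method: the `CALLS` map and the `affected_parents` helper are hypothetical, and the one-to-one pairing of each parent with a bottom-level skill is an assumption drawn only from the order in which the instruction lists them.

```python
from collections import deque

# Hypothetical call map (parent skill -> skills it calls).
# The pairing below is an assumption; the task page does not state it.
CALLS = {
    "report-generator-pipeline": ["data-loader", "data-transformer", "report-renderer"],
    "data-loader": ["csv-parser"],
    "data-transformer": ["column-calculator"],
    "report-renderer": ["stats-aggregator"],
}


def affected_parents(calls, changed):
    """Return every skill that directly or transitively calls a changed skill.

    Results are ordered nearest parents first, so each SKILL.md can be
    updated before the skills that depend on it.
    """
    # Invert the map: child skill -> set of skills that call it.
    callers = {}
    for parent, children in calls.items():
        for child in children:
            callers.setdefault(child, set()).add(parent)

    seen = set()
    order = []
    queue = deque(changed)
    # Breadth-first walk upward from the changed skills.
    while queue:
        skill = queue.popleft()
        for parent in sorted(callers.get(skill, ())):
            if parent not in seen:
                seen.add(parent)
                order.append(parent)
                queue.append(parent)
    return order


if __name__ == "__main__":
    changed = ["csv-parser", "column-calculator", "stats-aggregator"]
    for name in affected_parents(CALLS, changed):
        print(name)
```

Under the assumed map, the three direct parents come out before the top-level pipeline, which is the order in which their SKILL.md files would need revisiting.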
Complexity Factors
| Code | Factor | Applies |
|---|---|---|
| A1 | Cross-Service Dependency | ✗ |
| A2 | Contaminated Initial State | ✗ |
| B1 | Implicit Goal Resolution | ✗ |
| B2 | Knowledge System Maintenance | ✓ |
| C1 | Environmental State Invalidation | ✗ |
| C2 | Outcome Verification under Altered State | ✗ |
Evaluation
Verifier Type: evaluate.py
Partial Credit: Yes
Reward Range: 0 – 1
Results for This Task
| Model | Avg Score | Attempts | All Passed |
|---|---|---|---|
| deepseek-v4-flash | 1 | 3 | ✓ |
| gpt-5.5 | 1 | 3 | ✓ |
| qwen3.5-397b-a17b | 1 | 3 | ✓ |
| qwen3.6-27b | 1 | 3 | ✓ |
| qwen3.6-flash | 1 | 3 | ✓ |
| qwen3.6-plus | 1 | 3 | ✓ |
| qwen3.5-flash | 0.637 | 3 | ✗ |
| qwen3.5-27b | 0.513 | 3 | ✗ |
| deepseek-v4-pro | 0.38 | 3 | ✗ |
Public Trajectories
Run trajectories for this task live on HuggingFace.
Source Files
task.toml: tasks/skill-dependency-fix/task.toml
instruction: tasks/skill-dependency-fix/instruction.md
environment: tasks/skill-dependency-fix/environment/Dockerfile