skill-dependency-fix Case #5

Difficulty: Easy
Domain: Documents & Knowledge

User Instruction

The bottom-level skills csv-parser, column-calculator, and stats-aggregator have been updated. Please review their changes and update all parent SKILL.md files (data-loader, data-transformer, report-renderer, and the top-level report-generator-pipeline) to reflect the new capabilities, removed features, and changed output schemas.

Task Description

After a user modifies a lower-level skill, the agent must accurately identify which higher-level skills call that skill, then update those higher-level skills so that their calls to the lower-level skill are consistent with the changes.
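The dependency-identification step described above can be sketched as a reverse traversal of a skill dependency graph. This is a minimal illustration, not the benchmark's method: the task names the skills but does not state which parent calls which child, so the edges in `DEPENDS_ON` and the helper `affected_parents` are assumptions made for the example.

```python
from collections import deque

# Hypothetical dependency map: parent skill -> skills it calls directly.
# The task lists these skill names, but the edges below are assumed.
DEPENDS_ON = {
    "data-loader": ["csv-parser"],
    "data-transformer": ["column-calculator"],
    "report-renderer": ["stats-aggregator"],
    "report-generator-pipeline": [
        "data-loader",
        "data-transformer",
        "report-renderer",
    ],
}


def affected_parents(changed, depends_on):
    """Return every skill that directly or transitively calls a changed skill.

    Results are in breadth-first order, so direct parents come before
    the skills that depend on those parents.
    """
    # Invert the map: child skill -> list of skills that call it.
    callers = {}
    for parent, children in depends_on.items():
        for child in children:
            callers.setdefault(child, []).append(parent)

    seen = set(changed)
    order = []
    queue = deque(changed)
    while queue:
        skill = queue.popleft()
        for parent in callers.get(skill, []):
            if parent not in seen:
                seen.add(parent)
                order.append(parent)
                queue.append(parent)
    return order


to_update = affected_parents(
    ["csv-parser", "column-calculator", "stats-aggregator"], DEPENDS_ON
)
```

Under these assumed edges, `to_update` lists the three mid-level skills first and the top-level pipeline last, which is a natural order for revising the SKILL.md files since the pipeline's documentation depends on what its children expose.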

Complexity Factors

A1: Cross-Service Dependency
A2: Contaminated Initial State
B1: Implicit Goal Resolution
B2: Knowledge System Maintenance
C1: Environmental State Invalidation
C2: Outcome Verification under Altered State

Evaluation

Verifier Type: evaluate.py
Partial Credit: Yes
Reward Range: 0 – 1

Results for This Task

Model               Avg Score   Attempts   All Passed
deepseek-v4-flash   1           3
gpt-5.5             1           3
qwen3.5-397b-a17b   1           3
qwen3.6-27b         1           3
qwen3.6-flash       1           3
qwen3.6-plus        1           3
qwen3.5-flash       0.637       3
qwen3.5-27b         0.513       3
deepseek-v4-pro     0.38        3

Public Trajectories

Run trajectories for this task live on HuggingFace.
