User Instruction
Your workspace notes about speculative decoding contain stale claims. Review the local corpus in `corpus/` and use the browser portal in `tools/` to find stronger evidence, then repair your notes and write `~/.openclaw/output/result.json`.
Task Description
The old durable notes contain obvious errors. Use the local materials and the internal browser portal to distinguish the key mechanisms of Adaptive Cache Bridging from incorrect shorthand.
Complexity Factors
| Code | Factor | Applies |
|---|---|---|
| A1 | Cross-Service Dependency | ✓ |
| A2 | Contaminated Initial State | ✗ |
| B1 | Implicit Goal Resolution | ✗ |
| B2 | Knowledge System Maintenance | ✓ |
| C1 | Environmental State Invalidation | ✗ |
| C2 | Outcome Verification under Altered State | ✗ |
Evaluation
Verifier Type: llm_judge.py
Partial Credit: Yes
Reward Range: 0 – 1
LLM Judge Task
This task uses an LLM-based judge for evaluation, which requires judge credentials to run.
Results for This Task
| Model | Avg Score | Attempts | All Passed |
|---|---|---|---|
| qwen3.5-27b | 0.979 | 3 | ✗ |
| qwen3.6-27b | 0.954 | 3 | ✗ |
| qwen3.6-plus | 0.892 | 3 | ✗ |
| qwen3.5-397b-a17b | 0.888 | 3 | ✗ |
| deepseek-v4-flash | 0.862 | 3 | ✗ |
| gpt-5.5 | 0.821 | 3 | ✗ |
| deepseek-v4-pro | 0.792 | 3 | ✗ |
| qwen3.6-flash | 0.763 | 3 | ✗ |
| qwen3.5-flash | 0.649 | 3 | ✗ |
Public Trajectories
Run trajectories for this task live on HuggingFace.
Source Files
task.toml: tasks/conflict-repair-acb/task.toml
instruction: tasks/conflict-repair-acb/instruction.md
environment: tasks/conflict-repair-acb/environment/Dockerfile