DAIR.AI · X post · 5 days ago
Stronger Models Do Not Need More Complex Harnesses
Stronger models do not always need lighter harnesses. The conventional view is that more structured harnesses universally improve reliability, and that higher-capability models need proportionally less structural guidance. Together, those two beliefs imply a clean inverse relationship between model tier and optimal harness complexity. This new research tests that assumption with a controlled 432-run experiment: six models across four capability tiers, crossed with three harness conditions, on a 24-task benchmark with git-based workspace verification. For a frontier chat model, increasing harness verbosity dropped success by 29 to 38 percentage points. The authors call it the harness-complexity paradox. Paper: https://arxiv.org/abs/2605.26731 Learn to build effective AI agents in our academy: https://academy.dair.ai/
  • Counterintuitive observation: increasing harness complexity reduces the performance of stronger models.
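
The run count quoted in the post can be sanity-checked with a short sketch. It assumes the 432 runs come from one run per (model, harness condition, task) cell, which the post does not state explicitly but which matches 6 × 3 × 24. The model, harness, and task labels below are hypothetical placeholders, since the post names none of them.

```python
from itertools import product

# Hypothetical placeholder labels: the post does not name the models,
# the harness conditions, or the individual tasks.
MODELS = [f"model_{i}" for i in range(1, 7)]           # six models
HARNESSES = ["harness_a", "harness_b", "harness_c"]    # three harness conditions
TASKS = [f"task_{i:02d}" for i in range(1, 25)]        # 24 benchmark tasks


def build_run_grid():
    """Return one (model, harness, task) tuple per cell of the full cross.

    Assumes a single run per cell; this is an assumption, not a detail
    confirmed by the post.
    """
    return list(product(MODELS, HARNESSES, TASKS))


runs = build_run_grid()
runs_per_model = sum(1 for model, _, _ in runs if model == MODELS[0])

print(len(runs))        # total runs in the grid
print(runs_per_model)   # runs contributed by a single model
```

Under that assumption, each model contributes 72 runs (three harness conditions times 24 tasks), so a per-model success rate for one harness condition would rest on 24 task outcomes.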