According to 1M AI News monitoring, researchers from Stanford, MIT, and the Korean game company KRAFTON have released Meta-Harness, an AI-driven methodology that automatically optimizes an agent's harness (the wrapper model and execution scaffold that drives agent actions, covering prompt design, tool calls, and context management). Unlike manually written harnesses, Meta-Harness has a coding agent read the code, execution logs, and scores of prior candidate harnesses, then automatically iterate to improve them.
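The iterative loop described above can be sketched roughly as follows. This is a hypothetical illustration, not the authors' actual implementation: the names (`Candidate`, `optimize_harness`) are invented, and the benchmark and coding agent are replaced by toy stubs.

```python
# Rough sketch of a Meta-Harness-style optimization loop (hypothetical
# names; the real system's API is not described in this article).
from dataclasses import dataclass

@dataclass
class Candidate:
    code: str           # source of the harness (prompts, tool wiring, ...)
    logs: str = ""      # execution logs from running it on the benchmark
    score: float = 0.0  # benchmark success rate

def optimize_harness(seed_code, evaluate, propose_revision, rounds=3):
    """Iteratively improve a harness: run it, score it, then ask a
    coding agent to rewrite it given the full history of candidates."""
    history = []
    current = Candidate(code=seed_code)
    for _ in range(rounds):
        current.logs, current.score = evaluate(current.code)
        history.append(current)
        # The agent sees code, logs, and scores of ALL prior candidates.
        current = Candidate(code=propose_revision(history))
    return max(history, key=lambda c: c.score)

# Toy stand-ins: the "benchmark" rewards longer harness code, and the
# "agent" simply appends a refinement note to the best candidate so far.
def toy_evaluate(code):
    return ("ran ok", min(len(code) / 100, 1.0))

def toy_agent(history):
    best = max(history, key=lambda c: c.score)
    return best.code + "\n# refine tool-call strategy"

best = optimize_harness("SYSTEM PROMPT: be careful", toy_evaluate, toy_agent)
print(best.score)
```

The key design point reported by the article is that the reviser sees the whole history (code, logs, and scores), not just the latest candidate, so each revision can learn from earlier failures.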
On the TerminalBench-2 terminal-operation benchmark, Meta-Harness raised Claude Haiku 4.5's success rate to 37.6%, surpassing Goose (35.5%) and Claude Code (27.5%) and ranking first among all reported Haiku 4.5 harnesses. With Claude Opus 4.6, it reached a 76.4% success rate, ranking second.
Junyang Lin, former technical lead of Qwen (Tongyi Qianwen), reposted the authors' announcement and commented: "'Model + harness' has already surpassed 'just the model.' An agent's performance is significantly shaped by the design and quality of its harness, and I truly believe this is the right direction." In a since-deleted long post published on March 27, Lin had already predicted that environment design would evolve from a side project into a genuine startup category. Meta-Harness backs this judgment with experimental data: with the same model, switching to an AI-optimized harness can produce performance differences of up to 10 percentage points.