Meta, Stanford survey highlights code as central layer for AI agents

A new survey paper from Meta, Stanford, and Illinois argues that AI agents perform more effectively when code serves as their primary operational layer, which helps with problems that arise on long tasks, such as losing state and mishandling errors. The paper centers on the “agent harness”: the tools, memory, and feedback mechanisms that allow an AI model to function as an agent. According to the survey, code lets agents work through executable steps, use program controls, and model their environments through artifacts such as tests and logs. The work reflects a broader research trend toward making AI agents more stable and reliable by grounding them in executable code environments.

Meta: Meta is a major technology company with significant investments in artificial intelligence research and development. It contributed to the survey paper titled ‘Code as Agent Harness,’ which examines how code can serve as the central environment for AI agents. The work highlights Meta’s ongoing focus on practical frameworks that improve agent reliability through executable systems.
Stanford: Stanford University is a leading academic institution known for its contributions to computer science and AI research. Researchers from Stanford co-authored the ‘Code as Agent Harness’ survey, emphasizing code’s role in enabling agents to reason, act, and model environments more robustly. The paper reflects Stanford’s involvement in advancing agent harness architectures that integrate tools, memory, and feedback mechanisms.
Code as Agent Harness: Code as Agent Harness is a survey paper co-authored by researchers from Meta, Stanford, and Illinois that proposes code as the primary working layer for AI agents. The paper introduces the concept of an agent harness encompassing tools, sandboxes, and feedback loops, with code positioned centrally so that an agent's work can be inspected, executed, and revised. It argues that this approach addresses the limitations of LLM-based agents in handling complex, long-horizon tasks.
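
To make the harness idea concrete, here is a minimal Python sketch of a loop that executes a candidate program, records each attempt in memory, and feeds test failures back for revision. It is an illustration under stated assumptions, not the paper's design: the names `Harness`, `solve`, `make_stub_model`, and `check_add` are hypothetical, the "model" is a stub that replays two fixed drafts, and `exec` stands in for a real sandbox without providing any isolation.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Harness:
    """Hypothetical harness: one tool (run code), a memory log, test feedback."""
    check: Callable[[dict], None]               # feedback: raises on failure
    memory: list = field(default_factory=list)  # record of every attempt

    def run(self, source: str) -> Optional[str]:
        """Execute candidate code; return an error message, or None on success."""
        namespace: dict = {}
        try:
            exec(source, namespace)  # stand-in for a sandbox; NOT isolated
            self.check(namespace)
            error = None
        except Exception as exc:
            error = f"{type(exc).__name__}: {exc}"
        self.memory.append({"source": source, "error": error})
        return error


def solve(harness: Harness, propose: Callable, max_steps: int = 5) -> Optional[str]:
    """Ask for a draft, run it, pass the error back, and repeat until tests pass."""
    feedback = None
    for _ in range(max_steps):
        source = propose(feedback)
        feedback = harness.run(source)
        if feedback is None:
            return source
    return None


def make_stub_model() -> Callable:
    """Stub in place of a language model: replays a buggy draft, then a fix."""
    drafts = iter([
        "def add(a, b): return a - b",
        "def add(a, b): return a + b",
    ])
    return lambda feedback: next(drafts)


def check_add(namespace: dict) -> None:
    """Test used as the feedback signal."""
    assert namespace["add"](2, 3) == 5, "add(2, 3) should be 5"


harness = Harness(check=check_add)
result = solve(harness, make_stub_model())
```

Here the first draft fails the test, the failure is logged in `harness.memory`, and the second draft passes. The point of the sketch is only that the attempt history and the error signal live in inspectable program state rather than in the model's context.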

AI Agent Frameworks: Recent research highlights the shift toward executable code environments to enhance AI agent stability and verifiability across diverse applications.
Research Collaboration: Academic and industry teams are increasingly publishing joint surveys on code-centric designs for turning language models into more reliable autonomous agents.