A new leader for open-source software development: AGENTLESS. Best in class on the SWE-bench Lite benchmark among open-source approaches, ahead of other open-source agents running on GPT-4 omni or Claude 3.5, like Aider, Devon, or SWE-Agent.
AGENTLESS challenges the prevailing notion that complex autonomous agents are necessary for automating software development tasks. It uses a simple two-phase process (localization, then repair) in which the model never has to decide on future actions or operate complex tools. This simplicity not only makes the process easier to understand and debug but also significantly lowers operational costs. The empirical results show that AGENTLESS outperforms all other open-source approaches on the SWE-bench Lite benchmark in both effectiveness and cost.
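To make the two-phase idea concrete, here is a minimal sketch of a localize-then-repair pipeline. It is an illustration under stated assumptions, not the authors' implementation: the function names (`localize`, `repair`, `fix_issue`), the prompts, and the `fake_llm` stand-in are all hypothetical, and the paper's actual method is more detailed than this.

```python
from typing import Callable, Dict

# Hypothetical type: any function that maps a prompt string to a completion string.
LLM = Callable[[str], str]

FILE_QUESTION = "Which file should be edited?"


def localize(issue: str, files: Dict[str, str], llm: LLM) -> str:
    """Phase 1: ask the model which file most likely needs the fix."""
    listing = "\n".join(sorted(files))
    answer = llm(f"Issue:\n{issue}\n\nFiles:\n{listing}\n\n{FILE_QUESTION}").strip()
    if answer not in files:
        raise ValueError(f"model named an unknown file: {answer!r}")
    return answer


def repair(issue: str, path: str, source: str, llm: LLM) -> str:
    """Phase 2: ask the model for a fixed version of the chosen file."""
    return llm(f"Issue:\n{issue}\n\nFile {path}:\n{source}\n\nReturn the fixed file.")


def fix_issue(issue: str, files: Dict[str, str], llm: LLM) -> Dict[str, str]:
    """Run both phases in a fixed order; the model never picks the next step."""
    path = localize(issue, files, llm)
    patched = dict(files)  # copy, so the input repository stays untouched
    patched[path] = repair(issue, path, files[path], llm)
    return patched


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call, so the sketch runs offline."""
    if FILE_QUESTION in prompt:
        return "calc.py"
    return "def add(a, b):\n    return a + b\n"


repo = {"calc.py": "def add(a, b):\n    return a - b\n", "README.md": "demo"}
result = fix_issue("add() subtracts instead of adding", repo, fake_llm)
print(result["calc.py"])
```

The point of the sketch is the control flow: the order of steps is hard-coded in `fix_issue`, so there is no agent loop, no tool selection, and each model call can be inspected on its own.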
The insights from AGENTLESS's performance suggest a paradigm shift in the development of software engineering tools, favoring simpler, more interpretable methods over complex autonomous systems. Such methods are easier to understand and maintain, and they offer significant cost savings and efficiency improvements. The research encourages further work on refining these simple approaches, suggesting that future advances could focus on improving the accuracy of the localization and repair steps and on new forms of integration with existing development environments.
All rights w/ authors:
AGENTLESS: Demystifying LLM-based Software Engineering Agents
arxiv.org/pdf/2407.01489
#airesearch
#newtech
#science
Jul 10, 2024