Reading List
1. LLM Agent Frameworks
- Junyu Luo, et al. Large Language Model Agent: A Survey on Methodology, Applications and Challenges. https://arxiv.org/abs/2503.21460, 2025.
- Shen et al. From Mind to Machine: The Rise of Manus AI as a Fully Autonomous Digital Agent. https://arxiv.org/abs/2505.02024, 2025.
2. Multi-Agent Systems
- Taicheng Guo, et al. Large Language Model Based Multi-Agents: A Survey of Progress and Challenges. https://arxiv.org/abs/2402.01680, 2024.
- Hong et al. MetaGPT: Meta-Programming for A Multi-Agent Collaborative Framework. https://arxiv.org/abs/2308.00352, 2024.
3. Human-Agent Collaboration
- Henry Peng Zou, et al. LLM-Based Human-Agent Collaboration and Interaction Systems: A Survey. https://arxiv.org/abs/2505.00753, 2025.
- Daoguang Zan, et al. CodeS: Natural Language to Code Repository via Multi-Layer Sketch. https://arxiv.org/abs/2403.16443, 2024.
4. Requirement Engineering
- Lezhi Ma, et al. SpecGen: Automated generation of formal program specifications via large language models. https://arxiv.org/abs/2401.08807, 2024.
- Dongming Jin, et al. MARE: multi-agents collaboration framework for requirements engineering. https://arxiv.org/abs/2405.03256, 2024.
5. Code Generation
- Juyong Jiang, et al. A survey on large language models for code generation. https://arxiv.org/abs/2406.00515, 2024.
- Zhang et al. CodeAgent: Enhancing Code Generation with Tool-Integrated Agent Systems for Real-World Repo-level Coding Challenges. https://arxiv.org/abs/2401.07339, 2024.
6. Static Code Checking
- Wang et al. A Contemporary Survey of Large Language Model Assisted Program Analysis. https://arxiv.org/abs/2502.18474, 2025.
- Li et al. LLM-Assisted Static Analysis for Detecting Security Vulnerabilities. https://arxiv.org/abs/2405.17238, 2024.
7. Testing
- Alshahwan, et al. Automated Unit Test Improvement using Large Language Models at Meta. https://arxiv.org/abs/2402.09171, 2024.
- Juan Altmayer Pizzorno and E. Berger. CoverUp: Coverage-guided LLM-based test generation. https://arxiv.org/abs/2403.16218, 2024.
8. Debugging
- Yihao Qin, et al. AgentFL: Scaling LLM-based fault localization to project-level context. https://arxiv.org/abs/2403.16362, 2024.
- Cheryl Lee, et al. A unified debugging approach via LLM-based multi-agent synergy. https://arxiv.org/abs/2404.17153, 2024.
9. Evaluations
- Carlos E. Jimenez, et al. SWE-bench: Can language models resolve real-world GitHub issues? https://arxiv.org/abs/2310.06770, 2023.
- Chunqiu Steven Xia, et al. Agentless: Demystifying LLM-based software engineering agents. https://arxiv.org/abs/2407.01489, 2024.
10. End-to-End Software Development
- Zhang et al. Empowering Agile-Based Generative Software Development through Human-AI Teamwork. https://arxiv.org/abs/2407.15568, 2024.
- Liu et al. Large Language Model-Based Agents for Software Engineering: A Survey. https://arxiv.org/abs/2409.02977, 2024.