| SAM: State-Adaptive Memory for Long-Horizon Reasoning Agent | Agents | Yuyang Hu +7 | May 23, 2026 | 9 |
| Economy of Minds: Emerging Multi-Agent Intelligence with Economic Interactions | Agents | Zhenting Qi +15 | Jun 1, 2026 | 8 |
| FineVerify: Scaling Test-Time Compute with Fine-Grained Self-Verification for Agentic Search | Agents | James Xu Zhao +3 | May 30, 2026 | 8 |
| AgensFlow: A Coordination-Policy Substrate for Multi-Agent Systems | Agents | Nicole Koenigstein | May 26, 2026 | 8 |
| FastKernels: Benchmarking GPU Kernel Generation in Production | Agents | Gabriele Oliaro +7 | May 22, 2026 | 8 |
| LACUNA: Safe Agents as Recursive Program Holes | Agents | Yaoyu Zhao +5 | May 27, 2026 | 7 |
| Agentic CLEAR: Automating Multi-Level Evaluation of LLM Agents | Agents | Asaf Yehudai +2 | May 21, 2026 | 7 |
| DuMate-DeepResearch: An Auditable Multi-Agent System with Recursive Search and Rubric-Grounded Reasoning | Agents | Lingyong Yan +15 | Jun 5, 2026 | 6 |
| Lean4Agent: Formal Modeling and Verification for Agent Workflow and Trajectory | Agents | Ruida Wang +5 | Jun 2, 2026 | 6 |
| HarnessForge: Joint Harness and Policy Evolution for Adaptive Agent Systems | Agents | Mingju Chen +4 | Jun 1, 2026 | 6 |
| Meta-Cognitive Memory Policy Optimization for Long-Horizon LLM Agents | Agents | Ziyan Liu +9 | May 28, 2026 | 6 |
| AUDITFLOW: Executable Symbolic Environments for Structured Financial Reporting Verification | Agents | Yan Wang +9 | Jun 2, 2026 | 6 |
| OR-Space: A Full-Lifecycle Workspace Benchmark for Industrial Optimization Agents | Agents | Chenyu Zhou +5 | May 27, 2026 | 6 |
| ESC-Skills: Discovering and Self-Evolving Skills for Emotional Support Conversations | Agents | Jie Zhu +7 | May 27, 2026 | 6 |
| Verus-SpecGym: An Agentic Environment for Evaluating Specification Autoformalization | Agents | Anmol Agarwal +8 | May 26, 2026 | 6 |
| AgentFugue: Agent Scaling for Long-Horizon Tasks through Collective Reasoning | Agents | Yuyang Hu +7 | May 23, 2026 | 6 |
| How Far Will They Go? Red-Teaming Online Influence with Large Language Models | Agents | Daniel C. Ruiz +4 | May 20, 2026 | 6 |
| Do Coding Agents Deceive Us? Detecting and Preventing Cheating via Capped Evaluation with Randomized Tests | Agents | Thanawat Lodkaew +5 | Jun 5, 2026 | 5 |
| What Should Agents Say? Action-state Communication for Efficient Multi-Agent Systems | Agents | Chen Huang +2 | Jun 3, 2026 | 5 |
| SePO: Self-Evolving Prompt Agent for System Prompt Optimization | Agents | Wangcheng Tao +2 | Jun 3, 2026 | 5 |
| Absorbing Complexity: An Interaction-Native Knowledge Harness for Financial LLM Agents | Agents | Ailiya Borjigin +6 | Jun 1, 2026 | 5 |
| Decoupling Communication from Policy: Robust MARL under Bandwidth Constraints | Agents | Alexi Canesse +3 | May 20, 2026 | 5 |
| Measuring Epistemic Resilience of LLMs Under Misleading Medical Context | Agents | Hongjian Zhou +21 | Jun 10, 2026 | 4 |
| EvoBrowseComp: Benchmarking Search Agents on Evolving Knowledge | Agents | Yunhan Wang +4 | Jun 11, 2026 | 4 |
| Getting Better at Working With You: Compiling User Corrections into Runtime Enforcement for Coding Agents | Agents | Yujun Zhou +10 | Jun 11, 2026 | 4 |
| Towards Retrieving Interaction Spaces for Agentic Search | Agents | Shengyao Zhuang +4 | Jun 5, 2026 | 4 |
| DAR: Deontic Reasoning with Agentic Harnesses | Agents | Guangyao Dou +3 | Jun 3, 2026 | 4 |
| STREAM: A Data-Centric Framework for Mining High-Value Task-Oriented Dialogues from Streaming Media | Agents | Liang Xue +5 | May 24, 2026 | 4 |
| See What I See, Know What I Think: Dense Latent Communication Across Heterogeneous Agents | Agents | Siyi Chen +9 | Jun 11, 2026 | 3 |
| Evoflux: Inference-Time Evolution of Executable Tool Workflows for Compact Agents | Agents | Kushal Raj Bhandari +6 | Jun 10, 2026 | 3 |
| Which Models Are Our Models Built On? Auditing Invisible Dependencies in Modern LLMs | Agents | Sanjay Adhikesaven +2 | Jun 10, 2026 | 3 |
| Decentralized Multi-Agent Systems with Shared Context | Agents | Yuzhen Mao +1 | Jun 9, 2026 | 3 |
| Experience Makes Skillful: Enabling Generalizable Medical Agent Reasoning via Self-Evolving Skill Memory | Agents | Haoran Sun +10 | Jun 8, 2026 | 3 |
| EvoDS: Self-Evolving Autonomous Data Science Agent with Skill Learning and Context Management | Agents | Zherui Yang +3 | Jun 2, 2026 | 3 |
| Evaluating Large Language Models in Dynamic Clinical Decision-Making with Standardized Patient Cases | Agents | Cheng Liang +5 | Jun 3, 2026 | 3 |
| Agent libOS: A Library-OS-Inspired Runtime for Long-Running, Capability-Controlled LLM Agents | Agents | Yingqi Zhang | Jun 2, 2026 | 3 |
| The Meta-Agent Challenge: Are Current Agents Capable of Autonomous Agent Development? | Agents | Xinyu Lu +10 | Jun 3, 2026 | 3 |
| The Cold-Start Safety Gap in LLM Agents | Agents | Chung-En Sun +2 | Jun 5, 2026 | 2 |
| ToolSense: A Diagnostic Framework for Auditing Parametric Tool Knowledge in LLMs | Agents | Ashutosh Hathidara +3 | Jun 4, 2026 | 2 |
| PaperMentor: A Human-Centered Multi-Agent Writing Tutor for AI Research Papers on Overleaf | Agents | Jiarui Liu +19 | Jun 7, 2026 | 2 |
| LayerRoute: Input-Conditioned Adaptive Layer Skipping via LoRA Fine-Tuning for Agentic Language Models | Agents | Prateek Kumar Sikdar | Jun 1, 2026 | 2 |
| AgentCL: Toward Rigorous Evaluation of Continual Learning in Language Agents | Agents | Yiheng Shu +5 | Jun 2, 2026 | 2 |
| Discovering Cooperative Pipelines: Autoresearch for Sequential Social Dilemmas | Agents | Víctor Gallego | May 28, 2026 | 2 |
| τ-Rec: A Verifiable Benchmark for Agentic Recommender Systems | Agents | Bharath Sivaram Narasimhan +1 | Jun 8, 2026 | 1 |
| Hardening Agent Benchmarks with Adversarial Hacker-Fixer Loops | Agents | Ziqian Zhong +5 | Jun 8, 2026 | 1 |
| PBSD: Privileged Bayesian Self-Distillation for Long-Horizon Credit Assignment | Agents | Yang Tian +7 | Jun 8, 2026 | 1 |
| Honest Lying: Understanding Memory Confabulation in Reflexive Agents | Agents | Prakhar Dixit +2 | May 28, 2026 | 1 |
| Parametric Social Identity Injection and Diversification in Public Opinion Simulation | Agents | Hexi Wang +4 | Jun 1, 2026 | 1 |
| AURA: Intent-Directed Probing for Implicit-Need Surfacing in Situated LLM Agents | Agents | Yang Li +3 | Jun 4, 2026 | 1 |
| LLM Anonymization Against Agentic Re-Identification | Agents | Ziwen Li +2 | Jun 1, 2026 | 1 |
| Agentic Chain-of-Thought Steering for Efficient and Controllable LLM Reasoning | Agents | Yu Xia +6 | Jun 2, 2026 | 1 |
| AI, Take the Wheel: What Drives Delegation and Trust in Human-Computer Cooperative Question Answering? | Agents | Maharshi Gor +6 | May 27, 2026 | 1 |
| Beyond Recall: Behavioral Specification as an Interpretive Layer for AI Personalization | Agents | Aarik Gulaya | May 27, 2026 | 1 |
| ORACLE: Anticipating Scams from Partial Trajectories in Streaming App Usage | Agents | Wenbo Gao +8 | May 9, 2026 | 1 |
| Got a Secret? LLM Agents Can't Keep It: Evaluating Privacy in Multi-Agent Systems | Agents | Aman Priyanshu +2 | May 26, 2026 | 1 |