Literature

Papers

Notable = Hugging Face daily papers (community-upvoted) · every paper links to arXiv · citations from OpenAlex

Showing 101–155 of 155 notable papers

Paper	Topic	Authors	Published	HF ▲
SAM: State-Adaptive Memory for Long-Horizon Reasoning Agent	Agents	Yuyang Hu +7	May 23, 2026	9
Economy of Minds: Emerging Multi-Agent Intelligence with Economic Interactions	Agents	Zhenting Qi +15	Jun 1, 2026	8
FineVerify: Scaling Test-Time Compute with Fine-Grained Self-Verification for Agentic Search	Agents	James Xu Zhao +3	May 30, 2026	8
AgensFlow: A Coordination-Policy Substrate for Multi-Agent Systems	Agents	Nicole Koenigstein	May 26, 2026	8
FastKernels: Benchmarking GPU Kernel Generation in Production	Agents	Gabriele Oliaro +7	May 22, 2026	8
LACUNA: Safe Agents as Recursive Program Holes	Agents	Yaoyu Zhao +5	May 27, 2026	7
Agentic CLEAR: Automating Multi-Level Evaluation of LLM Agents	Agents	Asaf Yehudai +2	May 21, 2026	7
DuMate-DeepResearch: An Auditable Multi-Agent System with Recursive Search and Rubric-Grounded Reasoning	Agents	Lingyong Yan +15	Jun 5, 2026	6
Lean4Agent: Formal Modeling and Verification for Agent Workflow and Trajectory	Agents	Ruida Wang +5	Jun 2, 2026	6
HarnessForge: Joint Harness and Policy Evolution for Adaptive Agent Systems	Agents	Mingju Chen +4	Jun 1, 2026	6
Meta-Cognitive Memory Policy Optimization for Long-Horizon LLM Agents	Agents	Ziyan Liu +9	May 28, 2026	6
AUDITFLOW: Executable Symbolic Environments for Structured Financial Reporting Verification	Agents	Yan Wang +9	Jun 2, 2026	6
OR-Space: A Full-Lifecycle Workspace Benchmark for Industrial Optimization Agents	Agents	Chenyu Zhou +5	May 27, 2026	6
ESC-Skills: Discovering and Self-Evolving Skills for Emotional Support Conversations	Agents	Jie Zhu +7	May 27, 2026	6
Verus-SpecGym: An Agentic Environment for Evaluating Specification Autoformalization	Agents	Anmol Agarwal +8	May 26, 2026	6
AgentFugue: Agent Scaling for Long-Horizon Tasks through Collective Reasoning	Agents	Yuyang Hu +7	May 23, 2026	6
How Far Will They Go? Red-Teaming Online Influence with Large Language Models	Agents	Daniel C. Ruiz +4	May 20, 2026	6
Do Coding Agents Deceive Us? Detecting and Preventing Cheating via Capped Evaluation with Randomized Tests	Agents	Thanawat Lodkaew +5	Jun 5, 2026	5
What Should Agents Say? Action-state Communication for Efficient Multi-Agent Systems	Agents	Chen Huang +2	Jun 3, 2026	5
SePO: Self-Evolving Prompt Agent for System Prompt Optimization	Agents	Wangcheng Tao +2	Jun 3, 2026	5
Absorbing Complexity: An Interaction-Native Knowledge Harness for Financial LLM Agents	Agents	Ailiya Borjigin +6	Jun 1, 2026	5
Decoupling Communication from Policy: Robust MARL under Bandwidth Constraints	Agents	Alexi Canesse +3	May 20, 2026	5
Measuring Epistemic Resilience of LLMs Under Misleading Medical Context	Agents	Hongjian Zhou +21	Jun 10, 2026	4
EvoBrowseComp: Benchmarking Search Agents on Evolving Knowledge	Agents	Yunhan Wang +4	Jun 11, 2026	4
Getting Better at Working With You: Compiling User Corrections into Runtime Enforcement for Coding Agents	Agents	Yujun Zhou +10	Jun 11, 2026	4
Towards Retrieving Interaction Spaces for Agentic Search	Agents	Shengyao Zhuang +4	Jun 5, 2026	4
DAR: Deontic Reasoning with Agentic Harnesses	Agents	Guangyao Dou +3	Jun 3, 2026	4
STREAM: A Data-Centric Framework for Mining High-Value Task-Oriented Dialogues from Streaming Media	Agents	Liang Xue +5	May 24, 2026	4
See What I See, Know What I Think: Dense Latent Communication Across Heterogeneous Agents	Agents	Siyi Chen +9	Jun 11, 2026	3
Evoflux: Inference-Time Evolution of Executable Tool Workflows for Compact Agents	Agents	Kushal Raj Bhandari +6	Jun 10, 2026	3
Which Models Are Our Models Built On? Auditing Invisible Dependencies in Modern LLMs	Agents	Sanjay Adhikesaven +2	Jun 10, 2026	3
Decentralized Multi-Agent Systems with Shared Context	Agents	Yuzhen Mao +1	Jun 9, 2026	3
Experience Makes Skillful: Enabling Generalizable Medical Agent Reasoning via Self-Evolving Skill Memory	Agents	Haoran Sun +10	Jun 8, 2026	3
EvoDS: Self-Evolving Autonomous Data Science Agent with Skill Learning and Context Management	Agents	Zherui Yang +3	Jun 2, 2026	3
Evaluating Large Language Models in Dynamic Clinical Decision-Making with Standardized Patient Cases	Agents	Cheng Liang +5	Jun 3, 2026	3
Agent libOS: A Library-OS-Inspired Runtime for Long-Running, Capability-Controlled LLM Agents	Agents	Yingqi Zhang	Jun 2, 2026	3
The Meta-Agent Challenge: Are Current Agents Capable of Autonomous Agent Development?	Agents	Xinyu Lu +10	Jun 3, 2026	3
The Cold-Start Safety Gap in LLM Agents	Agents	Chung-En Sun +2	Jun 5, 2026	2
ToolSense: A Diagnostic Framework for Auditing Parametric Tool Knowledge in LLMs	Agents	Ashutosh Hathidara +3	Jun 4, 2026	2
PaperMentor: A Human-Centered Multi-Agent Writing Tutor for AI Research Papers on Overleaf	Agents	Jiarui Liu +19	Jun 7, 2026	2
LayerRoute: Input-Conditioned Adaptive Layer Skipping via LoRA Fine-Tuning for Agentic Language Models	Agents	Prateek Kumar Sikdar	Jun 1, 2026	2
AgentCL: Toward Rigorous Evaluation of Continual Learning in Language Agents	Agents	Yiheng Shu +5	Jun 2, 2026	2
Discovering Cooperative Pipelines: Autoresearch for Sequential Social Dilemmas	Agents	Víctor Gallego	May 28, 2026	2
τ-Rec: A Verifiable Benchmark for Agentic Recommender Systems	Agents	Bharath Sivaram Narasimhan +1	Jun 8, 2026	1
Hardening Agent Benchmarks with Adversarial Hacker-Fixer Loops	Agents	Ziqian Zhong +5	Jun 8, 2026	1
PBSD: Privileged Bayesian Self-Distillation for Long-Horizon Credit Assignment	Agents	Yang Tian +7	Jun 8, 2026	1
Honest Lying: Understanding Memory Confabulation in Reflexive Agents	Agents	Prakhar Dixit +2	May 28, 2026	1
Parametric Social Identity Injection and Diversification in Public Opinion Simulation	Agents	Hexi Wang +4	Jun 1, 2026	1
AURA: Intent-Directed Probing for Implicit-Need Surfacing in Situated LLM Agents	Agents	Yang Li +3	Jun 4, 2026	1
LLM Anonymization Against Agentic Re-Identification	Agents	Ziwen Li +2	Jun 1, 2026	1
Agentic Chain-of-Thought Steering for Efficient and Controllable LLM Reasoning	Agents	Yu Xia +6	Jun 2, 2026	1
AI, Take the Wheel: What Drives Delegation and Trust in Human-Computer Cooperative Question Answering?	Agents	Maharshi Gor +6	May 27, 2026	1
Beyond Recall: Behavioral Specification as an Interpretive Layer for AI Personalization	Agents	Aarik Gulaya	May 27, 2026	1
ORACLE: Anticipating Scams from Partial Trajectories in Streaming App Usage	Agents	Wenbo Gao +8	May 9, 2026	1
Got a Secret? LLM Agents Can't Keep It: Evaluating Privacy in Multi-Agent Systems	Agents	Aman Priyanshu +2	May 26, 2026	1
