Lightning Talk
Featured Lightning Talk
- SimpleTES: Evaluation-driven Scaling for Scientific Discovery
All Accepted Papers
- Agentic Architect: An Agentic AI Framework for Architecture Design Exploration and Optimization
- AgentPulse: A Continuous Multi-Signal Framework for Evaluating AI Agents in Deployment
- AI-PROPELLER: Warehouse-Scale Interprocedural Code Layout Optimization with AlphaEvolve (Oral)
- AttackEvolve: Using In-Context Learning Enhanced Searches to Improve the Search Efficiency of LM-Based Search Algorithms
- Autonomous Agent Learning in Production
- Beyond Fault Injection: Leveraging LLMs for Autonomous Chaos Engineering
- BIORESEARCHER: Scenario-Guided Multi-Agent for Translational Medicine
- CadAgent: A Multi-Agent System for Manufacturing Process Classification from 2D Engineering Drawings
- Chasing the Public Score: User Pressure and Evaluation Exploitation in Coding Agent Workflows
- ClinSeekAgent: Automating Multi-modal Evidence Seeking for Agentic Clinical Reasoning
- Context or Capability? Debugging Agentic Workflows
- Declarative Data Services: Structured Agentic Discovery for Composing Data Systems
- DeepRoot: A KG-Coordinated Multi-Agent System for Therapeutic Reasoning over Historical Medical Texts
- Deploying Agents in the Wild: Failure Modes from Healthcare Access Optimization
- Discovering Cooperative Pipelines: Autoresearch for Sequential Social Dilemmas
- Do Enterprise Systems Need Learned World Models? The Importance of Context to Infer Dynamics
- Evolution Fine-Tuning: Learning to Discover Across 371 Optimization Tasks (Oral)
- Exploring Structures in Physics Problems: Can AI Agents Discover Statistical Mechanical Mappings?
- Foundry: Host-Owned Trust and Memory for Long-Horizon Agent Swarms
- How Do Tool-Augmented LLM Agents Perform on Real-World Energy Analytics Tasks?
- Interpretable Early Termination of Web Navigation Agents via Closed Sequential Pattern Mining
- Knowing When to Ask: Self-Gated Clarification for Hierarchical Language Agents
- LEVI: Stronger Search Architectures Can Substitute for Larger LLMs in Evolutionary Search
- LiteSR: Literature-Guided Agentic Retrieval for Symbolic Regression
- MatPref: Training the Reasoning Backbone of Materials Discovery Agents with Verifiable Rewards
- Meta-Harness: Harness Search for Agents Under Expensive Evaluation (Oral)
- PACEvolve++: Improving Test-time Learning for Evolutionary Search Agents
- PaperDoctor: Evidence-Grounded and Actionable Feedback for Scientific Papers in Progress
- PromptKV: A Workflow for Building AI-Driven Distributed KV Stores
- RankEvolve: Automating the Discovery of Retrieval Algorithms via LLM-Driven Evolution
- Red-Teaming Claude and ChatGPT-based Security Advisors for Trusted Execution Environments
- ScientistOne: Verifiable Autonomous Research via Chain-of-Evidence
- Shepherd: A Runtime Substrate Empowering Meta-Agents with a Formalized Execution Trace
- Side Effects Are the Output: Evaluating AI Agents That Act on Live Systems
- Spilling the TE: Lessons from AI-driven evolution of Traffic Engineering
- Squeeze Evolve: Unified Multi-Model Orchestration for Verifier-Free Evolution
- Stage–Audit: Auditable Source-Frontier Discovery for Cross-Wiki Tables
- Stochastic Agent Descent: Adaptive Agents for the Future of Non-Convex Optimization
- The Partial Testimony of Logs: Evaluation of Language Model Generation under Confounded Model Choice
- Your Agent, Their Asset: A Real-World Safety Analysis of OpenClaw