When AI Says It Feels
Topic · Other
DiG-Plan: Mitigating Early Commitment for Tool-Graph Planning via Diffusion Guidance
Topic · Agent
Critic-Guided Heterogeneous Multi-Agent Reasoning for Reliable Mathematical Problem Solving
Topic · Agent
Seeing Time: Benchmarking Chronological Reasoning and Shortcut Biases in Vision-Language Models
Topic · Other
PerceptUI: LLM Agents as Human-Aligned Synthetic Users for UI/UX Evaluation
Topic · Agent
AdaMEM: Test-Time Adaptive Memory for Language Agents
Topic · Agent
Beyond Output Matching: Preserving Internal Geometry in NVFP4 LLM Distillation
Topic · Other
Do More Agents Help? Controlled and Protocol-Aligned Evaluation of LLM Agent Workflows
Topic · Agent
Continual Learning Bench: Evaluating Frontier AI Systems in Real-World Stateful Environments
Topic · Reinforcement Learning
Coding with "Enemy": Can Human Developers Detect AI Agent Sabotage?
Topic · Agent
FIDES: Faithful Inference via Deep Evidence Signals for Retrieval-Memory Conflict in RAG
Topic · Memory
Answer Presence Drives RAG Rewriting Gains
Topic · Other
Evaluation of LLMs for Mathematical Formalization in Lean
Topic · Other
Self-Commitment Latency: A Reward-Free Probe for Prompted Implicit Hacking
Topic · Other
Safety Paradox: How Enhanced Safety Awareness Leaves LLMs Vulnerable to Posterior Attack
Topic · Other
Multilingual Fine-Tuning via Localized Gradient Conflict Resolution
Topic · Other
Fix the Mind, Not the Move: Interpretable AI Assistance via Knowledge-Gap Localization
Topic · Other
GuardNet: Ensemble Strategies of Shallow Neural Networks for Robust Prompt Injection and Jailbreak Detection
Topic · Other
SoCRATES: Towards Reliable Automated Evaluation of Proactive LLM Mediation across Domains and Socio-cognitive Variations
Topic · Other
Individual Gain, Collective Loss: Metacognitive Adaptation in AI-Assisted Creativity
Topic · Other