Entropy-KL Divergence-based Token Masking: A Novel Approach for Selective Fine-tuning of Large Language Models
Topic · Other
Diagnosing Harmful Continuation in Answer-Correct Long-CoT Training Traces
Topic · Other
CoHyDE: Iterative Co-Training of LLM Rewriter & Dense Encoder for Tool Retrieval
Topic · Other
Indexing the Unreadable: LLM-Native Recursive Construction and Search of Service Taxonomies
Topic · Other
When and How Human Curation Backfires: Preference Alignment under Multi-Model Self-Consuming Loop
Topic · LLM Post-training
Harmonizing Real-Time Constraints and Long-Horizon Reasoning: An Asynchronous Agentic Framework for Dynamic Scheduling
Topic · Agent
OpenClawBench: Benchmarking Process-side Anomalies in Real-world Agent Execution Trajectories
Topic · Agent
Provably Secure Agent Guardrail
Topic · Agent
DenseSteer: Steering Small Language Models towards Dense Math Reasoning
Topic · Other
Surfacing Isolated Learners with Outcome-Independent Mediation of Feedback between Teachers and Students Using AI
Topic · Other
Rethinking Literature Search Evaluation: Deep Research Helps, and Human Citation Lists Are Not a Ground Truth
Topic · Other
Tailoring the Curriculum: Student-Centered Reasoning Distillation via Dynamic Data-Model Compatibility
Topic · Other
BenchTrace: A Benchmark for Testing Reflection Ability and Controlled Evolution in LLM Agents
Topic · Agent
GTA: Generating Long-Horizon Tasks for Web Agents at Scale
Topic · Agent
ReasonOps: Operator Segmentation for LLM Reasoning Traces
Topic · Other
Paper Agents, Paper Gains: An Empirical Analysis of DeFi Investment Agents
Topic · Agent
Better Later Than Sooner: Neuro-Symbolic Knowledge Graph Construction via Ontology-grounded Post-extraction Correction
Topic · Other
Governing Technical Debt in Agentic AI Systems
Topic · Agent
The Confidence Shortcut: A Reasoning Failure Mode of Masked Diffusion Models
Topic · Other
PRO-CUA: Process-Reward Optimization for Computer Use Agents
Topic · Agent