Domain-Conditioned Safety in Frontier Computer-Using Agents: A 793-Episode Browser Benchmark, a Coding-Domain Cross-Reference, and a Reproducibility Audit of Recent Red-Teaming
Topic · Agent
Differentiable Efficient Operator Search
Topic · Other
Where's the Structure? A Systematic Literature Review of Empirical Research on Human-AI Collaboration and Hybrid Intelligence for Learning
Topic · Other
Gradient Descent with Large Step Size Restores Symmetry in Deep Linear Networks with Multi-Pathway
Topic · Other
The Score Hamiltonian: Mapping Diffusion Models to Adiabatic Transport
Topic · Other
Ontology-constrained multi-LLM scoring of hypothesis support in the predictive processing literature
Topic · Other
Finite Element-Based Material Learning via Automatic Differentiation: Learning constitutive neural network models from full-field deformation data
Topic · Other
Temporal Preference Concepts and their Functions in a Large Language Model
Topic · Other
Assessing the Geographic Diversity of AI's Platial Representations in Image Generation
Topic · Other
Geographic Bias and Diversity in AI Evaluation
Topic · Other
The Granularity Gap: A Multi-Dimensional Longitudinal Audit of Sycophancy in Gemini Models
Topic · Other
Multi-Granularity Reasoning for Natural Language Inference
Topic · Other
From Scoring to Explanations: Evaluating SHAP and LLM Rationales for Rubric-based Teaching Quality Assessment
Topic · Other
The Virtual Roundtable: Multi-Agent Personas Simulating the Dynamics of Human Brainstorming
Topic · Agent
MCBench: A Multicontext Safety Assessment Benchmark for Omni Large Language Models
Topic · Other
PEFT of SLM for Telecommunications Customer Support: A Comparative Study of LoRA Configurations with Energy Consumption Analysis
Topic · Other
Improving Heart-Focused Medical Question Answering in LLMs via Variance-Aware Rubric Rewards with GRPO
Topic · Other
Predict and Reconstruct: Joint Objectives for Self-Supervised Language Representation Learning
Topic · Other
Epidemiology of Model Collapse: Modeling Synthetic Data Contamination via Bilayer SIR Dynamics
Topic · Other
RAINO: Anchoring Agents in Reality, A Systematic Review and Conceptual Framework for Realism in Agent-Based Modelling
Topic · Agent