cs.LG

Latest papers in this category

cs.CL 2026-06-17
Freeing the Law with LOCUS: A Local Ordinance Corpus for the United States

Progress in legal AI increasingly depends on access to authoritative legal text at scale. Yet one of the most consequential layers of American law remains largely absent from existing machine-readable...

Denis Peskoff, Joe Barrow, Christopher Vu et al.
astro-ph.IM 2026-06-17
The Chandra-Gaia Catalog of Counterparts: Resolving ambiguous Gaia matches to X-ray sources in the Chandra Source Catalo...

We present a framework to cross-match sources from the Chandra Source Catalog (CSC v2.1) with optical sources from Gaia Data Release 3. Unlike purely spatial approaches, we use source properties such ...

V. Samuel Pérez-Díaz, Vinay L. Kashyap, Joshua D. Ingram et al.
cs.LG 2026-06-17
UBP2: Uncertainty-Balanced Preference Planning for Efficient Preference-based Reinforcement Learning

Preference-based RL provides an approach to learning reward models from pairwise comparisons of behaviors, bypassing the need for explicit reward design. However, existing methods typically rely on pa...

Mohamed Nabail, Leo Cheng, Jingmin Wang et al.
cs.LG 2026-06-17
Explaining Attention with Program Synthesis

A longstanding goal of research on interpretable deep learning is to replace opaque neural computations with human-meaningful symbolic descriptions. In this paper, we propose an approach for approxima...

Amiri Hayes, Belinda Li, Jacob Andreas
cs.LG 2026-06-17
Diffusion-Proof: Recipe for Formal Theorem Proving Beyond Auto-Regressive Generation

Enhancing the formal math reasoning capabilities of Large Language Models (LLMs) has become a key focus in both mathematical and computer science communities in recent years. While significant progres...

Ruida Wang, Rui Pan, Pengcheng Wang et al.
cs.LG 2026-06-17
P-K-GCN: Physics-augmented Koopman-enhanced Graph Convolutional Network for Deep Spatiotemporal Super-resolution

High-fidelity simulation of spatiotemporal dynamics is computationally prohibitive, necessitating efficient super-resolution techniques to reconstruct high-resolution data from coarse-grained inputs. ...

Xizhuo, Zhang, Zekai Wang et al.
physics.ao-ph 2026-06-17
Optimal scenario design for climate emulation

As deep learning for physical systems continues to grow in popularity, efforts to improve generalizability have primarily focused on designing architectures that embed physical constraints. However, f...

Christopher B. Womack, Shahine Bouabid, Andrei Sokolov et al.
cs.CV 2026-06-17
Confidence is Not Reliability: Rethinking MC Dropout in Brain Tumour Segmentation

Glioma segmentation in multiparametric MRI is a critical component of treatment planning. A segmentation model that fails silently on treatment-critical sub-regions represents a patient safety risk th...

Xin Ci Wong, Duygu Sarikaya, Kieran Zucker et al.
cs.LG 2026-06-17
Does VLA Even Know the Basics? Measuring Commonsense and World Knowledge Retention in Vision-Language-Action Models

Embodied Vision-Language-Action (VLA) models are typically obtained by fine-tuning powerful pretrained VLMs on robotics data, yet it is unclear how much commonsense and factual knowledge they retain a...

Nikita Kachaev, Andrey Moskalenko, Matvey Skripkin et al.
cs.LG 2026-06-17
Risk Stratification for ICU Delirium using Pervasive Ambient Sensing Information

Delirium is a common and serious complication in the Intensive Care Unit (ICU), associated with increased morbidity, prolonged hospital stays, and higher healthcare costs. Despite its prevalence, earl...

Jiaqing Zhang, Sabyasachi Bandyopadhyay, Miguel Contreras et al.
cs.AI 2026-06-17
NeSyCat Torch: A Differentiable Tensor Implementation of Categorical Semantics for Neurosymbolic Learning

Neurosymbolic semantics is fragmented: classical, fuzzy, probabilistic and neural systems each define truth by their own inductive rules. NeSyCat, extending ULLER, subsumes them under a single inducti...

Daniel Romero Schellhorn, Till Mossakowski, Björn Gehrke
eess.IV 2026-06-17
Beyond Algorithms: Conceptual Innovation in Medical Imaging AI

Artificial intelligence has driven rapid progress in medical imaging research, producing increasingly sophisticated algorithms and steady improvements on benchmark tasks. However, this algorithm-centr...

Mark A. Anastasio
cs.LG 2026-06-17
Structured Inference with Large Language Gibbs

The knowledge encoded in large language models (LLMs) can serve as a substrate for structured reasoning over variables describing a complex world, but accessing this knowledge in a probabilistically c...

Sanghyeok Choi, Henry Gouk, Esmeralda S. Whitammer
cs.LG 2026-06-17
Detecting Hidden ML Training With Zero-Overhead Telemetry

Hardware-enabled monitoring of GPU workloads underpins many proposals for AI compute governance, but if developers can defeat monitoring mechanisms, such schemes are unworkable. We evaluate the advers...

Robi Rahman, Sabiha Tajdari
cs.LG 2026-06-17
SCAN: Enhance Time Series Anomaly Detection via Multi-Scale Neighborhood-Centered Clustering

Time series anomaly detection plays a crucial role in a wide range of real-world applications. Reconstruction-based methods have become the mainstream paradigm, but they suffer from over-generalizatio...

Xingze Zheng, Hanyin Cheng, Siyuan Wang et al.
cs.CV 2026-06-17
OneCanvas: 3D Scene Understanding via Panoramic Reprojection

Existing approaches to 3D scene understanding in Vision-Language Models (VLMs) either rely on complex, model-specific geometry encoders or large training budgets in pursuit of spatial reasoning. Inste...

Bartłomiej Baranowski, Dave Zhenyu Chen, Matthias Nießner
cs.AI 2026-06-17
TxBench-PP: Analyzing AI Agent Performance on Small-Molecule Preclinical Pharmacology

Artificial intelligence (AI) agents promise to accelerate drug discovery by compressing interpretation and decision-making loops, but practical deployment requires trusted evaluation on realistic prog...

Hannah Le, Ramesh Ramasamy, Alex Urrutia et al.
cs.LG 2026-06-17
STARE: Surprisal-Guided Token-Level Advantage Reweighting for Policy Entropy Stability

Reinforcement Learning with Verifiable Rewards algorithms like GRPO have emerged as the dominant post-training paradigm for complex reasoning in LLMs, yet commonly suffer from policy entropy collapse ...

Haipeng Luo, Qingfeng Sun, Songli Wu et al.
cs.LG 2026-06-17
A Human-in-the-Loop Bayesian Optimization Framework for Constraint-Aware Bioprocess Development

This work presents an extension to Pareto Front Guided Sampling (PFGS), a Human-in-the-Loop (HitL) Bayesian Optimization (BO) framework in which Gaussian process (GP) surrogate-derived quantities are ...

Samuel Stricker, Claus Wirnsperger, Alessandro Butté et al.
cs.LG 2026-06-17
Mechanism-Guided Selective Unlearning for RLVR-Induced Reasoning

We propose MAST (Mechanism-Aligned Selective Targeting), a mechanism-guided method for unlearning RLVR-induced reasoning with substantially lower collateral damage than standard full-parameter updates...

Chenyu Zhou, Qiliang Jiang, Shuning Wu et al.
cs.LG 2026-06-17
Machine Unlearning for the XGBoost Model with Network Intrusion Datasets

Machine Unlearning (MU) has emerged as an important technique for removing specific data points from trained models without requiring full retraining. However, most existing MU research focuses on dee...

Diana Magalhães, Eva Maia, João Vitorino et al.
stat.ML 2026-06-17
Generalised Eigenvalue Geometry of Semantic Adversarial Attacks

Recent empirical work shows that semantically equivalent paraphrases can fool financial sentiment classifiers: although a paraphrase remains close to the original under a strong reference embedding, i...

Martin Anthony, Kaveh Salehzadeh Nobari
cs.LG 2026-06-17
Compute Efficiency and Serial Runtime Tradeoffs for Stochastic Momentum Methods

Stochastic momentum methods such as heavy ball (HB), Nesterov momentum, and variants of Accelerated SGD (ASGD) [Kidambi et al., 2018] are widely used in modern training, but their stochastic benefits ...

Depen Morwani, Alexandru Meterez, Pranav Nair et al.
stat.ML 2026-06-17
On Local Population-Risk Certificates

This paper develops local certificates for population-risk increments around a current model. For a local candidate set \(\mathcal D\), the certificate is a two-sided confidence band for \(P({\ell_{θ+...

Mingzhi Song
cs.LG 2026-06-17
INDEQS: Informed Neural controlled Differential EQuationS

Neural Controlled Differential Equations (NCDE) provide a powerful continuous-time framework for forecasting time series, but standard graph-based extensions typically learn spatial structure purely f...

Michael Detzel, Gabriel Nobis, Kristiyan Blagov et al.
stat.ME 2026-06-17
Wasserstein Policy Learning for Distributional Outcomes

Offline policy learning has received growing attention in causal inference. The primary objective is to learn a policy (individualized treatment rule) as a mapping from covariates to treatment that ma...

Yiyan Huang, Cheuk Hang Leung, Qi Wu et al.
cs.LG 2026-06-17
Smoothness-Based Derandomization of PAC-Bayes Bounds

We study PAC-Bayes derandomization for smooth loss functions. Our goal is to obtain generalization bounds that hold with high probability for deterministic predictors by exploiting smoothness properti...

Alexandre Lemire Paquin, Brahim Chaib-Draa, Philippe Giguère
stat.ML 2026-06-17
Quantifying and Auditing LLM Evaluation via Positive-Unlabeled Learning

Large Language Models (LLMs) are increasingly used as judges for scalable evaluation, yet such LLM-as-a-Judge systems exhibit systematic biases that are decoupled from semantic quality, most notabl...

Zilong Zhang, Yi-Ting Hung, Lei Ding et al.
stat.ML 2026-06-17
Sequential Kernel-based Conditional Independence Testing via Adaptive Betting

Testing conditional independence is fundamental yet intrinsically difficult: without additional assumptions, Type I error control is impossible in general. The "Model-X" paradigm addresses this diffi...

Zheng He, Danica J. Sutherland
stat.ML 2026-06-17
FOSC-X: An Extended Framework for Optimal Local Cuts and Non-Horizontal Cluster Selection from Clustering Hierarchies

Extracting a flat clustering solution from a hierarchy is a common task in practical cluster analysis and can be formulated as an optimisation problem. Existing approaches focus on finding a single op...

Connor Simpson, Ricardo J. G. B. Campello
cs.LG 2026-06-17
Strategic Feature Selection

When algorithmic predictors inform resource allocation in high-stakes domains such as healthcare, these predictors must account for strategic manipulation of input features. The typical solution is to...

Jivat Neet Kaur, Pratik Patil, Divya Shanmugam et al.
stat.ML 2026-06-17
Kernel of Partition Paths: A Unified Representation for Tree Ensembles

A recent line of work has reframed individual decision trees as linear models on engineered features associated with their splits, opening routes for oracle inequalities and feature-importance reinter...

Nicolas Mahler
cs.LG 2026-06-17
Online Distributional Prediction via Latent Cluster Geometry Under Drift and Corruption

Online learning in non-stationary streams is often formulated as tracking a point estimate, but many applications require predicting the full data-generating distribution. We study online distribution...

Navyansh Mahla, Prateek Chanda, Ganesh Ramakrishnan
stat.ML 2026-06-17
TimeLAVA: Learning-Agnostic Data Valuation for Time Series

Data valuation quantifies the intrinsic quality of individual samples to enable principled data curation, quality control, and robust learning. For time series in critical domains such as healthcare, ...

Wenqin Liu, Weizhi Quan, Aoqi Zuo et al.