Controllable protein design with particle-based Feynman-Kac steering
Authors: Erik Hartman, Jonas Wallin, Johan Malmström, Jimmy Olsson
First: 2025-11-12T11:23:44+00:00 · Latest: 2026-04-06T16:44:51+00:00
Comments: In version 2 we added an experiment on improving designability through steering towards lower delta G
Abstract
Proteins underpin most biological function, and the ability to design them with tailored structures and properties is central to advances in biotechnology. Diffusion-based generative models have emerged as powerful tools for protein design, but steering them toward proteins with specified properties remains challenging. The Feynman-Kac (FK) framework provides a principled way to guide diffusion models using user-defined rewards. In this paper, we enable FK-based steering of RFdiffusion through the development of guiding potentials that leverage ProteinMPNN and structural relaxation to guide the diffusion process towards desired properties. We show that steering can be used to consistently improve predicted interface energetics and increase binder designability by $89.5\%$. Together, these results establish that diffusion-based protein design can be effectively steered toward arbitrary, non-differentiable objectives, providing a model-independent framework for controllable protein generation.
Summary / 总结
Proteins underpin most biological function, and the ability to design them with tailored structures and properties is central to advances in biotechnology.
Energy Decay Network (EDeN)
Authors: Jamie Nicholas Shelley, Optishell Consultancy
First: 2021-03-10T23:17:59+00:00 · Latest: 2026-04-06T15:04:36+00:00
Comments: Added section on temporal eligility + added edits to cem processing (removed sigmod pass)
Abstract
This paper and accompanying Python and C++ Framework is the product of the authors perceived problems with narrow (Discrimination based) AI. (Artificial Intelligence) The Framework attempts to develop a genetic transfer of experience through potential structural expressions using a common regulation/exchange value (energy) to create a model whereby neural architecture and all unit processes are co-dependently developed by genetic and real time signal processing influences; successful routes are defined by stability of the spike distribution per epoch which is influenced by genetically encoded morphological development biases.These principles are aimed towards creating a diverse and robust network that is capable of adapting to general tasks by training within a simulation designed for transfer learning to other mediums at scale.
Summary / 总结
This paper and accompanying Python and C++ Framework is the product of the authors perceived problems with narrow (Discrimination based) AI.
LoFT: Parameter-Efficient Fine-Tuning for Long-tailed Semi-Supervised Learning in Open-World Scenarios
Authors: Zhiyuan Huang, Jiahao Chen, Bing Su
First: 2025-09-12T02:28:32+00:00 · Latest: 2026-04-06T11:46:17+00:00
Abstract
Long-tailed semi-supervised learning (LTSSL) presents a formidable challenge where models must overcome the scarcity of tail samples while mitigating the noise from unreliable pseudo-labels. Most prior LTSSL methods are designed to train models from scratch, which often leads to issues such as overconfidence and low-quality pseudo-labels. To address this problem, we first theoretically prove that utilizing a foundation model significantly reduces the hypothesis complexity, which tightens the generalization bound and in turn minimizes the Balanced Posterior Error (BPE). Furthermore, we demonstrate that the feature compactness of foundation models strictly compresses the acceptance region for outliers, providing a geometric guarantee for robustness. Motivated by these theoretical insights, we extend LTSSL into the foundation model fine-tuning paradigm and propose a novel framework: LoFT (Long-tailed semi-supervised learning via parameter-efficient Fine-Tuning). Furthermore, we explore a more practical setting by investigating semi-supervised learning under open-world conditions, where the unlabeled data may include out-of-distribution (OOD) samples.To handle this problem, we propose LoFT-OW (LoFT under Open-World scenarios) to improve the discriminative ability. Experimental results on multiple benchmarks demonstrate that our method achieves superior performance.
Summary / 总结
Long-tailed semi-supervised learning (LTSSL) presents a formidable challenge where models must overcome the scarcity of tail samples while mitigating the noise from unreliable pseudo-labels.
Federated Transfer Learning with Differential Privacy
Authors: Mengchu Li, Ye Tian, Yang Feng, Yi Yu
First: 2024-03-17T21:04:48+00:00 · Latest: 2026-04-06T10:44:11+00:00
Comments: 101 pages, 7 figures
Abstract
Federated learning has emerged as a powerful framework for analysing distributed data, yet two challenges remain pivotal: heterogeneity across sites and privacy of local data. In this paper, we address both challenges within a federated transfer learning framework, aiming to enhance learning on a target data set by leveraging information from multiple heterogeneous source data sets while adhering to privacy constraints. We rigorously formulate the notion of federated differential privacy, which offers privacy guarantees for each data set without assuming a trusted central server. Under this privacy model, we study four statistical problems: univariate mean estimation, low-dimensional linear regression, high-dimensional linear regression, and M-estimation. By investigating the minimax rates and quantifying the cost of privacy, we show that federated differential privacy is an intermediate privacy model between the well-established local and central models of differential privacy. Our analyses account for data heterogeneity and privacy, highlighting the fundamental costs associated with each factor and the benefits of knowledge transfer in federated learning.
Summary / 总结
Federated learning has emerged as a powerful framework for analysing distributed data, yet two challenges remain pivotal: heterogeneity across sites and privacy of local data.
DARE: Diffusion Large Language Models Alignment and Reinforcement Executor
Authors: Jingyi Yang, Yuxian Jiang, Xuhao Hu, Shuang Cheng, Biqing Qi, Jing Shao
First: 2026-04-05T18:13:19+00:00 · Latest: 2026-04-05T18:13:19+00:00
Comments: 14 pages,3 figures,5 tables
Abstract
Diffusion large language models (dLLMs) are emerging as a compelling alternative to dominant autoregressive models, replacing strictly sequential token generation with iterative denoising and parallel generation dynamics. However, their open-source ecosystem remains fragmented across model families and, in particular, across post-training pipelines, where reinforcement learning objectives, rollout implementations and evaluation scripts are often released as paper-specific codebases. This fragmentation slows research iteration, raises the engineering burden of reproduction, and makes fair comparison across algorithms difficult. We present \textbf{DARE} (\textbf{d}LLMs \textbf{A}lignment and \textbf{R}einforcement \textbf{E}xecutor), an open framework for post-training and evaluating dLLMs. Built on top of verl~\cite{sheng2024hybridflow} and OpenCompass~\cite{2023opencompass}, DARE unifies supervised fine-tuning, parameter-efficient fine-tuning, preference optimization, and dLLM-specific reinforcement learning under a shared execution stack for both masked and block diffusion language models. Across representative model families including LLaDA, Dream, SDAR, and LLaDA2.x, DARE provides broad algorithmic coverage, reproducible benchmark evaluation, and practical acceleration. Extensive empirical results position that DARE serves as a reusable research substrate for developing, comparing, and deploying post-training methods for current and emerging dLLMs.
Summary / 总结
Diffusion large language models (dLLMs) are emerging as a compelling alternative to dominant autoregressive models, replacing strictly sequential token generation with iterative denoising and parallel generation dynamics.
A Semi-Automated Annotation Workflow for Paediatric Histopathology Reports Using Small Language Models
Authors: Avish Vijayaraghavan, Jaskaran Singh Kawatra, Sebin Sabu, Jonny Sheldon, Will Poulett, Alex Eze, Daniel Key, John Booth, Shiren Patel, Jonny Pearson, Dan Schofield, Jonathan Hope, Pavithra Rajendran, Neil Sebire
First: 2026-04-05T16:15:39+00:00 · Latest: 2026-04-05T16:15:39+00:00
Comments: 36 pages, includes supplementary information
Abstract
Electronic Patient Record (EPR) systems contain valuable clinical information, but much of it is trapped in unstructured text, limiting its use for research and decision-making. Large language models can extract such information but require substantial computational resources to run locally, and sending sensitive clinical data to cloud-based services, even when deidentified, raises significant patient privacy concerns. In this study, we develop a resource-efficient semi-automated annotation workflow using small language models (SLMs) to extract structured information from unstructured EPR data, focusing on paediatric histopathology reports. As a proof-of-concept, we apply the workflow to paediatric renal biopsy reports, a domain chosen for its constrained diagnostic scope and well-defined underlying biology. We develop the workflow iteratively with clinical oversight across three meetings, manually annotating 400 reports from a dataset of 2,111 at Great Ormond Street Hospital as a gold standard, while developing an automated information extraction approach using SLMs. We frame extraction as a Question-Answering task grounded by clinician-guided entity guidelines and few-shot examples, evaluating five instruction-tuned SLMs with a disagreement modelling framework to prioritise reports for clinical review. Gemma 2 2B achieves the highest accuracy at 84.3%, outperforming off-the-shelf models including spaCy (74.3%), BioBERT-SQuAD (62.3%), RoBERTa-SQuAD (59.7%), and GLiNER (60.2%). Entity guidelines improved performance by 7-19% over the zero-shot baseline, and few-shot examples by 6-38%, though their benefits do not compound when combined. These results demonstrate that SLMs can extract structured information from specialised clinical domains on CPU-only infrastructure with minimal clinician involvement. Our code is available at https://github.com/gosh-dre/nlp_renal_biopsy.
Summary / 总结
Electronic Patient Record (EPR) systems contain valuable clinical information, but much of it is trapped in unstructured text, limiting its use for research and decision-making.
SubspaceAD: Training-Free Few-Shot Anomaly Detection via Subspace Modeling
Authors: Camile Lendering, Erkut Akdag, Egor Bondarev
Venue: CVPR 2026
First: 2026-02-26T13:52:57+00:00 · Latest: 2026-04-05T15:06:06+00:00
Comments: Accepted to CVPR 2026
Abstract
Detecting visual anomalies in industrial inspection often requires training with only a few normal images per category. Recent few-shot methods achieve strong results employing foundation-model features, but typically rely on memory banks, auxiliary datasets, or multi-modal tuning of vision-language models. We therefore question whether such complexity is necessary given the feature representations of vision foundation models. To answer this question, we introduce SubspaceAD, a training-free method, that operates in two simple stages. First, patch-level features are extracted from a small set of normal images by a frozen DINOv2 backbone. Second, a Principal Component Analysis (PCA) model is fit to these features to estimate the low-dimensional subspace of normal variations. At inference, anomalies are detected via the reconstruction residual with respect to this subspace, producing interpretable and statistically grounded anomaly scores. Despite its simplicity, SubspaceAD achieves state-of-the-art performance across one-shot and few-shot settings without training, prompt tuning, or memory banks. In the one-shot anomaly detection setting, SubspaceAD achieves image-level and pixel-level AUROC of 97.1% and 97.5% on the MVTec-AD dataset, and 93.4% and 98.2% on the VisA dataset, respectively, surpassing prior state-of-the-art results. Code and demo are available at https://github.com/CLendering/SubspaceAD.
Summary / 总结
Detecting visual anomalies in industrial inspection often requires training with only a few normal images per category.
Learning Robust Visual Features in Computed Tomography Enables Efficient Transfer Learning for Clinical Tasks
Authors: Rubén Moreno-Aguado, Alba Magallón, Victor Moreno, Yingying Fang, Guang Yang
First: 2026-04-05T14:29:35+00:00 · Latest: 2026-04-05T14:29:35+00:00
Abstract
There is substantial interest in developing artificial intelligence systems to support radiologists across tasks ranging from segmentation to report generation. Existing computed tomography (CT) foundation models have largely focused on building generalist vision-language systems capable of tasks such as question answering and report generation. However, training reliable vision-language systems requires paired image-text data at a scale that remains unavailable in CT. Moreover, adapting the underlying visual representations to downstream tasks typically requires partial or full backbone fine-tuning, a computationally demanding process inaccessible to many research groups. Instead, foundation models should prioritise learning robust visual representations that enable efficient transfer to new tasks with minimal labelled data and without backbone fine-tuning. We present VoxelFM, a 3D CT foundation model trained with self-distillation using the DINO framework, which learns semantically rich features without language supervision. We evaluated VoxelFM across seven categories of clinically relevant downstream tasks using frozen backbone representations with lightweight probes: classification, regression, survival analysis, instance retrieval, localisation, segmentation, and report generation. VoxelFM matched or outperformed four existing CT foundation models across all task categories. Despite receiving no language supervision during pre-training, VoxelFM surpassed models explicitly trained with language-alignment objectives, including on report generation. Our results indicate that current CT foundation models perform significantly better as feature extractors for lightweight probes rather than as vision encoders for vision-language models. Model weights and training code are publicly available.
Summary / 总结
There is substantial interest in developing artificial intelligence systems to support radiologists across tasks ranging from segmentation to report generation.
Compact Hypercube Embeddings for Fast Text-based Wildlife Observation Retrieval
Authors: Ilyass Moummad, Marius Miron, David Robinson, Kawtar Zaher, Hervé Goëau, Olivier Pietquin, Pierre Bonnet, Emmanuel Chemla, Matthieu Geist, Alexis Joly
First: 2026-01-30T10:05:32+00:00 · Latest: 2026-04-05T12:19:08+00:00
Abstract
Large-scale biodiversity monitoring platforms increasingly rely on multimodal wildlife observations. While recent foundation models enable rich semantic representations across vision, audio, and language, retrieving relevant observations from massive archives remains challenging due to the computational cost of high-dimensional similarity search. In this work, we introduce compact hypercube embeddings for fast text-based wildlife observation retrieval, a framework that enables efficient text-based search over large-scale wildlife image and audio databases using compact binary representations. Building on the cross-view code alignment hashing framework, we extend lightweight hashing beyond a single-modality setup to align natural language descriptions with visual or acoustic observations in a shared Hamming space. Our approach leverages pretrained wildlife foundation models, including BioCLIP and BioLingual, and adapts them efficiently for hashing using parameter-efficient fine-tuning. We evaluate our method on large-scale benchmarks, including iNaturalist2024 for text-to-image retrieval and iNatSounds2024 for text-to-audio retrieval, as well as multiple soundscape datasets to assess robustness under domain shift. Results show that retrieval using discrete hypercube embeddings achieves competitive, and in several cases superior, performance compared to continuous embeddings, while drastically reducing memory and search cost. Moreover, we observe that the hashing objective consistently improves the underlying encoder representations, leading to stronger retrieval and zero-shot generalization. These results demonstrate that binary, language-based retrieval enables scalable and efficient search over large wildlife archives for biodiversity monitoring systems.
Summary / 总结
Large-scale biodiversity monitoring platforms increasingly rely on multimodal wildlife observations.
Fine-grained Analysis of Stability and Generalization for Stochastic Bilevel Optimization
Authors: Xuelin Zhang, Hong Chen, Bin Gu, Tieliang Gong, Feng Zheng
First: 2026-04-05T12:12:58+00:00 · Latest: 2026-04-05T12:12:58+00:00
Abstract
Stochastic bilevel optimization (SBO) has been integrated into many machine learning paradigms recently, including hyperparameter optimization, meta learning, and reinforcement learning. Along with the wide range of applications, there have been numerous studies on the computational behavior of SBO. However, the generalization guarantees of SBO methods are far less understood from the lens of statistical learning theory. In this paper, we provide a systematic generalization analysis of the first-order gradient-based bilevel optimization methods. Firstly, we establish the quantitative connections between the on-average argument stability and the generalization gap of SBO methods. Then, we derive the upper bounds of on-average argument stability for single-timescale stochastic gradient descent (SGD) and two-timescale SGD, where three settings (nonconvex-nonconvex (NC-NC), convex-convex (C-C), and strongly-convex-strongly-convex (SC-SC)) are considered respectively. Experimental analysis validates our theoretical findings. Compared with the previous algorithmic stability analysis, our results do not require reinitializing the inner-level parameters at each iteration and are applicable to more general objective functions.
Summary / 总结
Stochastic bilevel optimization (SBO) has been integrated into many machine learning paradigms recently, including hyperparameter optimization, meta learning, and reinforcement learning.
Extracting and Steering Emotion Representations in Small Language Models: A Methodological Comparison
Authors: Jihoon Jeong
First: 2026-04-05T11:09:27+00:00 · Latest: 2026-04-05T11:09:27+00:00
Comments: 14 pages, 4 figures, 7 tables. Paper #6 in the Model Medicine series
Abstract
Small language models (SLMs) in the 100M-10B parameter range increasingly power production systems, yet whether they possess the internal emotion representations recently discovered in frontier models remains unknown. We present the first comparative analysis of emotion vector extraction methods for SLMs, evaluating 9 models across 5 architectural families (GPT-2, Gemma, Qwen, Llama, Mistral) using 20 emotions and two extraction methods (generation-based and comprehension-based). Generation-based extraction produces statistically superior emotion separation (Mann-Whitney p = 0.007; Cohen's d = -107.5), with the advantage modulated by instruction tuning and architecture. Emotion representations localize at middle transformer layers (~50% depth), following a U-shaped curve that is architecture-invariant from 124M to 3B parameters. We validate these findings against representational anisotropy baselines across 4 models and confirm causal behavioral effects through steering experiments, independently verified by an external emotion classifier (92% success rate, 37/40 scenarios). Steering reveals three regimes -- surgical (coherent text transformation), repetitive collapse, and explosive (text degradation) -- quantified by perplexity ratios and separated by model architecture rather than scale. We document cross-lingual emotion entanglement in Qwen, where steering activates semantically aligned Chinese tokens that RLHF does not suppress, raising safety concerns for multilingual deployment. This work provides methodological guidelines for emotion research on open-weight models and contributes to the Model Medicine series by bridging external behavioral profiling with internal representational analysis.
Summary / 总结
Small language models (SLMs) in the 100M-10B parameter range increasingly power production systems, yet whether they possess the internal emotion representations recently discovered in frontier models remains unknown.
SwiftGS: Episodic Priors for Immediate Satellite Surface Recovery
Authors: Rong Fu, Jiekai Wu, Haiyun Wei, Xiaowen Ma, Shiyin Lin, Kangan Qian, Chuang Liu, Jianyuan Ni, Simon James Fong
First: 2026-03-19T08:59:07+00:00 · Latest: 2026-04-05T06:17:36+00:00
Comments: 23 pages, 6 figures
Abstract
Rapid, large-scale 3D reconstruction from multi-date satellite imagery is vital for environmental monitoring, urban planning, and disaster response, yet remains difficult due to illumination changes, sensor heterogeneity, and the cost of per-scene optimization. We introduce SwiftGS, a meta-learned system that reconstructs 3D surfaces in a single forward pass by predicting geometry-radiation-decoupled Gaussian primitives together with a lightweight SDF, replacing expensive per-scene fitting with episodic training that captures transferable priors. The model couples a differentiable physics graph for projection, illumination, and sensor response with spatial gating that blends sparse Gaussian detail and global SDF structure, and incorporates semantic-geometric fusion, conditional lightweight task heads, and multi-view supervision from a frozen geometric teacher under an uncertainty-aware multi-task loss. At inference, SwiftGS operates zero-shot with optional compact calibration and achieves accurate DSM reconstruction and view-consistent rendering at significantly reduced computational cost, with ablations highlighting the benefits of the hybrid representation, physics-aware rendering, and episodic meta-training.
Summary / 总结
Rapid, large-scale 3D reconstruction from multi-date satellite imagery is vital for environmental monitoring, urban planning, and disaster response, yet remains difficult due to illumination changes, sensor heterogeneity, and the cost of per-scene optimization.
Explainability-Guided Adversarial Attacks on Transformer-Based Malware Detectors Using Control Flow Graphs
Authors: Andrew Wheeler, Kshitiz Aryal, Maanak Gupta
First: 2026-04-04T19:50:04+00:00 · Latest: 2026-04-04T19:50:04+00:00
Comments: 9 pages, 3 figures, 4 tables, 1 algorithm, 2 equations
Abstract
Transformer-based malware detection systems operating on graph modalities such as control flow graphs (CFGs) achieve strong performance by modeling structural relationships in program behavior. However, their robustness to adversarial evasion attacks remains underexplored. This paper examines the vulnerability of a RoBERTa-based malware detector that linearizes CFGs into sequences of function calls, a design choice that enables transformer modeling but may introduce token-level sensitivities and ordering artifacts exploitable by adversaries. By evaluating evasion strategies within this graph-to-sequence framework, we provide insight into the practical robustness of transformer-based malware detectors beyond aggregate detection accuracy.
This paper proposes a white-box adversarial evasion attack that leverages explainability mechanisms to identify and perturb most influential graph components. Using token- and word-level attributions derived from integrated gradients, the attack iteratively replaces positively attributed function calls with synthetic external imports, producing adversarial CFG representations without altering overall program structure. Experimental evaluation on small- and large-scale Windows Portable Executable (PE) datasets demonstrates that the proposed method can reliably induce misclassification, even against models trained to high accuracy. Our results highlight that explainability tools, while valuable for interpretability, can also expose critical attack surfaces in transformer-based malware detectors.
Summary / 总结
Transformer-based malware detection systems operating on graph modalities such as control flow graphs (CFGs) achieve strong performance by modeling structural relationships in program behavior.
Process-Informed Forecasting of Complex Thermal Dynamics in Pharmaceutical Manufacturing
Authors: Ramona Rubini, Siavash Khodakarami, Aniruddha Bora, George Em Karniadakis, Michele Dassisti
First: 2025-09-24T17:42:00+00:00 · Latest: 2026-04-04T17:53:03+00:00
Abstract
Accurate time-series forecasting for complex physical systems is the backbone of modern industrial monitoring and control, yet deep learning models often lack the physical consistency required in regulated environments. To bridge this gap, we introduce Process-Informed Forecasting (PIF) models for temperature in pharmaceutical lyophilization, embedding deterministic production recipes as macro-structural priors. We investigate classical methods (e.g., Autoregressive Integrated Moving Average (ARIMA) model) and modern deep learning architectures, including Kolmogorov-Arnold Networks (KANs). We compare three different loss function formulations that integrate a process-informed trajectory prior: a fixed-weight loss, a dynamic uncertainty-based loss, and a Residual-Based Attention (RBA) mechanism. We evaluate all models not only for accuracy and physical consistency but also for robustness to sensor noise. Furthermore, we test the practical generalizability of the best model in a transfer learning scenario on a new process. Our results show that PIF models outperform their data-driven counterparts in terms of accuracy, physical plausibility and noise resilience, offering a scalable framework for reliable and generalizable forecasting solutions in critical manufacturing.
Summary / 总结
Accurate time-series forecasting for complex physical systems is the backbone of modern industrial monitoring and control, yet deep learning models often lack the physical consistency required in regulated environments.
Reinforcement Learning from Human Feedback
Authors: Nathan Lambert
First: 2025-04-16T21:36:46+00:00 · Latest: 2026-04-04T15:50:42+00:00
Comments: 220 pages. Web-native version at https://rlhfbook.com/ Continually improving, latest version at website
Abstract
Reinforcement learning from human feedback (RLHF) has become an important technical and storytelling tool to deploy the latest machine learning systems. In this book, we hope to give a gentle introduction to the core methods for people with some level of quantitative background. The book starts with the origins of RLHF -- both in recent literature and in a convergence of disparate fields of science in economics, philosophy, and optimal control. We then set the stage with definitions, problem formulation, data collection, and other common math used in the literature. The core of the book details every optimization stage in using RLHF, from starting with instruction tuning to training a reward model and finally all of rejection sampling, reinforcement learning, and direct alignment algorithms. The book concludes with advanced topics -- understudied research questions in synthetic data and evaluation -- and open questions for the field.
Summary / 总结
Reinforcement learning from human feedback (RLHF) has become an important technical and storytelling tool to deploy the latest machine learning systems.
Predicting Intermittent Job Failure Categories for Diagnosis Using Few-Shot Fine-Tuned Language Models
Authors: Henri Aïdasso, Francis Bordeleau, Ali Tizghadam
First: 2026-01-29T19:34:34+00:00 · Latest: 2026-04-04T04:37:18+00:00
Comments: Accepted at the ACM International Conference on the Foundations of Software Engineering (FSE 2026), Industry Track
Abstract
In principle, Continuous Integration (CI) pipeline failures provide valuable feedback to developers on code-related errors. In practice, however, pipeline jobs often fail intermittently due to non-deterministic tests, network outages, infrastructure failures, resource exhaustion, and other reliability issues. These intermittent (flaky) job failures lead to substantial inefficiencies: wasted computational resources from repeated reruns and significant diagnosis time that distracts developers from core activities and often requires intervention from specialized teams. Prior work has proposed machine learning techniques to detect intermittent failures, but does not address the subsequent diagnosis challenge. To fill this gap, we introduce FlaXifyer, a few-shot learning approach for predicting intermittent job failure categories using pre-trained language models. FlaXifyer requires only job execution logs and achieves 84.3% Macro F1 and 92.0% Top-2 accuracy with just 12 labeled examples per category. We also propose LogSift, an interpretability technique that identifies influential log statements in under one second, reducing review effort by 74.4% while surfacing relevant failure information in 87% of cases. Evaluation on 2,458 job failures from TELUS demonstrates that FlaXifyer and LogSift enable effective automated triage, accelerate failure diagnosis, and pave the way towards the automated resolution of intermittent job failures.
Summary / 总结
In principle, Continuous Integration (CI) pipeline failures provide valuable feedback to developers on code-related errors.
Structural Rigidity and the 57-Token Predictive Window: A Physical Framework for Inference-Layer Governability in Large Language Models
Authors: Gregory M. Ruddell
First: 2026-04-04T00:08:17+00:00 · Latest: 2026-04-04T00:08:17+00:00
Comments: Extends arXiv:2603.21415. 30 pages. Also available on Zenodo (10.5281/zenodo.19393882)
Abstract
Current AI safety relies on behavioral monitoring and post-training alignment, yet empirical measurement shows these approaches produce no detectable pre-commitment signal in a majority of instruction-tuned models tested. We present an energy-based governance framework connecting transformer inference dynamics to constraint-satisfaction models of neural computation, and apply it to a seven-model cohort across five geometric regimes.
Using trajectory tension (rho = ||a|| / ||v||), we identify a 57-token pre-commitment window in Phi-3-mini-4k-instruct under greedy decoding on arithmetic constraint probes. This result is model-specific, task-specific, and configuration-specific, demonstrating that pre-commitment signals can exist but are not universal.
We introduce a five-regime taxonomy of inference behavior: Authority Band, Late Signal, Inverted, Flat, and Scaffold-Selective. Energy asymmetry (Σ\r{ho}_misaligned / Σ\r{ho}_aligned) serves as a unifying metric of structural rigidity across these regimes.
Across seven models, only one configuration exhibits a predictive signal prior to commitment; all others show silent failure, late detection, inverted dynamics, or flat geometry.
We further demonstrate that factual hallucination produces no predictive signal across 72 test conditions, consistent with spurious attractor settling in the absence of a trained world-model constraint.
These results establish that rule violation and hallucination are distinct failure modes with different detection requirements. Internal geometry monitoring is effective only where resistance exists; detection of factual confabulation requires external verification mechanisms.
This work provides a measurable framework for inference-layer governability and introduces a taxonomy for evaluating deployment risk in autonomous AI systems.
Summary / 总结
Current AI safety relies on behavioral monitoring and post-training alignment, yet empirical measurement shows these approaches produce no detectable pre-commitment signal in a majority of instruction-tuned models tested.
Robust Adaptation of Foundation Models with Black-Box Visual Prompting
Authors: Changdae Oh, Gyeongdeok Seo, Geunyoung Jung, Zhi-Qi Cheng, Hosik Choi, Jiyoung Jung, Kyungwoo Song
First: 2024-07-04T02:35:00+00:00 · Latest: 2026-04-03T23:10:26+00:00
Comments: Accepted to IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) 2026
Abstract
With a surge of large-scale pre-trained models, parameter-efficient transfer learning (PETL) of large models has garnered significant attention. While promising, they commonly rely on two optimistic assumptions: 1) full access to the parameters of a PTM, and 2) sufficient memory capacity to cache all intermediate activations for gradient computation. However, in most real-world applications, PTMs serve as black-box APIs or proprietary software without full parameter accessibility. Besides, it is hard to meet a large memory requirement for modern PTMs. This work proposes black-box visual prompting (BlackVIP), which efficiently adapts the PTMs without knowledge of their architectures or parameters. BlackVIP has two components: 1) Coordinator and 2) simultaneous perturbation stochastic approximation with gradient correction (SPSA-GC). The Coordinator designs input-dependent visual prompts, which allow the target PTM to adapt in the wild. SPSA-GC efficiently estimates the gradient of PTM to update Coordinator. Besides, we introduce a variant, BlackVIP-SE, which significantly reduces the runtime and computational cost of BlackVIP. Extensive experiments on 19 datasets demonstrate that BlackVIPs enable robust adaptation to diverse domains and tasks with minimal memory requirements. We further provide a theoretical analysis on the generalization of visual prompting methods by presenting their connection to the certified robustness of randomized smoothing, and presenting an empirical support for improved robustness.
Summary / 总结
With a surge of large-scale pre-trained models, parameter-efficient transfer learning (PETL) of large models has garnered significant attention.
Neural Operators for Multi-Task Control and Adaptation
Authors: David Sewell, Xingjian Li, Stepan Tretiakov, Krishna Kumar, David Fridovich-Keil
First: 2026-04-03T20:45:32+00:00 · Latest: 2026-04-03T20:45:32+00:00
Comments: 25 pages, 10 figures, 2 tables
Abstract
Neural operator methods have emerged as powerful tools for learning mappings between infinite-dimensional function spaces, yet their potential in optimal control remains largely unexplored. We focus on multi-task control problems, whose solution is a mapping from task description (e.g., cost or dynamics functions) to optimal control law (e.g., feedback policy). We approximate these solution operators using a permutation-invariant neural operator architecture. Across a range of parametric optimal control environments and a locomotion benchmark, a single operator trained via behavioral cloning accurately approximates the solution operator and generalizes to unseen tasks, out-of-distribution settings, and varying amounts of task observations. We further show that the branch-trunk structure of our neural operator architecture enables efficient and flexible adaptation to new tasks. We develop structured adaptation strategies ranging from lightweight updates to full-network fine-tuning, achieving strong performance across different data and compute settings. Finally, we introduce meta-trained operator variants that optimize the initialization for few-shot adaptation. These methods enable rapid task adaptation with limited data and consistently outperform a popular meta-learning baseline. Together, our results demonstrate that neural operators provide a unified and efficient framework for multi-task control and adaptation.
Summary / 总结
Neural operator methods have emerged as powerful tools for learning mappings between infinite-dimensional function spaces, yet their potential in optimal control remains largely unexplored.
Cite Pretrain: Retrieval-Free Knowledge Attribution for Large Language Models
Authors: Yukun Huang, Sanxing Chen, Jian Pei, Manzil Zaheer, Bhuwan Dhingra
First: 2025-06-21T04:48:05+00:00 · Latest: 2026-04-03T18:53:25+00:00
Abstract
Trustworthy language models should provide both correct and verifiable answers. However, citations generated directly by standalone LLMs are often unreliable. As a result, current systems insert citations by querying an external retriever at inference time, introducing latency, infrastructure dependence, and vulnerability to retrieval noise. We explore whether LLMs can be made to reliably attribute to the documents seen during continual pretraining without test-time retrieval, by revising the training process. To study this, we construct CitePretrainBench, a benchmark that mixes real-world corpora (Wikipedia, Common Crawl, arXiv) with novel documents and probes both short-form (single-fact) and long-form (multi-fact) citation tasks. Our approach follows a two-stage process: (1) continual pretraining to index factual knowledge by binding it to persistent document identifiers; and (2) instruction tuning to elicit citation behavior. We introduce Active Indexing for the first stage, which creates generalizable, source-anchored bindings by augmenting training with synthetic data that (i) restate each fact in diverse, compositional forms and (ii) enforce bidirectional training (source-to-fact and fact-to-source). This equips the model to both generate content from a cited source and attribute its own answers, improving robustness to paraphrase and composition. Experiments with Qwen-2.5-7B&3B show that Active Indexing consistently outperforms a Passive Indexing baseline, which simply appends an identifier to each document, achieving citation precision gains of up to 30.2% across all tasks and models. Our ablation studies reveal that performance continues to improve as we scale the amount of augmented data, showing a clear upward trend even at 16x the original token count. Finally, we show that internal citations complement external ones by making the model more robust to retrieval noise.
Summary / 总结
Trustworthy language models should provide both correct and verifiable answers.
Scalable Variational Bayesian Fine-Tuning of LLMs via Orthogonalized Low-Rank Adapters
Authors: Haotian Xiang, Bingcong Li, Qin Lu
First: 2026-04-03T18:42:04+00:00 · Latest: 2026-04-03T18:42:04+00:00
Abstract
When deploying large language models (LLMs) to safety-critical applications, uncertainty quantification (UQ) is of utmost importance to self-assess the reliability of the LLM-based decisions. However, such decisions typically suffer from overconfidence, particularly after parameter-efficient fine-tuning (PEFT) for downstream domain-specific tasks with limited data. Existing methods to alleviate this issue either rely on Laplace approximation based post-hoc framework, which may yield suboptimal calibration depending on the training trajectory, or variational Bayesian training that requires multiple complete forward passes through the entire LLM backbone at inference time for Monte Carlo estimation, posing scalability challenges for deployment. To address these limitations, we build on the Bayesian last layer (BLL) model, where the LLM-based deterministic feature extractor is followed by random last layer parameters for uncertainty reasoning. Since existing low-rank adapters (LoRA) for PEFT have limited expressiveness due to rank collapse, we address this with Polar-decomposed Low-rank Adapter Representation (PoLAR), an orthogonalized parameterization paired with Riemannian optimization to enable more stable and expressive adaptation. Building on this PoLAR-BLL model, we leverage the variational (V) inference framework to put forth a scalable Bayesian fine-tuning approach which jointly seeks the PoLAR parameters and approximate posterior of the last layer parameters via alternating optimization. The resulting PoLAR-VBLL is a flexible framework that nicely integrates architecture-enhanced optimization with scalable Bayesian inference to endow LLMs with well-calibrated UQ. Our empirical results verify the effectiveness of PoLAR-VBLL in terms of generalization and uncertainty estimation on both in-distribution and out-of-distribution data for various common-sense reasoning tasks.
Summary / 总结
When deploying large language models (LLMs) to safety-critical applications, uncertainty quantification (UQ) is of utmost importance to self-assess the reliability of the LLM-based decisions.
VERT: Reliable LLM Judges for Radiology Report Evaluation
Authors: Federica Bologna, Jean-Philippe Corbeil, Matthew Wilkens, Asma Ben Abacha
First: 2026-04-03T18:10:21+00:00 · Latest: 2026-04-03T18:10:21+00:00
Abstract
Current literature on radiology report evaluation has focused primarily on designing LLM-based metrics and fine-tuning small models for chest X-rays. However, it remains unclear whether these approaches are robust when applied to reports from other modalities and anatomies. Which model and prompt configurations are best suited to serve as LLM judges for radiology evaluation? We conduct a thorough correlation analysis between expert and LLM-based ratings. We compare three existing LLM-as-a-judge metrics (RadFact, GREEN, and FineRadScore) alongside VERT, our proposed LLM-based metric, using open- and closed-source models (reasoning and non-reasoning) of different sizes across two expert-annotated datasets, RadEval and RaTE-Eval, spanning multiple modalities and anatomies. We further evaluate few-shot approaches, ensembling, and parameter-efficient fine-tuning using RaTE-Eval. To better understand metric behavior, we perform a systematic error detection and categorization study to assess alignment of these metrics against expert judgments and identify areas of lower and higher agreement. Our results show that VERT improves correlation with radiologist judgments by up to 11.7% relative to GREEN. Furthermore, fine-tuning Qwen3 30B yield gains of up to 25% using only 1,300 training samples. The fine-tuned model also reduces inference time up to 37.2 times. These findings highlight the effectiveness of LLM-based judges and demonstrate that reliable evaluation can be achieved with lightweight adaptation.
Summary / 总结
Current literature on radiology report evaluation has focused primarily on designing LLM-based metrics and fine-tuning small models for chest X-rays.
Learning the Signature of Memorization in Autoregressive Language Models
Authors: David Ilić, Kostadin Cvejoski, David Stanojević, Evgeny Grigorenko
First: 2026-04-03T17:17:51+00:00 · Latest: 2026-04-03T17:17:51+00:00
Comments: Preprint. 10 pages, 4 figures, 12 tables
Abstract
All prior membership inference attacks for fine-tuned language models use hand-crafted heuristics (e.g., loss thresholding, Min-K\%, reference calibration), each bounded by the designer's intuition. We introduce the first transferable learned attack, enabled by the observation that fine-tuning any model on any corpus yields unlimited labeled data, since membership is known by construction. This removes the shadow model bottleneck and brings membership inference into the deep learning era: learning what matters rather than designing it, with generalization through training diversity and scale. We discover that fine-tuning language models produces an invariant signature of memorization detectable across architectural families and data domains. We train a membership inference classifier exclusively on transformer-based models. It transfers zero-shot to Mamba (state-space), RWKV-4 (linear attention), and RecurrentGemma (gated recurrence), achieving 0.963, 0.972, and 0.936 AUC respectively. Each evaluation combines an architecture and dataset never seen during training, yet all three exceed performance on held-out transformers (0.908 AUC). These four families share no computational mechanisms, their only commonality is gradient descent on cross-entropy loss. Even simple likelihood-based methods exhibit strong transfer, confirming the signature exists independently of the detection method. Our method, Learned Transfer MIA (LT-MIA), captures this signal most effectively by reframing membership inference as sequence classification over per-token distributional statistics. On transformers, LT-MIA achieves 2.8$\times$ higher TPR at 0.1\% FPR than the strongest baseline. The method also transfers to code (0.865 AUC) despite training only on natural language texts. Code and trained classifier available at https://github.com/JetBrains-Research/learned-mia.
Summary / 总结
All prior membership inference attacks for fine-tuned language models use hand-crafted heuristics (e.g., loss thresholding, Min-K\%, reference calibration), each bounded by the designer's intuition.
Chart-RL: Policy Optimization Reinforcement Learning for Enhanced Visual Reasoning in Chart Question Answering with Vision Language Models
Authors: Yunfei Bai, Amit Dhanda, Shekhar Jain
Venue: KDD 2026
First: 2026-04-03T16:28:03+00:00 · Latest: 2026-04-03T16:28:03+00:00
Comments: In Proceedings of the 32nd ACM-SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2026)
Abstract
The recent advancements in Vision Language Models (VLMs) have demonstrated progress toward true intelligence requiring robust reasoning capabilities. Beyond pattern recognition, linguistic reasoning must integrate with visual comprehension, particularly for Chart Question Answering (CQA) tasks involving complex data visualizations. Current VLMs face significant limitations in CQA, including imprecise numerical extraction, difficulty interpreting implicit visual relationships, and inadequate attention mechanisms for capturing spatial relationships in charts. In this work, we address these challenges by presenting Chart-RL, a novel reinforcement learning framework that enhances VLMs chart understanding through feedback-driven policy optimization of visual perception and logical inference. Our key innovation includes a comprehensive framework integrating Reinforcement Learning (RL) from Policy Optimization techniques along with adaptive reward functions, that demonstrates superior performance compared to baseline foundation models and competitive results against larger state-of-the-art architectures. We also integrated Parameter-Efficient Fine-Tuning through Low-Rank Adaptation (LoRA) in the RL framework that only requires single GPU configurations while preserving performance integrity. We conducted extensive benchmarking across open-source, proprietary, and state-of-the-art closed-source models utilizing the ChartQAPro dataset. The RL fine-tuned Qwen3-VL-4B-Instruct model achieved an answer accuracy of 0.634, surpassing the 0.580 accuracy of the Qwen3-VL-8B-Instruct foundation model despite utilizing half the parameter count, while simultaneously reducing inference latency from 31 seconds to 9 seconds.
Summary / 总结
The recent advancements in Vision Language Models (VLMs) have demonstrated progress toward true intelligence requiring robust reasoning capabilities.
Can VLMs Truly Forget? Benchmarking Training-Free Visual Concept Unlearning
Authors: Zhangyun Tan, Zeliang Zhang, Susan Liang, Yolo Yunlong Tang, Lisha Chen, Chenliang Xu
First: 2026-04-03T15:36:00+00:00 · Latest: 2026-04-03T15:36:00+00:00
Abstract
VLMs trained on web-scale data retain sensitive and copyrighted visual concepts that deployment may require removing. Training-based unlearning methods share a structural flaw: fine-tuning on a narrow forget set degrades general capabilities before unlearning begins, making it impossible to attribute subsequent performance drops to the unlearning procedure itself. Training-free approaches sidestep this by suppressing concepts through prompts or system instructions, but no rigorous benchmark exists for evaluating them on visual tasks.
We introduce VLM-UnBench, the first benchmark for training-free visual concept unlearning in VLMs. It covers four forgetting levels, 7 source datasets, and 11 concept axes, and pairs a three-level probe taxonomy with five evaluation conditions to separate genuine forgetting from instruction compliance. Across 8 evaluation settings and 13 VLM configurations, realistic unlearning prompts leave forget accuracy near the no-instruction baseline; meaningful reductions appear only under oracle conditions that disclose the target concept to the model. Object and scene concepts are the most resistant to suppression, and stronger instruction-tuned models remain capable despite explicit forget instructions. These results expose a clear gap between prompt-level suppression and true visual concept erasure.
Summary / 总结
VLMs trained on web-scale data retain sensitive and copyrighted visual concepts that deployment may require removing.
CeRA: Overcoming the Linear Ceiling of Low-Rank Adaptation via Capacity Expansion
Authors: Hung-Hsuan Chen
First: 2026-02-26T11:55:25+00:00 · Latest: 2026-04-03T08:37:26+00:00
Abstract
Low-Rank Adaptation (LoRA) dominates parameter-efficient fine-tuning (PEFT). However, it faces a ``linear ceiling'': increasing the rank yields diminishing returns in expressive capacity due to intrinsic linear constraints. We introduce CeRA (Capacity-enhanced Rank Adaptation), a weight-level parallel adapter that injects SiLU gating and dropout to induce non-linear capacity expansion. We demonstrate a fundamental relationship between adapter expressivity and task complexity. In basic arithmetic (GSM8K), CeRA matches standard linear baselines, but on the complex MATH dataset, it demonstrates high parameter efficiency in downstream reasoning (Exact Match). CeRA at rank 64 (pass@1 16.36\%) outperforms both a high-rank LoRA at rank 512 (15.72\%) and the state-of-the-art linear variant, DoRA, at rank 64 (14.44\%), achieving higher exact-match accuracy with only 1/8 of the parameter budget. Empirical spectral analysis shows that CeRA activates the lower-variance tail of the singular value spectrum, preventing the rank collapse observed in linear methods and providing the representation capacity required for complex logical reasoning.
Summary / 总结
Low-Rank Adaptation (LoRA) dominates parameter-efficient fine-tuning (PEFT).
Transfer Learning for Loan Recovery Prediction under Distribution Shifts with Heterogeneous Feature Spaces
Authors: Christopher Gerling
First: 2026-04-03T07:54:49+00:00 · Latest: 2026-04-03T07:54:49+00:00
Comments: arXiv admin comment: This version has been removed by arXiv administrators as the submitter did not have the rights to agree to the license at the time of submission
Abstract
Accurate forecasting of recovery rates (RR) is central to credit risk management and regulatory capital determination. In many loan portfolios, however, RR modeling is constrained by data scarcity arising from infrequent default events. Transfer learning (TL) offers a promising avenue to mitigate this challenge by exploiting information from related but richer source domains, yet its effectiveness critically depends on the presence and strength of distributional shifts, and on potential heterogeneity between source and target feature spaces.
This paper introduces FT-MDN-Transformer, a mixture-density tabular Transformer architecture specifically designed for TL in RR forecasting across heterogeneous feature sets. The model produces both loan-level point estimates and portfolio-level predictive distributions, thereby supporting a wide range of practical RR forecasting applications. We evaluate the proposed approach in a controlled Monte Carlo simulation that facilitates systematic variation of covariate, conditional, and label shifts, as well as in a real-world transfer setting using the Global Credit Data (GCD) loan dataset as source and a novel bonds dataset as target.
Our results show that FT-MDN-Transformer outperforms baseline models when target-domain data are limited, with particularly pronounced gains under covariate and conditional shifts, while label shift remains challenging. We also observe its probabilistic forecasts to closely track empirical recovery distributions, providing richer information than conventional point-prediction metrics alone. Overall, the findings highlight the potential of distribution-aware TL architectures to improve RR forecasting in data-scarce credit portfolios and offer practical insights for risk managers operating under heterogeneous data environments.
Summary / 总结
Accurate forecasting of recovery rates (RR) is central to credit risk management and regulatory capital determination.
Transfer learning for nonparametric Bayesian networks
Authors: Rafael Sojo, Pedro Larrañaga, Concha Bielza
First: 2026-04-01T15:25:46+00:00 · Latest: 2026-04-03T07:37:36+00:00
Comments: An earlier version was previously posted on SSRN. This version includes improvements in experiments and evaluation metrics following reviewer comments. Revision submitted to Knowledge-Based Systems
Abstract
This paper introduces two transfer learning methodologies for estimating nonparametric Bayesian networks under scarce data. We propose two algorithms, a constraint-based structure learning method, called PC-stable-transfer learning (PCS-TL), and a score-based method, called hill climbing transfer learning (HC-TL). We also define particular metrics to tackle the negative transfer problem in each of them, a situation in which transfer learning has a negative impact on the model's performance. Then, for the parameters, we propose a log-linear pooling approach. For the evaluation, we learn kernel density estimation Bayesian networks, a type of nonparametric Bayesian network, and compare their transfer learning performance with the models alone. To do so, we sample data from small, medium and large-sized synthetic networks and datasets from the UCI Machine Learning repository. Then, we add noise and modifications to these datasets to test their ability to avoid negative transfer. To conclude, we perform a Friedman test with a Bergmann-Hommel post-hoc analysis to show statistical proof of the enhanced experimental behavior of our methods. Thus, PCS-TL and HC-TL demonstrate to be reliable algorithms for improving the learning performance of a nonparametric Bayesian network with scarce data, which in real industrial environments implies a reduction in the required time to deploy the network.
Summary / 总结
This paper introduces two transfer learning methodologies for estimating nonparametric Bayesian networks under scarce data.
f-INE: A Hypothesis Testing Framework for Estimating Influence under Training Randomness
Authors: Subhodip Panda, Dhruv Tarsadiya, Shashwat Sourav, Prathosh A. P, Sai Praneeth Karimireddy
Venue: ICLR
First: 2025-10-12T09:05:47+00:00 · Latest: 2026-04-03T07:05:28+00:00
Abstract
Influence estimation methods promise to explain and debug machine learning by estimating the impact of individual samples on the final model. Yet, existing methods collapse under training randomness: the same example may appear critical in one run and irrelevant in the next. Such instability undermines their use in data curation or cleanup since it is unclear if we indeed deleted/kept the correct datapoints. To overcome this, we introduce *f-influence* -- a new influence estimation framework grounded in hypothesis testing that explicitly accounts for training randomness, and establish desirable properties that make it suitable for reliable influence estimation. We also design a highly efficient algorithm **f**-**IN**fluence **E**stimation (**f-INE**) that computes f-influence **in a single training run**. Finally, we scale up f-INE to estimate influence of instruction tuning data on Llama-3.1-8B and show it can reliably detect poisoned samples that steer model opinions, demonstrating its utility for data cleanup and attributing model behavior.
Summary / 总结
Influence estimation methods promise to explain and debug machine learning by estimating the impact of individual samples on the final model.
Multiple-Debias: A Full-process Debiasing Method for Multilingual Pre-trained Language Models
Authors: Haoyu Liang, Peijian Zeng, Wentao Huang, Aimin Yang, Dong Zhou
First: 2026-04-03T06:30:06+00:00 · Latest: 2026-04-03T06:30:06+00:00
Abstract
Multilingual Pre-trained Language Models (MPLMs) have become essential tools for natural language processing. However, they often exhibit biases related to sensitive attributes such as gender, race, and religion. In this paper, we introduce a comprehensive multilingual debiasing method named Multiple-Debias to address these issues across multiple languages. By incorporating multilingual counterfactual data augmentation and multilingual Self-Debias across both pre-processing and post-processing stages, alongside parameter-efficient fine-tuning, we significantly reduced biases in MPLMs across three sensitive attributes in four languages. We also extended CrowS-Pairs to German, Spanish, Chinese, and Japanese, validating our full-process multilingual debiasing method for gender, racial, and religious bias. Our experiments show that (i) multilingual debiasing methods surpass monolingual approaches in effectively mitigating biases, and (ii) integrating debiasing information from different languages notably improves the fairness of MPLMs.
Summary / 总结
Multilingual Pre-trained Language Models (MPLMs) have become essential tools for natural language processing.