AI4Science Paper Digest

Snapshot: 20260311_0333

Structural Causal Bottleneck Models

Authors: Simon Bing, Jonas Wahl, Jakob Runge

First: 2026-03-09T17:50:10+00:00 · Latest: 2026-03-09T17:50:10+00:00

Abstract

We introduce structural causal bottleneck models (SCBMs), a novel class of structural causal models. At the core of SCBMs lies the assumption that causal effects between high-dimensional variables only depend on low-dimensional summary statistics, or bottlenecks, of the causes. SCBMs provide a flexible framework for task-specific dimension reduction while being estimable via standard, simple learning algorithms in practice. We analyse identifiability in SCBMs, connect them to information bottlenecks in the sense of Tishby & Zaslavsky (2015), and illustrate how to estimate them experimentally. We also demonstrate the benefit of bottlenecks for effect estimation in low-sample transfer learning settings. We argue that SCBMs provide an alternative to existing causal dimension reduction frameworks like causal representation learning or causal abstraction learning.
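The bottleneck assumption in the abstract can be illustrated with a toy mechanism. This is only a sketch: the function names, the choice of the mean as the summary statistic, and the factor 2.0 are hypothetical and are not taken from the paper. It shows one property, namely that two different high-dimensional causes with the same bottleneck value produce the same effect.

```python
# Toy illustration of a bottleneck mechanism.
# All names, the mean as summary statistic, and the coefficient 2.0 are
# illustrative assumptions, not the authors' model.

def bottleneck(x):
    """Low-dimensional summary of a high-dimensional cause (here: its mean)."""
    return sum(x) / len(x)


def effect_mechanism(x, noise=0.0):
    """Structural assignment for the effect; it sees x only through bottleneck(x)."""
    return 2.0 * bottleneck(x) + noise


x_a = [1.0, 2.0, 3.0, 4.0]
x_b = [4.0, 3.0, 2.0, 1.0]  # different vector, same mean as x_a
x_c = [2.0, 3.0, 4.0, 5.0]  # different mean

y_a = effect_mechanism(x_a)
y_b = effect_mechanism(x_b)
y_c = effect_mechanism(x_c)

print(y_a == y_b)  # the effect cannot distinguish x_a from x_b
print(y_a == y_c)  # a change in the bottleneck value changes the effect
```

In this toy setting, an intervention on the cause matters for the effect only insofar as it moves the one-dimensional summary, which is the sense in which the bottleneck acts as a task-specific dimension reduction.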

Summary

We introduce structural causal bottleneck models (SCBMs), a novel class of structural causal models.

Do Schwartz Higher-Order Values Help Sentence-Level Human Value Detection? A Study of Hierarchical Gating and Calibration

Authors: Víctor Yeste, Paolo Rosso

First: 2026-01-31T21:50:35+00:00 · Latest: 2026-03-09T17:41:24+00:00

Comments: Code: https://github.com/VictorMYeste/human-value-detection, models: https://huggingface.co/papers/2602.00913, 27 pages, 4 figures

Abstract

Human value detection from single sentences is a sparse, imbalanced multi-label task. We study whether Schwartz higher-order (HO) categories help this setting on ValueEval'24 / ValuesML (74K English sentences) under a compute-frugal budget. Rather than proposing a new architecture, we compare direct supervised transformers, hard HO$\rightarrow$values pipelines, Presence$\rightarrow$HO$\rightarrow$values cascades, compact instruction-tuned large language models (LLMs), QLoRA, and low-cost upgrades such as threshold tuning and small ensembles. HO categories are learnable: the easiest bipolar pair, Growth vs. Self-Protection, reaches Macro-$F_1=0.58$. The most reliable gains come from calibration and ensembling: threshold tuning improves Social Focus vs. Personal Focus from $0.41$ to $0.57$ ($+0.16$), transformer soft voting lifts Growth from $0.286$ to $0.303$, and a Transformer+LLM hybrid reaches $0.353$ on Self-Protection. In contrast, hard hierarchical gating does not consistently improve the end task. Compact LLMs also underperform supervised encoders as stand-alone systems, although they sometimes add useful diversity in hybrid ensembles. Under this benchmark, the HO structure is more useful as an inductive bias than as a rigid routing rule.
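The two low-cost upgrades named in the abstract, threshold tuning and soft voting, can be sketched in a few lines. This is a minimal illustration under stated assumptions: the helper names, the threshold grid, and the toy labels and probabilities are hypothetical, and the authors' actual procedure may differ (see their repository for the real implementation).

```python
# Minimal sketch of per-label threshold tuning and soft voting.
# Helper names, the grid, and the toy data are illustrative assumptions.

def f1_score(y_true, y_pred):
    """Binary F1 for one label; returns 0.0 when there are no positives at all."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def tune_threshold(y_true, probs, grid=None):
    """Pick the decision threshold that maximises F1 on held-out data."""
    if grid is None:
        grid = [i / 100 for i in range(5, 96, 5)]
    best_t = 0.5
    best_f1 = f1_score(y_true, [int(p >= 0.5) for p in probs])
    for t in grid:
        f1 = f1_score(y_true, [int(p >= t) for p in probs])
        if f1 > best_f1:  # strict: keep the first threshold reaching the best F1
            best_t, best_f1 = t, f1
    return best_t, best_f1


def soft_vote(prob_lists):
    """Average the predicted probabilities of several models, example by example."""
    n = len(prob_lists)
    return [sum(ps) / n for ps in zip(*prob_lists)]


# Toy validation data for a single sparse label (hypothetical values).
y_val = [1, 0, 1, 0, 0, 1]
p_val = [0.40, 0.10, 0.35, 0.20, 0.45, 0.30]

baseline_f1 = f1_score(y_val, [int(p >= 0.5) for p in p_val])
best_t, best_f1 = tune_threshold(y_val, p_val)
print(baseline_f1, best_t, best_f1)
```

With sparse labels, a classifier's probabilities often sit well below 0.5 even for true positives, so the default cutoff predicts almost nothing. Tuning the threshold on validation data addresses that without retraining, and `soft_vote` can be applied to the probabilities of several models before thresholding.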

Summary

Human value detection from single sentences is a sparse, imbalanced multi-label task.

Exploring Embedding Priors in Prompt-Tuning for Improved Interpretability and Control

Authors: Sergey Sedov, Sumanth Bharadwaj Hachalli Karanam, Venu Gopal Kadamba

First: 2024-12-24T18:18:52+00:00 · Latest: 2026-03-09T17:23:47+00:00