Where Not to Learn: Prior-Aligned Training with Subset-based Attribution Constraints for Reliable Decision-Making

Ruoyu Chen, Shangquan Sun, Xiaoqing Guo, Sanyi Zhang, Kangwei Liu, Shiming Liu, Zhangcheng Wang, Qunli Zhang, Hua Zhang, Xiaochun Cao

🏛 Institutions: Institute of Information Engineering, CAS, University of Chinese Academy of Sciences, NTU, Hong Kong Baptist University, Communication University of China, Huawei, USTC, Sun Yat-sen University
📅 Date: January 30, 2026
📑 Publisher: arXiv
💻 Env: General GUI
🔑 Keywords: training attribution, subset-based attribution constraints, human prior alignment, reliability, explainability

TLDR

This paper proposes prior-aligned training with subset-based attribution constraints, penalizing models for relying on evidence that conflicts with human priors. It improves both accuracy and decision reasonability on image classification and GUI-agent click decision tasks.
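The idea of penalizing reliance on prior-conflicting evidence can be illustrated with a minimal sketch. This is not the paper's method: the function names `prior_conflict_penalty` and `prior_aligned_loss`, the `forbidden_mask` input, and the `weight` parameter are all hypothetical, and the per-region attribution scores are assumed to come from some attribution method (the paper uses subset-based attribution, which is not reproduced here).

```python
def prior_conflict_penalty(attribution, forbidden_mask):
    """Share of positive attribution mass that lands on regions a
    human prior marks as irrelevant (hypothetical formulation)."""
    positive = [max(a, 0.0) for a in attribution]
    total = sum(positive)
    if total == 0.0:
        # No positive evidence anywhere, so nothing conflicts with the prior.
        return 0.0
    conflicted = sum(p for p, m in zip(positive, forbidden_mask) if m)
    return conflicted / total


def prior_aligned_loss(task_loss, attribution, forbidden_mask, weight=1.0):
    """Task loss plus a weighted penalty for prior-conflicting evidence.
    `weight` is an assumed trade-off hyperparameter."""
    return task_loss + weight * prior_conflict_penalty(attribution, forbidden_mask)


# Four regions; the prior says the model should not rely on region 1.
attribution = [0.5, 0.25, 0.25, 0.0]
forbidden_mask = [False, True, False, False]
penalty = prior_conflict_penalty(attribution, forbidden_mask)
loss = prior_aligned_loss(1.0, attribution, forbidden_mask, weight=2.0)
```

In this toy case a quarter of the attribution mass falls on the forbidden region, so the penalty raises the loss above the plain task loss. In actual training the attribution would need to be differentiable with respect to model parameters, which this sketch does not address.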
