TY - GEN
T1 - LLM Explainability via Attributive Masking Learning
AU - Barkan, Oren
AU - Toib, Yonatan
AU - Elisha, Yehonatan
AU - Weill, Jonathan
AU - Koenigstein, Noam
N1 - Publisher Copyright:
© 2024 Association for Computational Linguistics.
PY - 2024
Y1 - 2024
N2 - In this paper, we introduce Attributive Masking Learning (AML), a method designed for explaining language model predictions by learning input masks. AML trains an attribution model to identify influential tokens in the input for a given language model's prediction. The central concept of AML is to train an auxiliary attribution model to simultaneously 1) mask as much input data as possible while ensuring that the language model's prediction closely aligns with its prediction on the original input, and 2) ensure a significant change in the model's prediction when applying the inverse (complement) of the same mask to the input. This dual-masking approach further enables the optimization of the explanation w.r.t. the metric of interest. We demonstrate the effectiveness of AML on both encoder-based and decoder-based language models, showcasing its superiority over a variety of state-of-the-art explanation methods on multiple benchmarks. Our code is available at: https://github.com/amlconf/aml.
AB - In this paper, we introduce Attributive Masking Learning (AML), a method designed for explaining language model predictions by learning input masks. AML trains an attribution model to identify influential tokens in the input for a given language model's prediction. The central concept of AML is to train an auxiliary attribution model to simultaneously 1) mask as much input data as possible while ensuring that the language model's prediction closely aligns with its prediction on the original input, and 2) ensure a significant change in the model's prediction when applying the inverse (complement) of the same mask to the input. This dual-masking approach further enables the optimization of the explanation w.r.t. the metric of interest. We demonstrate the effectiveness of AML on both encoder-based and decoder-based language models, showcasing its superiority over a variety of state-of-the-art explanation methods on multiple benchmarks. Our code is available at: https://github.com/amlconf/aml.
UR - http://www.scopus.com/inward/record.url?scp=85214724964&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85214724964
T3 - EMNLP 2024 - 2024 Conference on Empirical Methods in Natural Language Processing, Findings of EMNLP 2024
SP - 9522
EP - 9537
BT - EMNLP 2024 - 2024 Conference on Empirical Methods in Natural Language Processing, Findings of EMNLP 2024
A2 - Al-Onaizan, Yaser
A2 - Bansal, Mohit
A2 - Chen, Yun-Nung
PB - Association for Computational Linguistics (ACL)
T2 - 2024 Conference on Empirical Methods in Natural Language Processing, EMNLP 2024
Y2 - 12 November 2024 through 16 November 2024
ER -