In our #NeurIPS2024 paper "I Don't Know: Explicit Modeling of Uncertainty with an [IDK] Token", we equip LLMs with a special "I Don't Know" ([IDK]) token that the model learns to predict instead of "hallucinating" wrong completions. We fine-tune LLMs with a novel IDK objective that shifts probability mass towards the new [IDK] token for uncertain predictions. By design, the IDK objective requires no annotations or handcrafted examples! All details in our paper: https://lnkd.in/dJX-FX7a (joint work with Roi Cohen, Eden Biran and Gerard de Melo).
LLMs "hallucinate": making up information instead of revealing their lack of knowledge 🤔 Our #NeurIPS2024 paper adds an 𝙸–𝙳𝚘𝚗'𝚝–𝙺𝚗𝚘𝚠 🤷🏾♂️ token to the LLM and fine-tunes it to predict this instead of hallucinating. No extra annotation is needed for training! The regular cross-entropy loss shifts all wrong probability mass to the gold token, while our IDK objective divides the shifted probability mass between the gold and 𝙸–𝙳𝚘𝚗'𝚝–𝙺𝚗𝚘𝚠 🤷🏾♂️ token for wrong predictions. Joint work w/ Roi Cohen, Konstantin Dobler, Eden Biran