Konstantin Dobler’s Post

Konstantin Dobler

ML Intern @ Apple | ELLIS PhD Student at Hasso Plattner Institute

In our #NeurIPS2024 paper "I Don't Know: Explicit Modeling of Uncertainty with an [IDK] Token", we equip LLMs with a special "I Don't Know" ([IDK]) token that the model learns to predict instead of "hallucinating" wrong completions. We fine-tune LLMs with a novel IDK objective that shifts probability mass towards the new [IDK] token for uncertain predictions. By design, the IDK objective does not require any annotations or handcrafted examples! All details in our paper: https://lnkd.in/dJX-FX7a (joint work with Roi Cohen, Eden Biran, and Gerard de Melo).
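As a minimal sketch of the setup step (not the paper's code), here is how one could register an [IDK] token on an existing causal LM with the Hugging Face transformers API; "gpt2" is just a stand-in for whichever base model is fine-tuned:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Stand-in base model; any causal LM works the same way.
name = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name)

# Register the new [IDK] token and grow the embedding matrix to match
# the enlarged vocabulary.
tokenizer.add_special_tokens({"additional_special_tokens": ["[IDK]"]})
model.resize_token_embeddings(len(tokenizer))
idk_token_id = tokenizer.convert_tokens_to_ids("[IDK]")
```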

Gerard de Melo

Professor for Artificial Intelligence at HPI, University of Potsdam

LLMs "hallucinate": making up information instead of revealing their lack of knowledge 🤔 Our #NeurIPS2024 paper adds an 𝙸–𝙳𝚘𝚗'𝚝–𝙺𝚗𝚘𝚠 🤷🏾♂️ token to the LLM and fine-tunes it to predict this instead of hallucinating. No extra annotation is needed for training! The regular cross-entropy loss shifts all wrong probability mass to the gold token, while our IDK objective divides the shifted probability mass between the gold and 𝙸–𝙳𝚘𝚗'𝚝–𝙺𝚗𝚘𝚠 🤷🏾♂️ token for wrong predictions. Joint work w/ Roi Cohen, Konstantin Dobler, Eden Biran

[Figure] Illustration of shifting probability mass not just to the gold token but also to the I-Don't-Know token for wrong predictions
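To make the objective concrete, here is one way such a loss could look in PyTorch. This is a sketch under an assumed weighting schedule, 1 - p(gold), chosen purely for illustration; the paper defines the actual split of the shifted probability mass:

```python
import torch
import torch.nn.functional as F

def idk_cross_entropy(logits, labels, idk_token_id, ignore_index=-100):
    """Cross-entropy against a soft target that splits probability mass
    between the gold token and the [IDK] token.

    The split weight (1 - p_gold) is a hypothetical schedule for
    illustration only; the paper specifies the real weighting.
    """
    vocab_size = logits.size(-1)
    logits = logits.reshape(-1, vocab_size)
    labels = labels.reshape(-1)

    # Drop padded / ignored positions, as in standard LM training.
    keep = labels != ignore_index
    logits, labels = logits[keep], labels[keep]

    log_probs = F.log_softmax(logits, dim=-1)
    gold_logp = log_probs.gather(1, labels.unsqueeze(1)).squeeze(1)
    idk_logp = log_probs[:, idk_token_id]

    with torch.no_grad():
        # Uncertain predictions (low gold-token probability) route more
        # of the target mass to [IDK]; confident ones keep it on gold.
        idk_weight = 1.0 - gold_logp.exp()

    return -((1.0 - idk_weight) * gold_logp + idk_weight * idk_logp).mean()
```

Usage would mirror ordinary next-token training, e.g. `idk_cross_entropy(model(input_ids).logits[:, :-1], input_ids[:, 1:], idk_token_id)`, so no annotations or handcrafted examples are needed, matching the post's claim.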

