A new study by researchers from Technion - Israel Institute of Technology and Google Research highlights the risks of fine-tuning large language models (LLMs) with new factual information, showing that it can lead to increased hallucinations and factually incorrect outputs. This groundbreaking work by Zorik Gekhman, Gal Yona, Roee Aharoni, Matan Eyal, Amir Feder, Roi Reichart, and Jonathan Herzig underscores the importance of careful data management in AI model development.

* * *

Fine-tuning is the process of further training a pre-trained LLM to better align it with specific tasks or behaviors and to refine its performance on them. It is usually done via supervised learning, where the model is trained on outputs created by human annotators or other LLMs, which often introduces factual information not covered in the pre-training data. Fine-tuning helps the model learn specific instructions and preferences, improving its performance in targeted applications.

* * *

The study finds that:
- LLMs struggle to assimilate new factual knowledge during fine-tuning, learning new information significantly more slowly than information consistent with what they already know.
- Introducing new knowledge through fine-tuning increases the model's tendency to produce factually incorrect responses, known as hallucinations.
- LLMs primarily acquire factual knowledge during pre-training, while fine-tuning teaches them to use this knowledge more effectively.

* * *

Methodology and details:
- To study the impact of new knowledge during fine-tuning, the researchers developed SliCK, a method that categorizes fine-tuning examples relative to the model's existing knowledge as Known or Unknown, with Known further divided into HighlyKnown, MaybeKnown, and WeaklyKnown (see the illustrative sketch below).
- In a controlled setup, they varied the proportion of Unknown examples, i.e., examples introducing new knowledge, in the fine-tuning data.
- Results indicated that LLMs learn new facts from Unknown examples slowly, and that fitting these examples increases hallucinations, while Known examples enhance the use of existing knowledge.
- Early stopping, or filtering out Unknown examples, reduces hallucinations.
- Including MaybeKnown examples improves performance by helping the model handle uncertainty better at test time.
- Collectively, the findings highlight the potential for unintended consequences when introducing new knowledge through fine-tuning, and imply that fine-tuning may be more useful as a mechanism to enhance the utilization of pre-existing knowledge.

* * *

Key takeaway: Acquiring new knowledge through supervised fine-tuning is linked to increased hallucinations relative to the model's existing knowledge. LLMs struggle to integrate new knowledge during fine-tuning and primarily learn to utilize what they already know.

* * *

For a deeper dive into this important research, check out the full paper: "Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations?", https://lnkd.in/gBdTUDPr

#AI #ML #NLP #FineTuning #Transparency #hallucinations
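To make the Known/Unknown split more concrete, here is a minimal Python sketch (my own illustration, not the authors' code) of how a SliCK-style categorization could be approximated. It assumes a hypothetical `sample_answers` helper that prompts the pre-trained model, and labels each question-answer pair by whether the model can already produce the gold answer under greedy and sampled decoding.

```python
# Illustrative SliCK-style categorization (not the paper's exact implementation).
# `sample_answers(question, greedy, n)` is an assumed helper that prompts the
# pre-trained model and returns n candidate answers as strings.

from typing import Callable, List


def categorize_example(
    question: str,
    gold_answer: str,
    sample_answers: Callable[[str, bool, int], List[str]],
    n_greedy_prompts: int = 4,
    n_samples: int = 16,
) -> str:
    """Label a (question, answer) pair relative to the model's pre-existing knowledge."""

    def is_correct(pred: str) -> bool:
        return pred.strip().lower() == gold_answer.strip().lower()

    # Greedy decoding, repeated as a stand-in for trying several few-shot prompts.
    greedy_preds = [sample_answers(question, True, 1)[0] for _ in range(n_greedy_prompts)]
    # Temperature sampling to probe weaker traces of the fact.
    sampled_preds = sample_answers(question, False, n_samples)

    greedy_hits = sum(is_correct(p) for p in greedy_preds)
    sampled_hits = sum(is_correct(p) for p in sampled_preds)

    if greedy_hits == n_greedy_prompts:
        return "HighlyKnown"   # always correct with greedy decoding
    if greedy_hits > 0:
        return "MaybeKnown"    # sometimes correct with greedy decoding
    if sampled_hits > 0:
        return "WeaklyKnown"   # correct only when sampling
    return "Unknown"           # never correct -> the example carries new knowledge


# Filtering out Unknown examples before fine-tuning, one of the mitigations the
# study discusses, then reduces to something like:
# train_set = [ex for ex in data if categorize_example(ex.q, ex.a, sample_answers) != "Unknown"]
```

The paper's actual prompting and sampling scheme differs in its details; the point of the sketch is only that each training example can be scored against the model's own answers before fine-tuning, so that new-knowledge examples can be filtered or handled separately.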
Katharina, this study sheds light on a critical aspect of AI development that often goes unnoticed. It's fascinating to see how fine-tuning, which aims to enhance model performance, can sometimes lead to unexpected challenges like increased hallucinations. Your insights on this topic are always valuable. What do you think could be the most effective strategy to balance fine-tuning with minimizing hallucinations in large language models?
Katharina Koerner, always a great write-up and a great share 👍
A great read. Thank you Katharina Koerner for sharing this.
Not if it's not correlated!
Carlos Muñoz Ferrandis
Insightful!
Katharina, thanks for sharing. "When large language models are supervised or fine-tuned, they may encounter new factual information" -- By extension, it would seem logical that calibration would impact outcome, and it's possible that such calibrations can have negative side effects on accuracy, introducing bias rather than reducing it. This is an interesting observation with far-reaching implications for the purity of AI. It's possible that policy intervention, ethical or not, may damage that purity objective and morph overall statistical reality.