Charith Peris, PhD’s Post

Senior Applied Scientist | Responsible AI | Artificial General Intelligence at Amazon

Here’s a blog post that breaks down our work on Attribute Controlled Fine-tuning for Large Language Models. We introduced a new method that trains an auxiliary model to control a specific attribute (in this case, toxicity). The approach regularizes the LLM's fine-tuning by penalizing deviations from the desired (non-toxic) distribution, using the auxiliary model trained alongside the core LLM. This work, published at #EMNLP2024, was led by our intern Tao Meng (UCLA) together with our collaborators Ninareh Mehrabi, Palash Goyal, PhD, Anil Ramakrishna, PhD, Aram Galstyan, Richard Zemel, Kai-Wei Chang and Rahul Gupta. Blogpost: https://lnkd.in/eRzvKSKi Paper: https://lnkd.in/eHNwgWAt
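To give a flavor of the idea, here is a toy sketch of what "penalizing deviations from the desired distribution" can look like as a training objective. This is my own illustrative simplification, not the paper's actual objective: the function names, the KL-divergence form of the penalty, and the weight `lam` are all assumptions for the example.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) between two discrete distributions over the same support."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def regularized_loss(task_loss, model_dist, auxiliary_dist, lam=0.5):
    """Toy combined objective: the usual fine-tuning task loss plus a
    KL penalty pulling the LLM's output distribution toward the
    auxiliary model's desired (e.g. non-toxic) distribution.
    `lam` is a hypothetical regularization weight."""
    return task_loss + lam * kl_divergence(model_dist, auxiliary_dist)

# Toy next-token distributions over a 3-token vocabulary.
p_model = [0.7, 0.2, 0.1]   # current LLM output distribution
p_aux   = [0.5, 0.3, 0.2]   # auxiliary model's desired distribution

loss = regularized_loss(task_loss=1.2, model_dist=p_model, auxiliary_dist=p_aux)
```

The larger the gap between the LLM's distribution and the auxiliary model's desired one, the larger the penalty, so gradient descent on this combined loss nudges the LLM toward the non-toxic distribution while still optimizing the task loss.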

Detoxification of large language models via regularized fine-tuning

amazon.science
