𝗜𝘀 𝗵𝘂𝗺𝗮𝗻 𝗳𝗲𝗲𝗱𝗯𝗮𝗰𝗸 𝘀𝘁𝗶𝗹𝗹 𝘁𝗵𝗲 𝘀𝗲𝗰𝗿𝗲𝘁 𝘄𝗲𝗮𝗽𝗼𝗻 𝗳𝗼𝗿 𝗔𝗜 𝘀𝘂𝗰𝗰𝗲𝘀𝘀? 🏆

This was one of the most debated questions at Prolific's #BuildAIFaster Winter Summit 2025 in London. During our panel discussion "Human Feedback for Building AI Applications: Now and Future," we explored the critical role of human feedback in developing responsible and effective AI systems.

𝗛𝗲𝗿𝗲 𝗮𝗿𝗲 𝗮 𝗳𝗲𝘄 𝘁𝗮𝗸𝗲𝗮𝘄𝗮𝘆𝘀 𝗳𝗿𝗼𝗺 𝗼𝘂𝗿 𝗽𝗮𝗻𝗲𝗹 𝗱𝗶𝘀𝗰𝘂𝘀𝘀𝗶𝗼𝗻: ⬇️

➜ Evaluating LLM outputs at scale is a critical challenge without a clear gold standard. LLMs can serve as effective "judges" that assess outputs for accuracy and reliability, particularly in high-stakes domains like legal, medical, and financial applications where precision is paramount. This evaluation approach helps ensure AI-generated content meets rigorous quality standards (a minimal sketch of the pattern follows below).

➜ Ensuring responsible and trustworthy AI requires active human involvement, but relying solely on humans isn't scalable. Since Large Language Models (LLMs) can assess data and model quality at scale, a balanced human-plus-AI approach is crucial.

➜ We also briefly discussed DeepSeek, which exemplifies innovation by reimagining existing methods. Its real genius is that it didn't rely on groundbreaking new techniques. Instead, it took existing methods, reimagined their use, and combined them in surprisingly effective ways that had been overlooked until now.

I want to extend a BIG thank you to my great fellow panelists: Maurice Burger, CEO of atla; Alex Anwyl-Irvine, Senior Research Scientist at the AI Safety Institute; and Sara Saab, VP Product at Prolific!

I would also like to express my sincere gratitude to the outstanding team at Prolific for their exceptional organization and for hosting such a great event in the heart of London.
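To make the LLM-as-a-judge takeaway concrete, here is a minimal sketch of what one evaluation step might look like. This is not from the panel itself: it assumes the OpenAI Python SDK (v1) with an API key in the environment, and the model name, rubric, and example inputs are illustrative placeholders.

```python
# Minimal LLM-as-a-judge sketch: score a candidate answer against a rubric.
# Assumes the OpenAI Python SDK (v1) and OPENAI_API_KEY in the environment;
# model name, rubric wording, and the example are hypothetical placeholders.
import json
from openai import OpenAI

client = OpenAI()

RUBRIC = (
    "You are a strict evaluator. Rate the ANSWER to the QUESTION on a 1-5 scale "
    "for factual accuracy and completeness. Respond with JSON: "
    '{"score": <int>, "rationale": "<one sentence>"}'
)

def judge(question: str, answer: str, model: str = "gpt-4o-mini") -> dict:
    """Ask a judge model to grade one answer; returns {"score": ..., "rationale": ...}."""
    response = client.chat.completions.create(
        model=model,
        temperature=0,  # deterministic grading
        response_format={"type": "json_object"},  # constrain output to JSON
        messages=[
            {"role": "system", "content": RUBRIC},
            {"role": "user", "content": f"QUESTION: {question}\nANSWER: {answer}"},
        ],
    )
    return json.loads(response.choices[0].message.content)

if __name__ == "__main__":
    print(judge(
        "What is the statute of limitations for fraud in Germany?",
        "It is always exactly three years, no exceptions.",
    ))
```

In practice, low-scoring or low-confidence items would then be routed to human experts for review, which is the balanced human-plus-AI loop described above.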
You can't remove the expert human from the loop. AI will always hallucinate, and average users won't notice the errors. You need a human expert to spot and correct them.
It's part of it for sure 😊
Human feedback + AI = the perfect balance for responsible innovation!
It was great seeing you there, Andreas!
This description of DeepSeek perfectly illustrates the Asian approach to innovation: Instead of seeking radical breakthroughs, it follows the principle of continuous improvement (Kaizen) by cleverly combining and reimagining existing methods. This mixing and refining of established techniques, rather than revolutionary inventions, is exactly what makes the approach appear paradoxically radical to Western observers, despite being built on incremental innovation.
An important feature of the brain is abstraction: experienced experts rely on intuition and experience to correct AI's mistakes. AI's current problems stem from knowledge that isn't broad enough and thinking that isn't wide-ranging enough. We see its mistakes now, but it will certainly do better in the future. Today's experts will eventually be forgotten, so it is very important to formulate an evolutionary ethical framework for artificial intelligence now. In the future, we may not be able to understand AI at all, let alone set any direction for it.
Such a good conversation, Andreas Horn. In the end, it's not so different from the progress of automation and human labor in the industrial age :) Humans will increasingly be essential for high-stakes work in AI development, and what's automatable will be automated 🙂 But we remain, and our importance compounds.
Interesting panel. What I'm wondering: if fine-tuning relies more and more on synthetic data, and if RL feedback comes from models instead of humans, what does that mean for reliability and accuracy in future models? Do you have an opinion on this?
Exciting