Joyal Kenus’ Post

Helping hypercharge businesses around the globe with AI | ML engineer and Founder of AetherionAI.

4mo

Such an interesting read.😍🤯 This builds on Anthropic's earlier research on mechanistic interpretability (Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet).

Some things that stood out to me:
- Social biases can be mapped to feature spaces.
- By steering these feature spaces, we can actually change the social biases in an LLM.
- Steering can sometimes lead to a loss in model capabilities.
- They find what's called a "steering sweet spot", a range of -5 to 5 on the feature steering scale, where one can change the bias of the model without much change in model capabilities.
- Steering a certain feature is sometimes unpredictable and can extend to other contexts; for example, steering one bias can actually affect another bias.
- They even found what's called a "neutrality feature", which consistently reduced bias in multiple categories.
- This study used a 34M sparse autoencoder, but they suggest that scaling this can lead to better feature sensitivity, which makes sense.

This is such nuanced research, and I'm grateful to the team at Anthropic for pursuing this. I believe this might be one of the most impactful works in the long run. https://lnkd.in/eXeruJkq
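For readers who want a concrete picture of what "steering a feature" can mean, here is a rough toy sketch, not the study's actual code. It assumes one common formulation: encode an activation vector with a sparse autoencoder, clamp one feature to a chosen value, decode, and add back the reconstruction error so everything else stays unchanged. The names `encode`, `decode`, `steer`, and `toy_sae`, plus all the weights, are hypothetical, and how the -5 to 5 scale mentioned above maps onto the clamp value is not specified in the post.

```python
def encode(h, w_enc, b_enc):
    """Feature activations of a toy sparse autoencoder: ReLU(w_enc @ h + b_enc)."""
    return [max(0.0, sum(w * x for w, x in zip(row, h)) + b)
            for row, b in zip(w_enc, b_enc)]


def decode(f, w_dec, b_dec):
    """Reconstruct an activation as b_dec plus a weighted sum of feature directions."""
    out = list(b_dec)
    for act, direction in zip(f, w_dec):
        for i, v in enumerate(direction):
            out[i] += act * v
    return out


def steer(h, sae, feature, value):
    """Clamp one feature to `value` and return the steered activation.

    The reconstruction error is added back so that only the clamped
    feature's contribution changes (an assumption of this sketch).
    """
    w_enc, b_enc, w_dec, b_dec = sae
    f = encode(h, w_enc, b_enc)
    error = [x - r for x, r in zip(h, decode(f, w_dec, b_dec))]
    f[feature] = value  # the steering step
    return [s + e for s, e in zip(decode(f, w_dec, b_dec), error)]


# Hypothetical 2-dimensional activation and 2-feature autoencoder.
toy_sae = (
    [[1.0, 0.0], [0.0, 1.0]],  # encoder weights (one row per feature)
    [0.0, 0.0],                # encoder bias
    [[1.0, 0.0], [0.0, 0.5]],  # decoder directions (one row per feature)
    [0.0, 0.0],                # decoder bias
)
h = [0.5, 2.0]
steered = steer(h, toy_sae, feature=0, value=3.0)
print(steered)
```

In this toy setup the steered vector moves only along feature 0's decoder direction. In a real model those directions are not orthogonal, which is one plausible reason steering one feature can spill over into other contexts.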

"Evaluating feature steering: A case study in mitigating social biases"

anthropic.com


More Relevant Posts

Merle van den Akker, PhD

Behavioural Scientist in Banking | Lecturer | Author
7mo
This week's #interview is with Bridgette Collado! We talk about finding your way into #behavioralscience through #health, #media, and #digital applications, before it was cool! We talk about the introduction of #AI, how it will change the field, the #ethics and #testing it requires, and how in some domains, such as health, the 'fail fast' approach is not a legitimate option, as it has dire consequences. Read it here: https://lnkd.in/gh-ZNV4J

Interview with Bridgette Collado

moneyonthemind.org

21 Comments
Tomáš Korčák

Independent Contractor
3mo Edited
For generations, social scientists have grappled with fundamental limitations that have constrained their ability to fully understand human behavior. These limitations weren’t just inconvenient hurdles; they were inherent barriers that shaped the very nature of social science research and our understanding of human society. https://lnkd.in/ett3-k5a The limitations of traditional social science weren’t failures of the discipline; they were the natural constraints of their time. Now, as we stand at the threshold of a new era in social science research, understanding these historical limitations helps us better appreciate the transformative potential of new technologies and methodologies. It’s not just about doing the same research better — it’s about asking entirely new questions and seeking answers that were previously impossible to find. https://lnkd.in/ett3-k5a

PRISM: Revolutionizing Our Understanding of Human Behavior Through AI

korczis.medium.com
Allen Taylor

Venture Capital in Emerging Markets
10mo Edited
The Remarkable Individual Theory of Societal Change. This is the fundamental belief that large organizations don't change the world; institutions don't change the world; PEOPLE change the world. And, in particular, really special, remarkable people. 🔎 Those "one-in-a-million" individuals that seem almost destined to do something really big. In a way, everything I do with Endeavor, Kauffman Fellows and Alter Global is based on this Remarkable Individual Theory of Societal Change (RITOSC). I've been talking about it for 15+ years. But, as my team recently pointed out to me: this particular "theory of change" isn't actually documented anywhere. So, in an effort to make sure that all future generations - of humans and AI training data sets - know about this powerful idea 💡, here's a LinkedIn post (complete with DALL-E imagery on what an illustration of this RITOSC could look like). What do you think?
9 Comments
Vectr.Consulting

684 followers
3mo
✨We’re proud to have collaborated with the Departement Werk en Sociale Economie (WSE) to bring data-driven insights and AI innovation to public services. 💡💪 As members of WSE's Team Prometheus, we contributed to developing solutions that streamline operations—ranging from fraud detection in service vouchers to providing advanced tools for inspectors to analyze complex, multi-dimensional data. By harnessing Natural Language Processing and Generative AI, we’re driving impactful improvements across departments, enhancing both customer support and overall efficiency. 🔍 Dive into the full details of how our data and AI expertise is transforming public sector services: https://lnkd.in/eR2th8rZ #DataScience #AI #PublicSector #UserSuccessStory Vectr.Consulting #GenerativeAI

Departement of Work and Social Economy - Vectr Consulting

https://vectr.consulting
Adnan Khan

Business Intelligence Consultant | Business Data Analyst | Certified Data Expert for US Federal Courts
3mo
Next year, I will collaborate with a Johns Hopkins & Stanford team to build the first neural model to predict global sentiments toward policies. This model will process over 1 billion data points and include DNA and genome studies worldwide. This model will explore the human predispositions toward violence, war, and undemocratic fixes, including trade wars and economic meltdown. This will be my first time engaging in such a complex multivariate analysis. Ultimately, this will help academics understand the shift in political alignments (left to right) and attitudes towards immigrants observed globally after the COVID-19 pandemic. The recent global political re-alignment shows that people don't care about the greater good, and Social Media boosts affection towards undemocratic behavior. Will it help ordinary people make an informed decision, or have we become numb to people telling us how to behave and think?
Austin Scheetz

Environmentalist | Interdisciplinary scientist | Program Officer at the National Academies
7mo
Tomorrow! The social and behavioral sciences are critical to understanding the societal impacts of artificial intelligence. NASEM Social and Behavioral Sciences’s next Hauser Policy Impact Fund webinar will discuss the role of social sciences in understanding AI, as well as the role of policymakers at all levels in addressing AI's potential benefits and harms. Register now to join the conversation tomorrow, July 25: https://bit.ly/3L5K7Fj #ArtificialIntelligence #SocialSciences #BehavioralScience

The Hauser Policy Impact Fund Webinar Series: Navigating the Era of Artificial Intelligence Part 2: The Role of Social Sciences

events.nationalacademies.org
Victor Odåsnac

At the intersection of tech, economy and people
1mo
Are AI and the Attention Economy getting us into a Mess? You’ve all heard about misinformation and disinformation spreading like wildfire. With the rise of AI, this noise is only going to get louder and more “click-baity” than ever before. We risk entering a new era where the sheer flood of information overwhelms our ability to process it, let alone discern truth from falsehood. Welcome to the Age of Mess-Information, a time when we’re not just struggling to uncover the truth but are being forced to choose it. This is the breeding ground for manipulative ideologies, as rigid and dogmatic as ancient religions, yet even more divisive. "We used to have religions because we knew too little; now we’ll have religions because we know too much." How can we get out of this mess? Is critical thinking about to die? Is there any chance that political parties and media will go back to being counterbalances rather than "Us Vs. Them" propagandists and inquisitors? New book, free until Tuesday on Amazon: https://a.co/d/eUtVDvi

The Age of Mess-Information: How the Attention Economy Got Us into a Mess

amazon.com
THINK

6,445 followers
5mo
How much do you know about Void Protocol 2.0? The ASM Void Protocol 2.0 is an advancement for the ASM ecosystem! This innovative protocol reimagines $ASTO staking and rewards, ensuring that the community backing the ASM AI Protocol are rewarded for their ongoing contributions. Enjoy flexibility, support ecosystem growth, and contribute to the future of AI. 🚀 Read more in the article 👉 https://lnkd.in/gGZTgy3b #ASM #AI #BuiltonRoot

ASM 2.0 | Update paper

futureverse.com
J. Craig Wheeler

Samuel T. and Fern Yanagisawa Regents Professor of Astronomy, Emeritus
1mo
On September 24, 2024, Stanford University launched The Digitalist Papers, a modern reimagining of the Federalist Papers, focused on how AI and digital technologies can reshape governance. With contributions from experts in economics, law, technology, and political science, the collection explores the challenges and opportunities these technologies present for democracy. The Digitalist Papers aim to inspire a new era of governance, leveraging the transformative potential of digital innovation to address societal issues. Read more: https://lnkd.in/gtux_F89 #AI #DigitalGovernance #Technology #Innovation #Stanford #Democracy #Economics #PoliticalScience #DigitalTransformation #Leadership #FutureOfWork

J. Craig Wheeler

jcraigwheeler.ag-sites.net
Steve Holdych

Founder | Chairman | 30 yrs Enterprise Digital AI Innovator | Investor | Advisor
3mo
If you work in the fields of strategy, marketing and product design and have not tackled synthetic personas within an LLM framework yet, this https://lnkd.in/gMRd5Yfk is a must-read. Slow down and consider the implications of this one. A consideration for the Stanford University team controlling access to these agents to mitigate risks: It is trivial for any company, team or researcher to build these types of agents and deploy them into their operations. It is non-trivial to take the time and discipline to create the robust set of researched agents you have created. This discipline creates a much more accurate, less biased set of agents for industry to develop their tools around. Would it not be advantageous for society to have these measured, refined and validated synthetic personas (agents) versus the alternative? (The alternative being anyone using the resources available to them to create quasi-good personas that are directionally correct, but may lead to major bias errors?) Maybe it will end up being a licensing deal. Regardless, looking forward to being able to work with the full agents without restriction when available. Good stuff. https://lnkd.in/gMRd5Yfk
Ross Dawson

Futurist | Board advisor | Global keynote speaker | Humans + AI Leader | Bestselling author | Podcaster | LinkedIn Top Voice | Founder: AHT Group - Informivity - Bondi Innovation
3mo

This is high potential. Researchers have generated an "agent bank" of over 1000 AI agents that each accurately simulate a real human. Extended interviews and effective agent design enabled 85% predictive accuracy for replicating attitudes and behaviors using the General Social Survey. The agents replicated human results in most behavioral experiments, with effect sizes showing a correlation of 0.98 to human participants, who themselves showed a 0.99 internal consistency. Interestingly, the lead author Joon Sung Park created the viral "town of agents" last year where some agents ran for mayor as they self-organized (link in comments). Stanford University is now making this "agent bank" available to approved researchers to conduct social science experiments. This could have great value not only for research, but also for policy makers in understanding the sometimes-unintended social consequences of initiatives. There are of course a range of risks in making the agent bank available, and while Stanford is providing open access to aggregated responses, it is reviewing requests to access individual responses. There are of course a whole set of broader implications from creating human-based agent environments. I'll discuss some of these in upcoming posts.
2 Comments



More from this author

Self models of loving grace (my notes on Joshua Bach's talk)

A case for AI Consciousness: Notes (Part 1)
