It was great to be back in Dublin this week for Google’s Growing Up in the Digital Age Summit! A range of critical topics was discussed: digital age verification, future risks for online child sexual abuse, AI quality, AI safety, and more. Getting to spend time in person with partners, and to meet new dedicated professionals driving crucial work forward, is always a privilege.

I was glad to have the chance to share with the audience Thorn’s ongoing work to prevent the misuse of generative AI for furthering child sexual abuse. As part of the conversation, I highlighted some key areas where critical gaps remain, which I’ll share here as well (with more related resources in the comments):

1. We need to research and invest in scalable, reliable model assessments that are not overly reliant on prompt/response strategies. Current strategies for model assessment are inherently manual: they boil down to evaluating with prompts and judging the outputs. Given the pace and scale at which new models are released into the ecosystem, and the specific sensitivities of assessing AIG-CSAM related harms, these aren’t sufficient. We need to explore strategies that ascertain a model’s learned concepts (e.g. from its latent space) and reliably map those concepts to the model’s capabilities.

2. We need to research and invest in model training strategies that prevent and mitigate adversarial fine-tuning downstream. Right now, in the open-source/open-weight setting, a good-faith developer can follow all the best practices for reducing or removing a model’s AIG-CSAM capabilities, only to have that work undone by downstream adversarial fine-tuning and optimization.

3. We need to move much more quickly to prevent the upload of nudifying apps and services to platforms, and to remove nudifying apps and services from search results.
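To make the first gap more concrete: one family of non-prompt-based techniques is linear probing of a model’s internal activations. The sketch below is purely illustrative and is not Thorn’s method; it uses synthetic "activations" (the model, dimensions, and separation strength are all invented for the example) to show the shape of the idea, namely that a simple classifier trained on hidden states can flag whether a concept is linearly encoded, without ever sampling outputs.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for hidden activations. In practice these would be extracted
# from an intermediate layer of a real model; everything here is simulated.
DIM = 32
concept_direction = rng.normal(size=DIM)  # hypothetical "learned concept" axis

def fake_activations(has_concept: bool, n: int) -> np.ndarray:
    """Simulate activation vectors with or without the concept present."""
    base = rng.normal(size=(n, DIM))
    if has_concept:
        base += 2.0 * concept_direction  # concept shifts the activations
    return base

# Labeled probe-training data: activations known to contain the concept vs. not.
X = np.vstack([fake_activations(True, 200), fake_activations(False, 200)])
y = np.array([1] * 200 + [0] * 200)

# Fit a linear probe (logistic regression via plain gradient descent).
w, b = np.zeros(DIM), 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    w -= 0.5 * (X.T @ (p - y)) / len(y)
    b -= 0.5 * np.mean(p - y)

# Probe accuracy on held-out activations: a high score suggests the concept
# is linearly encoded in the latent space, with no prompting involved.
X_test = np.vstack([fake_activations(True, 100), fake_activations(False, 100)])
y_test = np.array([1] * 100 + [0] * 100)
acc = np.mean((1.0 / (1.0 + np.exp(-(X_test @ w + b))) > 0.5) == y_test)
print(f"probe accuracy: {acc:.2f}")
```

The hard, open part is the second half of the gap described above: reliably mapping a detected concept to what the model can actually generate, which this toy probe does not address.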
In research that Thorn (Amanda Goharian) released earlier this week, we learned that young people who admit to creating sexual deepfakes continue to have a straightforward path to accessing and using nudifying technologies through app stores, search engines, and social media platforms. In that same research, 6% of American teens disclosed having been a direct victim of this form of abuse. As always, there’s more work to be done - but I’m thankful for the many people who are putting in the effort. Frances Frost John Buckley Jess Lishak Michelle Jeuken and many more!