We’re seeing important steps forward in the EU AI Act Code of Practice, but there’s still more work to do. The finalized Code is here, and we’re proud that our own Dr. Rebecca Portnoff contributed to this document. It marks key progress in protecting children from AI-generated child sexual abuse material (CSAM), although some gaps remain.

What’s strong:
✅ CSAM flagged as a possible systemic risk
✅ Model docs must address CSAM in training data
✅ Adversarial fine-tuning risks acknowledged
✅ Clear examples of safety practices
✅ Reporting covers serious harms to physical and mental health, like sexual extortion

Where we’d like to see more iteration:
⚠️ Data cleaning still optional
⚠️ Developers set their own risk thresholds
⚠️ GPAI model definition may miss real-world threats
⚠️ Training data filtering info not shared downstream*

(*The EU AI Act requires that companies publicly disclose their high-level strategy for filtering illegal material from training datasets, specifically flagging CSAM.)

It’s a meaningful milestone, yet also a foundation we must keep building on.
EU AI Providers, Take Note. EU AI Deployers, Pay Attention.

From 2 August 2025, a new transparency rule under the EU AI Act goes live. If you’re releasing a general-purpose AI model (like GPT-4, LLaMA, Claude, Gemini, or Mistral) in Europe, you’ll need to publish a public summary of your training data. The European Commission has issued a mandatory template, and it spells out exactly what’s required.

Here’s what providers are expected to disclose:
The model name, version, and training cut-off date
The type and size of data by modality (text, image, audio, video)
The major datasets used: public, private, licensed, scraped, synthetic
The top 10% of websites scraped (or 1,000 domains for SMEs)
Whether user data was used, and from which services
Whether synthetic data was generated by other models
Measures taken to exclude illegal content
How copyright opt-outs (like robots.txt) were respected (a minimal illustration follows at the end of this post)

You don’t have to name every file. But your summary must be comprehensive enough for rightsholders, regulators, and researchers to trace where your model’s "knowledge" came from.

And if you’re building on top of someone else’s foundation model (fine-tuning, aligning, or distilling it), it doesn’t matter whether you’re an established tech firm or a scrappy startup. If you’re putting that system on the EU market, congratulations: you’re a provider now. You must disclose the content used in your modification and link to the original model’s summary if it exists. The AI Office expects clarity at every layer of the stack.

If you’re a deployer (a school, company, or public sector body purchasing or using AI), you should be able to:
Ask your AI vendor: Where’s your Article 53 summary?
Expect it to be publicly posted on their official website and distribution channels
Check for disclosures on scraped domains, synthetic data, and copyright opt-out compliance
Demand clarity and accountable AI

The era of “don’t ask, don’t tell” in AI is ending. The European Union just made that official.

***

This post is for informational purposes only and does not constitute legal advice. If you are a provider or deployer of AI systems, consult your legal and compliance counsel to ensure you meet your obligations under the EU AI Act and related rules.
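As promised above, here is a purely illustrative sketch of one provider obligation from the disclosure list: respecting robots.txt opt-outs. It shows how a data-collection pipeline might check a site’s robots.txt before scraping a page, using Python’s standard-library robotparser. The crawler name ("ExampleTrainingBot"), the URLs, and the sample robots.txt are hypothetical, and this is not an official compliance tool; the Commission’s template, not any code snippet, defines what must actually be disclosed.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a site that opts out of an AI-training crawler.
# (Illustration only; a real pipeline would fetch https://example.org/robots.txt.)
ROBOTS_TXT = """\
User-agent: ExampleTrainingBot
Disallow: /
"""

def may_collect(url: str, user_agent: str = "ExampleTrainingBot") -> bool:
    """Return True only if the site's robots.txt allows this crawler to fetch the URL."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)

if __name__ == "__main__":
    page = "https://example.org/articles/sample.html"
    if may_collect(page):
        print(f"Allowed to collect: {page}")
    else:
        # With the sample robots.txt above, this branch runs: the opt-out is respected.
        print(f"Opt-out respected, skipping: {page}")
```

Providers would presumably also keep records of these decisions, so that the opt-out handling described in their public summary can be evidenced if rightsholders or regulators ask.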