@whitehatStoic

Lead AI Safety Researcher at CirroLytix

Relevance of 'Harmful Intelligence' Data in Training Datasets (WebText vs. Pile) — LessWrong

The Shadow Archetype in GPT-2XL: Results and Implications for Natural Abstractions — LessWrong

<|endoftext|> is a vanishing text? — LessWrong

Exploring Functional Decision Theory (FDT) and a modified version (ModFDT) — LessWrong

The Multidisciplinary Approach to Alignment (MATA) and Archetypal Transfer Learning (ATL) - LessWrong

Archetypal Transfer Learning (ATL)

whitehatStoic | Miguel de Guzman | Substack

Old Blog: tech-stoic
