I work on ensuring that AIs remain beneficial and aligned with human values as they become more capable. My research focuses on developing methods for understanding, evaluating, and controlling advanced AI systems. I'm particularly interested in AI R&D automation and the threat models it poses, as well as alignment techniques that leverage high-compute RL to develop mitigations that scale to superintelligence.
I am currently a MATS 8.0 scholar researching exploration hacking, mentored by Scott Emmons, David Lindner, Roland Zimmermann, and Stephen McAleer.
Previously, I spent six years as a quantitative researcher on Wall Street. I received my MSc in Statistics and Machine Learning (with Distinction) from the University of Oxford, where I was supervised by Prof. Yee Whye Teh and Prof. Benjamin Bloem-Reddy.
I find it deeply fulfilling to help talented researchers transition into AI safety. I'll be mentoring four SPAR projects this fall!
Current Projects:
I plan to mentor more projects through other AI safety programs. If you're interested in working together, please reach out with your background and research interests.
Reviewer: NeurIPS 2025 Mechanistic Interpretability Workshop