Joe Kwon

Working on AI going well
How minds work, how AI should

Thinking about how AI and good futures fit together.

work

the path here

pre-college

Summer programs—CTY and Canada/USA Mathcamp. The kind of place where you'd stay up until 3am working on a problem set, not because it was due, but because you weren't done yet. I wanted more of that feeling.

Yale

Studied CS and psychology but spent most of my time in research labs. Worked with Gabriel Kreiman on visual cognition, then Julian Jara-Ettinger on how we infer what people are thinking from what they leave behind. I was drawn to cognition generally—human minds felt like the obvious place to start understanding intelligence.

early AI safety

Around 2020 I started paying attention to how much emergent capability was showing up in AI systems. Worked on one of OpenAI's early RLHF projects and spent time at Berkeley with Jacob Steinhardt and Dan Hendrycks. Learned how to do empirical ML research and why evals and benchmarks matter.

MIT

Joined Josh Tenenbaum's Computational Cognitive Science Lab, working with Sydney Levine on moral and social cognition—how people reason about rules, norms, and each other. A lot of neuro-symbolic modeling, which matters for AI alignment too. Separately, worked with Stephen Casper and Dylan Hadfield-Menell on red-teaming methods to find where language models fail.

LG AI Research

Research engineering on cross-lingual LLMs under Honglak Lee, working with Lajanugen Logeswaran, Dongsub Shim, and Tolga Ergen. Synthetic data, pretraining, finetuning, evals. One thread I liked: leveraging language-invariant concepts so models can learn new languages more efficiently.

steering and probing

Worked with David Krueger's group testing activation steering methods. A lot of these techniques promise fine-grained control over model behavior from the inside—we wanted to know where that actually holds up and where it breaks down.

policy

Center for AI Policy first, writing reports on evals, transparency, and AI agents. Then GovAI's DC fellowship, working on risks from internal AI deployment and metrics for tracking automated AI R&D.

now

Astra Fellow working with Tom Davidson and Fabien Roger. Focused on secretly loyal AI—threat modeling and designing ML experiments.

rabbit holes

reading

Updating soon.

listening
hip hop
I LAY DOWN MY LIFE FOR YOU JPEGMAFIA experimental / industrial
LP! (Offline) JPEGMAFIA experimental / glitch
jazz(y)
The Black Saint and the Sinner Lady Charles Mingus avant-garde
Hot Rats Frank Zappa jazz-rock
art pop
LUX Rosalía orchestral
La Vida Era Más Corta Milo j contemporary folk
Vanisher, Horizon Scraper Quadeca folktronica
electronic
I Love My Computer Ninajirachi house / dance / pop
Allbarone Baxter Dury synth pop / electropop
The Provocateur ADÉLA pop / dance / house
rock
Fetch Melt-Banana noise / experimental
Pain to Power Maruja post-punk / jazz
looking

Updating soon.

bookmarks