Arshavir Blackwell takes you inside the black box of large language models, showing how cutting-edge interpretability methods help researchers identify, understand, and even fix the inner quirks of AI. Through concrete case studies, he demonstrates how interpretability is evolving from an arcane art into a collaborative science, while revealing the daunting puzzles that remain. This episode unpacks the step-by-step workflow and surprising realities of mechanistically mapping model cognition.