Current Members
Atticus Geiger, Principal Investigator
Atticus has a B.S. in Symbolic Systems, an M.S. in Computer Science, and a PhD in Linguistics, all from Stanford University. He founded the Pr(Ai)²R Group, which continues the effort started in his PhD thesis on uncovering and inducing interpretable causal structures in deep learning models.
Amir Zur, Research Scholar
Amir is a recent graduate with a B.S. and M.S. from Stanford University, where he wrote an award-winning honors thesis on interpretable, debiased, and accessible language models. With the Pr(Ai)²R Group, he contributed to our treatise on causal abstraction for mechanistic interpretability and adapted parts of his honors thesis into an EMNLP paper on updating the CLIP model to be more useful in an accessibility setting.
Jiuding Sun, Research Scholar
Jiuding is an incoming PhD student at Stanford University who recently graduated from Northeastern University. Currently, he is working on automating supervised interpretability methods using hypernetworks as interpretability agents.
Nikhil Prakash, Research Scholar
Nikhil is a second year PhD student at Northeastern University, advised by Professor David Bau. His interest lies in understanding the internal mechanisms of deep neural networks to enhance human-AI collaboration and prevent misalignment. Currently, he is investigating the mechanisms underlying theory of mind capabilities in large language models.
Mara Pilser, Research Scholar
Mara is a Master’s student at the University of Amsterdam advised by Sara Magliacane. Currently, she is investigating how to modify interpretability hypotheses to be more faithful to the underlying AI model.
Maheep Chaudhaury, Research Scholar
Maheep is a master's student in Artificial Intelligence at NTU with five years of research experience in artificial intelligence. He has spent the past two years working in the field of causality, producing an extensive survey on causality and AI. With the Pr(Ai)²R Group, he contributed to our treatise on causal abstraction for mechanistic interpretability and led a paper on evaluating sparse autoencoders on disentangling factual knowledge in GPT-2 Small.
Sonakshi Chauhan, Research Scholar
Sonakshi is in the final year of her undergraduate studies, with research contributions at the Indian Institute of Science and Carnegie Mellon University and victories at national and international hackathons. She also has a passion for public speaking and philosophy. With the Pr(Ai)²R Group, she contributed to our treatise on causal abstraction for mechanistic interpretability.
Yiwei Wang, Research Scholar
Yiwei is a recent graduate with an M.S. from Columbia University and a B.S. from Carnegie Mellon University. Currently, he is investigating how Transformer-based language models perform variable assignment when evaluated on synthetic programs.
Aruna Sankaranarayanan, Research Scholar
Aruna is a third-year PhD student at MIT working with Professor Dylan Hadfield-Menell. She is interested in using interpretability methods to understand and improve model and human behavior.