Asma Ghandeharioun

Senior Research Scientist, Google DeepMind, NYC


I am Asma Ghandeharioun, a senior research scientist at Google DeepMind. I work on aligning AI with human values by better understanding [1] and controlling (language) models [2], and in particular by demystifying their inner workings [3] and correcting collective misconceptions along the way [4, 5].

I received my Ph.D. from the Affective Computing Group, MIT Media Lab. I am fortunate to have had Roz as my advisor. In addition, I have had research experiences at Google Research, Microsoft Research, and EPFL, many of which have evolved into exciting long-term collaborations.

In a previous life, I conducted research in the digital mental health and wellbeing space, collaborated with medical professionals from Harvard and renowned hospitals in the Boston area such as Massachusetts General Hospital (MGH) and Brigham and Women’s Hospital (BWH), and published in venues such as Frontiers in Psychiatry and Psychology of Well-being. This area remains close to my heart, and I occasionally dabble in it during my free time.

You can download my résumé here.

Selected Publications

  1. Patchscopes: A unifying framework for inspecting hidden representations of language models
    Asma Ghandeharioun*, Avi Caciularu*, Adam Pearce, Lucas Dixon, and Mor Geva
    International Conference on Machine Learning (ICML), to appear, 2024
  2. Does localization inform editing? Surprising differences in causality-based localization vs. knowledge editing in language models
    Peter Hase, Mohit Bansal, Been Kim, and Asma Ghandeharioun
    Advances in Neural Information Processing Systems (NeurIPS), 2023
    (Spotlight)
  3. Interpretability illusions in the generalization of simplified models
    Dan Friedman, Andrew Kyle Lampinen, Lucas Dixon, Danqi Chen, and Asma Ghandeharioun
    International Conference on Machine Learning (ICML), to appear, 2024
  4. Do machine learning models memorize or generalize?
    Adam Pearce, Asma Ghandeharioun, Nada Hussein, Nithum Thain, Martin Wattenberg, and Lucas Dixon
    In IEEE VISxAI, 2023
    (Best paper)
  5. Post Hoc Explanations of Language Models Can Improve Language Models
    Satyapriya Krishna, Jiaqi Ma, Dylan Slack, Asma Ghandeharioun, Sameer Singh, and Himabindu Lakkaraju
    In Advances in Neural Information Processing Systems (NeurIPS), 2023
  6. DISSECT: Disentangled simultaneous explanations via concept traversals
    Asma Ghandeharioun, Been Kim, Chun-Liang Li, Brendan Jou, Brian Eoff, and Rosalind W Picard
    In International Conference on Learning Representations (ICLR), 2021
  7. Approximating interactive human evaluation with self-play for open-domain dialog systems
    Asma Ghandeharioun*, Judy Hanwen Shen*, Natasha Jaques*, Craig Ferguson, Noah Jones, Agata Lapedriza, and Rosalind W Picard
    In Advances in Neural Information Processing Systems (NeurIPS), 2019
  8. Towards Human-Centered Optimality Criteria
    Asma Ghandeharioun
    Massachusetts Institute of Technology, 2021