Mahsa Khoshnoodi

PhD Student in Computer Science
Georgetown University · GUCV Lab · Advised by Dr. Sarah Adel Bargal · Previously collaborated with Dr. Michael Saxon (UCSB)

Not all seeing is understanding: when a vision-language model looks at an image, does it reason or does it pattern-match? I build diagnostic frameworks that locate exactly where and why VLMs fail on fine-grained visual understanding, treating hallucination as a reasoning failure rather than an output artifact. My broader vision is that truly capable VLMs will need more than perception: a structured world model that captures how the visual world works, bridging the gap between seeing, understanding, and acting.

I am actively seeking research internships in multimodal AI and visual reasoning. Feel free to reach out by email (mk2524@georgetown.edu) or through the icons under my bio at the top of the page.

Research Interests

Perception to Reasoning

I investigate how VLMs integrate visual and linguistic information to arrive at decisions. My core observation: even when models reach correct conclusions, their internal reasoning paths are often flawed or biased. I build interpretability tools that function as a microscope for AI, tracing information flow and exposing where perception fails to become genuine reasoning.

Diagnostic Frameworks for VLMs

I develop evaluation frameworks that assess not just whether a model is correct, but whether its reasoning process is valid. Hallucination and bias manifest heterogeneously across layers and architectures, so effective diagnosis requires understanding internal dynamics rather than observing outputs alone.

From Seeing to Acting

My long-term research targets AI systems that perceive, reason, and act reliably in the world. Drawing on ideas from structured world modeling and vision-language-action architectures, I aim to build multimodal systems that remain aligned with human values across the full loop from visual input to real-world decision.

Publications
CVPR-VisCon 2026
Do VLMs Reason About Faces? Probing the Perception-Reasoning Gap in Identity Judgment
Mahsa Khoshnoodi, Sarah Adel Bargal
Third Workshop on Visual Concepts (VisCon), CVPR 2026
SemEval 2026
GUNLP at SemEval-2026 Task 10: Emotion-Aware Multi-Task Learning for Conspiracy Detection
Mahsa Khoshnoodi, Rojin Ziaei, Nazli Goharian
SemEval 2026
KDD 2025
Hierarchical Prompting Taxonomy: A Universal Evaluation Framework for Large Language Models
Devichand Budagam, Sankalp KJ, Mahsa Khoshnoodi, Ashutosh Kumar, Vinija Jain, Aman Chadha
KDD 2025
NeurIPS 2024 Spotlight, top 5%
Who Evaluates the Evaluations? Objectively Scoring Text-to-Image Prompt Coherence Metrics with T2IScoreScore (TS2)
Michael Saxon, Fatima Jahara, Mahsa Khoshnoodi, Yujie Lu, Aditya Sharma, William Yang Wang
NeurIPS 2024
arXiv 2024
A Comprehensive Survey of Accelerated Generation Techniques in Large Language Models
Mahsa Khoshnoodi, Vinija Jain, Mingye Gao, Malavika Srikanth, Aman Chadha
Preprint, 2024
Under Review
Reimagining Neurosymbolic AI through the Lens of Cognitive Science: A Survey
Devichand Budagam, Mahsa Khoshnoodi, Jibesh Patra, Ravid Shwartz-Ziv, Amit Sheth, Vinija Jain, Aman Chadha
Under review, ACM Computing Surveys
News
May 2026
Paper accepted at the CVPR 2026 VisCon Workshop: "Do VLMs Reason About Faces? Probing the Perception-Reasoning Gap in Identity Judgment."
Apr 2026
Paper accepted at SemEval 2026: "GUNLP at SemEval-2026 Task 10: Emotion-Aware Multi-Task Learning for Conspiracy Detection."
Aug 2025
Started PhD at Georgetown University, joining the GUCV Lab under Dr. Sarah Adel Bargal.
Feb 2025
KDD 2025: Hierarchical Prompting Taxonomy paper published.
Dec 2024
NeurIPS 2024 Spotlight: T2IScoreScore recognized as a Spotlight paper (top 5%).