Mahsa Khoshnoodi

PhD Student in Computer Science

Georgetown University · GUCV Lab
I am fortunate to be advised by Prof. Sarah Adel Bargal at Georgetown University, and to have been mentored by Dr. Michael Saxon.

Not all seeing is understanding: when a vision-language model (VLM) looks at an image, does it reason or does it pattern-match? I build diagnostic frameworks that pinpoint where and why VLMs fail on fine-grained visual understanding, treating hallucination as a reasoning failure rather than an output artifact. My broader vision is that truly capable VLMs will need more than perception: a structured world model that captures how the visual world works, bridging the gap between seeing, understanding, and acting.

I am actively seeking research internships in multimodal AI and visual reasoning. Feel free to reach out by email (mk2524@georgetown.edu) or use the icons under my bio at the top of the page.

Research Interests

Perception to Reasoning

I investigate how VLMs integrate visual and linguistic information to arrive at decisions. My core observation: even when models reach correct conclusions, their internal reasoning paths are often flawed or biased. I build interpretability tools that function as a microscope for AI, tracing information flow and exposing where perception fails to become genuine reasoning.

Diagnostic Frameworks for VLMs

I develop evaluation frameworks that assess not just whether a model is correct, but whether its reasoning process is valid. Hallucination and bias manifest heterogeneously across layers and architectures, so effective diagnosis requires understanding internal dynamics rather than observing outputs alone.

From Seeing to Acting

My long-term research targets AI systems that perceive, reason, and act reliably in the world. Drawing on ideas from structured world modeling and vision-language-action architectures, I aim to build multimodal systems that remain aligned with human values across the full loop from visual input to real-world decision.

Publications

CVPR-VisCon 2026

Do VLMs Reason About Faces? Probing the Perception-Reasoning Gap in Identity Judgment

Mahsa Khoshnoodi, Sarah Adel Bargal

Third Workshop on Visual Concepts (VisCon), CVPR 2026

SemEval 2026

GUNLP at SemEval-2026 Task 10: Emotion-Aware Multi-Task Learning for Conspiracy Detection

Mahsa Khoshnoodi, Rojin Ziaei, Nazli Goharian

SemEval 2026

KDD 2025

Hierarchical Prompting Taxonomy: A Universal Evaluation Framework for Large Language Models

Devichand Budagam, Sankalp KJ, Mahsa Khoshnoodi, Ashutosh Kumar, Vinija Jain, Aman Chadha

KDD 2025

Paper

NeurIPS 2024 Spotlight, top 5%

Who Evaluates the Evaluations? Objectively Scoring Text-to-Image Prompt Coherence Metrics with T2IScoreScore (TS2)

Mahsa Khoshnoodi, Fatima Jahara, Michael Saxon, Yujie Lu, Aditya Sharma, William Yang Wang

NeurIPS 2024

Paper · Project Page

arXiv 2024

A Comprehensive Survey of Accelerated Generation Techniques in Large Language Models

Mahsa Khoshnoodi, Vinija Jain, Mingye Gao, Malavika Srikanth, Aman Chadha

Preprint, 2024

arXiv

Under Review

Reimagining Neurosymbolic AI through the Lens of Cognitive Science: A Survey

Devichand Budagam, Mahsa Khoshnoodi, Jibesh Patra, Ravid Shwartz-Ziv, Amit Sheth, Vinija Jain, Aman Chadha

Under review, ACM Computing Surveys

News

May 2026

Paper accepted at CVPR 2026 VisCon Workshop: "Do VLMs Reason About Faces?"

Apr 2026

Paper accepted at SemEval 2026: Emotion-Aware Multi-Task Learning for Conspiracy Detection.

Aug 2025

Started PhD at Georgetown University, joining the GUCV Lab under Dr. Sarah Adel Bargal.

Feb 2025

KDD 2025: Hierarchical Prompting Taxonomy paper published.

Dec 2024

NeurIPS Spotlight: T2IScoreScore recognized as a Spotlight paper (top 5%).