MedBLINK: Probing Basic Perception in Multimodal Language Models for Medicine

1 University of Washington, 2 Massachusetts Institute of Technology, 3 Seoul National University

MedBLINK consists of 8 visual tasks that medical professionals can solve within a blink but that MLMs struggle with. These tasks cover a range of clinically relevant problems, including anatomical orientation, morphology qualification, visual and wave-based depth estimation, and histology analysis.

Leaderboard

Abstract

Multimodal language models (MLMs) show promise for clinical decision support and diagnostic reasoning, raising the prospect of end-to-end automated medical image interpretation. However, clinicians are highly selective in adopting AI tools; a model that makes errors on seemingly simple perception tasks, such as determining image orientation or identifying whether a CT scan is contrast-enhanced, is unlikely to be adopted for clinical tasks. We introduce MedBLINK, a benchmark designed to probe these models for such perceptual abilities. MedBLINK spans eight clinically meaningful tasks across multiple imaging modalities and anatomical regions, totaling 1,429 multiple-choice questions over 1,605 images.

We evaluate 19 state-of-the-art MLMs, including general-purpose (GPT‑4o, Claude 3.5 Sonnet) and domain-specific (Med-Flamingo, LLaVA-Med, RadFM) models. While human annotators achieve 96.4% accuracy, the best-performing model reaches only 65%. These results show that current MLMs frequently fail at routine perceptual checks, suggesting the need to strengthen their visual grounding to support clinical adoption.
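As a rough illustration of how accuracy on a multiple-choice benchmark like this could be scored, the sketch below computes per-task and overall accuracy from a list of records. The record keys (`task`, `answer`, `prediction`), the task names, and the toy data are hypothetical and are not taken from the MedBLINK release; the overall score here is a simple average over questions, which may differ from how the paper aggregates results.

```python
from collections import defaultdict


def accuracy_by_task(records):
    """Return (per-task accuracy dict, overall accuracy over all questions).

    Each record is a dict with hypothetical keys:
      'task'       - name of the benchmark task
      'answer'     - ground-truth option letter, e.g. "A"
      'prediction' - option letter chosen by the model
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for r in records:
        total[r["task"]] += 1
        # Compare option letters case-insensitively, ignoring stray whitespace.
        if r["prediction"].strip().upper() == r["answer"].strip().upper():
            correct[r["task"]] += 1
    per_task = {task: correct[task] / total[task] for task in total}
    overall = sum(correct.values()) / sum(total.values())
    return per_task, overall


# Toy records for illustration only (not real benchmark data).
records = [
    {"task": "orientation", "answer": "A", "prediction": "A"},
    {"task": "orientation", "answer": "B", "prediction": "a"},
    {"task": "contrast", "answer": "B", "prediction": " b "},
]

per_task, overall = accuracy_by_task(records)
```

Averaging over questions weights larger tasks more heavily; averaging the per-task values instead would treat every task equally.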

MedBLINK characteristics

MedBLINK examples with models' outputs


Related Work

BibTeX


      @misc{bigverdi2025medblinkprobingbasicperception,
        title={MedBLINK: Probing Basic Perception in Multimodal Language Models for Medicine}, 
        author={Mahtab Bigverdi and Wisdom Ikezogwo and Kevin Zhang and Hyewon Jeong and Mingyu Lu and Sungjae Cho and Linda Shapiro and Ranjay Krishna},
        year={2025},
        eprint={2508.02951},
        archivePrefix={arXiv},
        primaryClass={cs.AI},
        url={https://arxiv.org/abs/2508.02951},
      }