VizWiz – Algorithms to assist people who are blind

Overview

A common goal in computer vision research is to build machines that replicate the human vision system, for example by recognizing and describing objects and scenes. A natural grand challenge for the computer vision community is to design such technology to help people with vision impairments overcome the visual challenges they face in daily life. Toward this aim, we introduce the first datasets and artificial intelligence challenges originating from people with vision impairments, to encourage a larger community to collaborate on developing algorithms for assistive technologies. In particular, we built the datasets from data submitted by users of a mobile phone application, each of whom took a picture and (optionally) recorded a spoken question about it. Ultimately, we hope this work will educate more people about the technological needs of people with vision impairments while giving researchers an exciting new opportunity to develop assistive technologies that eliminate accessibility barriers for this population.

Publications

Acknowledging Focus Ambiguity in Visual Questions
Chongyan Chen, Yu-Yun Tseng, Zhuoheng Li, Anush Venkatesh, Danna Gurari. IEEE International Conference on Computer Vision (ICCV), 2025.
Dataset | Paper

BIV-Priv-Seg: Locating Private Content in Images Taken by People With Visual Impairments
Yu-Yun Tseng, Tanusree Sharma, Lotus Zhang, Abigale Stangl, Leah Findlater, Yang Wang, Danna Gurari. IEEE Winter Conference on Applications of Computer Vision (WACV), 2025.
Dataset | Paper

Right this way: Can VLMs Guide Us to See More to Answer Questions?
Li Liu, Diji Yang, Sijia Zhong, Kalyana Suma Sree Tholeti, Lei Ding, Yi Zhang, Leilani H. Gilpin. NeurIPS, 2024.
Dataset | Paper

Long-Form Answers to Visual Questions from Blind and Low Vision People
Mina Huh, Fangyuan Xu, Yi-Hao Peng, Chongyan Chen, Hansika Murugu, Danna Gurari, Eunsol Choi, Amy Pavel. Conference on Language Modeling (COLM), 2024.
Website | Paper

Salient Object Detection for Images Taken by People With Vision Impairments
Jarek Reynolds, Chandra Kanth Nagesh, and Danna Gurari. IEEE Winter Conference on Applications of Computer Vision (WACV), 2024.
Dataset | Paper

VQA Therapy: Exploring Answer Differences by Visually Grounding Answers
Chongyan Chen, Samreen Anjum, and Danna Gurari. IEEE International Conference on Computer Vision (ICCV), 2023.
Dataset | Paper

A New Dataset Based on Images Taken by Blind People for Testing the Robustness of Image Classification Models Trained for ImageNet Categories
Reza Akbarian Bafghi and Danna Gurari. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
Dataset | Paper

Helping Visually Impaired People Take Better Quality Pictures
Maniratnam Mandal, Deepti Ghadiyaram, Danna Gurari, and Alan C. Bovik. IEEE Transactions on Image Processing (T-IP), 2023.
Paper

VizWiz-FewShot: Locating Objects in Images Taken by People With Visual Impairments
Yu-Yun Tseng, Alexander Bell, and Danna Gurari. European Conference on Computer Vision (ECCV), 2022.
Dataset | Paper

Grounding Answers for Visual Questions Asked by Visually Impaired People (Oral Presentation)
Chongyan Chen, Samreen Anjum, and Danna Gurari. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022.
Dataset | Paper

Quality of Images Showing Medication Packaging from Individuals with Vision Impairments: Implications for the Design of Visual Question Answering Applications (SIG-USE Innovation)
Nathan Davis, Danna Gurari, and Bo Xie. Association for Information Science and Technology (ASIS&T), 2020.
Paper

Vision Skills Needed to Answer Visual Questions (Honorable Mention Award)
Xiaoyu Zeng, Yanan Wang, Tai-Yin Chiu, Nilavra Bhattacharya and Danna Gurari. Proceedings of the ACM on Human-Computer Interaction (CSCW), 2020.
Dataset | Paper

“I Hope This Is Helpful”: Understanding Crowdworkers’ Challenges and Motivations for an Image Description Task
Rachel Simons, Danna Gurari, Kenneth R. Fleischmann. Proceedings of the ACM on Human-Computer Interaction (PACM HCI), 2020.
Paper

Captioning Images Taken by People Who Are Blind
Danna Gurari, Yinan Zhao, Meng Zhang, and Nilavra Bhattacharya. European Conference on Computer Vision (ECCV), 2020.
Dataset | Paper

Assessing Image Quality Issues for Real-World Problems
Tai-Yin Chiu, Yinan Zhao, and Danna Gurari. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020.
Dataset | Paper

Why Does a Visual Question Have Different Answers?
Nilavra Bhattacharya, Qing Li, and Danna Gurari. IEEE International Conference on Computer Vision (ICCV), 2019.
Dataset | Paper

VizWiz-Priv: A Dataset for Recognizing the Presence and Purpose of Private Visual Information in Images Taken by Blind People
Danna Gurari, Qing Li, Chi Lin, Yinan Zhao, Anhong Guo, Abigale J. Stangl, and Jeffrey P. Bigham. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019.
Dataset | Paper

VizWiz Grand Challenge: Answering Visual Questions from Blind People
Danna Gurari, Qing Li, Abigale J. Stangl, Anhong Guo, Chi Lin, Kristen Grauman, Jiebo Luo, and Jeffrey P. Bigham. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
Dataset | Paper

Predicting Foreground Object Ambiguity and Efficiently Crowdsourcing the Segmentation(s)
Danna Gurari, Kun He, Bo Xiong, Jianming Zhang, Mehrnoosh Sameki, Suyog Dutt Jain, Stan Sclaroff, Margrit Betke, and Kristen Grauman. International Journal of Computer Vision (IJCV), 2018.
Paper

Visual Question Answer Diversity
Chun-Ju Yang, Kristen Grauman, and Danna Gurari. AAAI Conference on Human Computation and Crowdsourcing (HCOMP), 2018.
Paper

CrowdVerge: Predicting If People Will Agree on the Answer to a Visual Question
Danna Gurari and Kristen Grauman. ACM Conference on Human Factors in Computing Systems (CHI), 2017.
Paper

Answering Visual Questions with Conversational Crowd Assistants
Walter S. Lasecki, Phyo Thiha, Yu Zhong, Erin Brady, and Jeffrey P. Bigham. ACM Conference on Computers and Accessibility (ASSETS), 2013.
Paper

Visual Challenges in the Everyday Lives of Blind People
Erin Brady, Meredith R. Morris, Yu Zhong, Samuel White, and Jeffrey P. Bigham. ACM Conference on Human Factors in Computing Systems (CHI), 2013.
Paper

Crowdsourcing Subjective Fashion Advice Using VizWiz: Challenges and Opportunities
Michele A. Burton, Erin Brady, Robin Brewer, Callie Neylan, Jeffrey P. Bigham, and Amy Hurst. ACM Conference on Computers and Accessibility (ASSETS), 2012.
Paper

VizWiz: Nearly Real-time Answers to Visual Questions
Jeffrey P. Bigham, Chandrika Jayant, Hanjie Ji, Greg Little, Andrew Miller, Robert C. Miller, Robin Miller, Aubrey Tatarowicz, Brandyn White, Samuel White, and Tom Yeh. ACM Symposium on User Interface Software and Technology (UIST), 2010.
Paper