Overview

A common goal in computer vision research is to build machines that can replicate the human vision system; for example, to recognize and describe objects and scenes. A natural grand challenge for the computer vision community is to design such technology to help people with vision impairments overcome the visual challenges they face in daily life. Towards this aim, we introduce the first datasets and artificial intelligence challenges originating from people with vision impairments to encourage a larger community to collaborate on developing algorithms for assistive technologies. In particular, we built the datasets with data submitted by users of a mobile phone application, who each took a picture and (optionally) recorded a spoken question about that picture. Ultimately, we hope this work will educate more people about the technological needs of people with vision impairments while providing an exciting new opportunity for researchers to develop assistive technologies that eliminate accessibility barriers.

Publications

  • Long-Form Answers to Visual Questions from Blind and Low Vision People
    Mina Huh, Fangyuan Xu, Yi-Hao Peng, Chongyan Chen, Hansika Murugu, Danna Gurari, Eunsol Choi, and Amy Pavel. Conference on Language Modeling (COLM), 2024.
    Website | Paper
  • Salient Object Detection for Images Taken by People With Vision Impairments
    Jarek Reynolds, Chandra Kanth Nagesh, and Danna Gurari. IEEE Winter Conference on Applications of Computer Vision (WACV), 2024.
    Dataset | Paper
  • VQA Therapy: Exploring Answer Differences by Visually Grounding Answers
    Chongyan Chen, Samreen Anjum, and Danna Gurari. IEEE International Conference on Computer Vision (ICCV), 2023.
    Dataset | Paper
  • A New Dataset Based on Images Taken by Blind People for Testing the Robustness of Image Classification Models Trained for ImageNet Categories
    Reza Akbarian Bafghi and Danna Gurari. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
    Dataset | Paper
  • Helping Visually Impaired People Take Better Quality Pictures
    Maniratnam Mandal, Deepti Ghadiyaram, Danna Gurari, and Alan C. Bovik. IEEE Transactions on Image Processing (T-IP), 2023.
    Paper
  • VizWiz-FewShot: Locating Objects in Images Taken by People With Visual Impairments
    Yu-Yun Tseng, Alexander Bell, and Danna Gurari. European Conference on Computer Vision (ECCV), 2022.
    Dataset | Paper
  • Grounding Answers for Visual Questions Asked by Visually Impaired People (Oral Presentation)
    Chongyan Chen, Samreen Anjum, and Danna Gurari. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022.
    Dataset | Paper
  • Quality of Images Showing Medication Packaging from Individuals with Vision Impairments: Implications for the Design of Visual Question Answering Applications (SIG-USE Innovation)
    Nathan Davis, Danna Gurari, and Bo Xie. Association for Information Science and Technology (ASIS&T), 2020.
    Paper
  • Vision Skills Needed to Answer Visual Questions (Honorable Mention Award)
    Xiaoyu Zeng, Yanan Wang, Tai-Yin Chiu, Nilavra Bhattacharya, and Danna Gurari. Proceedings of the ACM on Human-Computer Interaction (CSCW), 2020.
    Dataset | Paper
  • “I Hope This Is Helpful”: Understanding Crowdworkers’ Challenges and Motivations for an Image Description Task
    Rachel Simons, Danna Gurari, and Kenneth R. Fleischmann. Proceedings of the ACM on Human-Computer Interaction (PACM HCI), 2020.
    Paper
  • Captioning Images Taken by People Who Are Blind
    Danna Gurari, Yinan Zhao, Meng Zhang, and Nilavra Bhattacharya. European Conference on Computer Vision (ECCV), 2020.
    Dataset | Paper
  • Assessing Image Quality Issues for Real-World Problems
    Tai-Yin Chiu, Yinan Zhao, and Danna Gurari. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020.
    Dataset | Paper
  • Why Does a Visual Question Have Different Answers?
    Nilavra Bhattacharya, Qing Li, and Danna Gurari. IEEE International Conference on Computer Vision (ICCV), 2019.
    Dataset | Paper
  • VizWiz-Priv: A Dataset for Recognizing the Presence and Purpose of Private Visual Information in Images Taken by Blind People
    Danna Gurari, Qing Li, Chi Lin, Yinan Zhao, Anhong Guo, Abigale J. Stangl, and Jeffrey P. Bigham. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019.
    Dataset | Paper
  • VizWiz Grand Challenge: Answering Visual Questions from Blind People
    Danna Gurari, Qing Li, Abigale J. Stangl, Anhong Guo, Chi Lin, Kristen Grauman, Jiebo Luo, and Jeffrey P. Bigham. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
    Dataset | Paper
  • Predicting Foreground Object Ambiguity and Efficiently Crowdsourcing the Segmentation(s)
    Danna Gurari, Kun He, Bo Xiong, Jianming Zhang, Mehrnoosh Sameki, Suyog Dutt Jain, Stan Sclaroff, Margrit Betke, and Kristen Grauman. International Journal of Computer Vision (IJCV), 2018.
    Paper
  • Visual Question Answer Diversity
    Chun-Ju Yang, Kristen Grauman, and Danna Gurari. AAAI Conference on Human Computation and Crowdsourcing (HCOMP), 2018.
    Paper
  • CrowdVerge: Predicting If People Will Agree on the Answer to a Visual Question
    Danna Gurari and Kristen Grauman. ACM Conference on Human Factors in Computing Systems (CHI), 2017.
    Paper
  • Answering Visual Questions with Conversational Crowd Assistants
    Walter S. Lasecki, Phyo Thiha, Yu Zhong, Erin Brady, and Jeffrey P. Bigham. ACM Conference on Computers and Accessibility (ASSETS), 2013.
    Paper
  • Visual Challenges in the Everyday Lives of Blind People
    Erin Brady, Meredith R. Morris, Yu Zhong, Samuel White, and Jeffrey P. Bigham. ACM Conference on Human Factors in Computing Systems (CHI), 2013.
    Paper
  • Crowdsourcing Subjective Fashion Advice Using VizWiz: Challenges and Opportunities
    Michele A. Burton, Erin Brady, Robin Brewer, Callie Neylan, Jeffrey P. Bigham, and Amy Hurst. ACM Conference on Computers and Accessibility (ASSETS), 2012.
    Paper
  • VizWiz: Nearly Real-time Answers to Visual Questions
    Jeffrey P. Bigham, Chandrika Jayant, Hanjie Ji, Greg Little, Andrew Miller, Robert C. Miller, Robin Miller, Aubrey Tatarowicz, Brandyn White, Samuel White, and Tom Yeh. ACM User Interface Software and Technology Symposium (UIST), 2010.
    Paper