Overview

A common goal in computer vision research is to build machines that can replicate the human vision system, for example, by recognizing and describing objects and scenes. A natural grand challenge for the computer vision community is to design such technology to assist people who are blind in overcoming their real daily visual challenges. Toward this aim, we introduce the first datasets and artificial intelligence challenges originating from people who are blind, to encourage a larger community to collaborate on developing algorithms for assistive technologies. In particular, we built the datasets with data submitted by users of a mobile phone application, each of whom took a picture and (optionally) recorded a spoken question about that picture. Ultimately, we hope this work will educate more people about the technological needs of people who are blind while providing an exciting new opportunity for researchers to develop assistive technologies that eliminate accessibility barriers.

Publications

  • Why Does a Visual Question Have Different Answers?
    Nilavra Bhattacharya, Qing Li, and Danna Gurari. IEEE International Conference on Computer Vision (ICCV), 2019.
    Dataset | Paper
  • VizWiz-Priv: A Dataset for Recognizing the Presence and Purpose of Private Visual Information in Images Taken by Blind People
    Danna Gurari, Qing Li, Chi Lin, Yinan Zhao, Anhong Guo, Abigale J. Stangl, and Jeffrey P. Bigham. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019.
    Dataset | Paper
  • VizWiz Grand Challenge: Answering Visual Questions from Blind People
    Danna Gurari, Qing Li, Abigale J. Stangl, Anhong Guo, Chi Lin, Kristen Grauman, Jiebo Luo, and Jeffrey P. Bigham. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
    Dataset | Paper
  • Predicting Foreground Object Ambiguity and Efficiently Crowdsourcing the Segmentation(s)
    Danna Gurari, Kun He, Bo Xiong, Jianming Zhang, Mehrnoosh Sameki, Suyog Dutt Jain, Stan Sclaroff, Margrit Betke, and Kristen Grauman. International Journal of Computer Vision (IJCV), 2018.
    Paper
  • Visual Question Answer Diversity
    Chun-Ju Yang, Kristen Grauman, and Danna Gurari. AAAI Conference on Human Computation and Crowdsourcing (HCOMP), 2018.
    Paper
  • CrowdVerge: Predicting If People Will Agree on the Answer to a Visual Question
    Danna Gurari and Kristen Grauman. ACM Conference on Human Factors in Computing Systems (CHI), 2017.
    Paper
  • Answering Visual Questions with Conversational Crowd Assistants
    Walter S. Lasecki, Phyo Thiha, Yu Zhong, Erin Brady, and Jeffrey P. Bigham. ACM Conference on Computers and Accessibility (ASSETS), 2013.
    Paper
  • Visual Challenges in the Everyday Lives of Blind People
    Erin Brady, Meredith R. Morris, Yu Zhong, Samuel White, and Jeffrey P. Bigham. ACM Conference on Human Factors in Computing Systems (CHI), 2013.
    Paper
  • Crowdsourcing Subjective Fashion Advice Using VizWiz: Challenges and Opportunities
    Michele A. Burton, Erin Brady, Robin Brewer, Callie Neylan, Jeffrey P. Bigham, and Amy Hurst. ACM Conference on Computers and Accessibility (ASSETS), 2012.
    Paper
  • VizWiz: Nearly Real-time Answers to Visual Questions
    Jeffrey P. Bigham, Chandrika Jayant, Hanjie Ji, Greg Little, Andrew Miller, Robert C. Miller, Robin Miller, Aubrey Tatarowicz, Brandyn White, Samuel White, and Tom Yeh. ACM User Interface Software and Technology Symposium (UIST), 2010.
    Paper

Website: Nilavra Bhattacharya, University of Texas at Austin