Overview
Our goal for this workshop is to educate researchers about the technological needs of people with vision impairments while empowering them to improve algorithms to meet these needs. A key component of this event will be to track progress on a new dataset challenge, where the task is to caption images taken by people who are blind. Winners of this challenge will receive awards sponsored by Microsoft. The second key component of this event will be a discussion of current research and application issues, led by invited speakers from both academia and industry who will share their experiences building today’s state-of-the-art assistive technologies and designing next-generation tools.
Important Dates
- February: challenge submissions announced
- Friday, May 22 (extended from Friday, April 24) [5:59pm Central Standard Time]: extended abstracts due
- Friday, May 29 (extended from Monday, May 4) [5:59pm Central Standard Time]: notification to authors about decisions for extended abstracts
- Monday, June 1 (extended from Friday, May 15) [5:59pm Central Standard Time]: challenge submissions due
- Sunday, June 14: all-day workshop
Submissions
We invite two types of submissions:
Challenge Submissions
We invite submissions of results from algorithms for the image captioning challenge task. We accept submissions for algorithms that are unpublished, currently under review, or already published. The teams with the top-performing submissions will be invited to give short talks during the workshop. The top three teams will receive financial awards sponsored by Microsoft:
- 1st place: $10,000 Microsoft Azure credit
- 2nd place: $10,000 Microsoft Azure credit
- 3rd place: $10,000 Microsoft Azure credit
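For teams assembling a challenge submission, the sketch below illustrates one common way to package predicted captions. It assumes a COCO-style results file (a JSON list of image_id/caption records) and uses a hypothetical `my_captioner` stand-in for a model; the authoritative format is the one specified on the challenge evaluation page.

```python
import json

# Minimal sketch of packaging predicted captions for upload.
# Assumptions: a COCO-style results file (a list of {"image_id", "caption"}
# records) and hypothetical image ids/filenames; check the challenge page
# for the actual required format.

def my_captioner(image_path):
    # Placeholder for running your captioning model on one image.
    return "a hand holding a can of soup above a kitchen counter"

test_images = {1: "VizWiz_test_00000001.jpg", 2: "VizWiz_test_00000002.jpg"}

results = [
    {"image_id": image_id, "caption": my_captioner(path)}
    for image_id, path in test_images.items()
]

with open("captions_test_results.json", "w") as f:
    json.dump(results, f)
```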
Extended Abstracts
We invite submissions of extended abstracts on topics related to image captioning and assistive technologies for people with visual impairments. Papers must be at most two pages (including references) and follow the CVPR formatting guidelines using the provided author kit. Reviewing will be single-blind, and accepted papers will be presented as posters. We accept submissions on work that is unpublished, currently under review, or already published. There will be no proceedings. Please send your extended abstracts to workshop@vizwiz.org.
Program
Location:
The event is being held virtually.
Schedule:
- 9:00-9:10am: Opening remarks by Danna Gurari (video)
- 9:10-9:30am: Invited speaker Meredith Morris (video)
- 9:30-9:50am: Invited speaker Anirudh Koul (video)
- 9:50-10:10am: Invited speaker Chieko Asakawa
- 10:10-10:30am: Invited speaker Shiri Azenkot (video)
- 10:30-10:45am: Break
- 10:45-11:30am: Live panel with invited speakers from the morning (video)
- 11:30-12:30pm: Lunch break
- 12:30-12:40pm: Overview of challenge, winner announcements, and analysis of results by Yinan Zhao (video)
- 12:40-12:55pm: Talks by top-3 teams on the dataset challenge
- 12:55-1:30pm: Poster session (For interactive Q&A with authors, please click the Q&A link for each paper listed in the Poster List below. CVPR 2020 registration required for Q&A.)
- 1:30-2:30pm: Panel with blind technology advocates Cynthia Bennett, Chancey Fleet, and Venkatesh Potluri (video)
- 2:30-2:50pm: Break
- 2:50-3:10pm: Invited speaker Peter Anderson (video)
- 3:10-3:30pm: Invited speaker Kate Saenko (video)
- 3:30-3:45pm: Break
- 3:45-4:30pm: Live panel with invited speakers from the afternoon (video)
- 4:30-4:50pm: Open discussion
- 4:50-5:00pm: Closing remarks by Danna Gurari (video)
Panelists:
Venkatesh Potluri
University of Washington
Cynthia Bennett
Carnegie Mellon University, Apple
Chancey Fleet
New York Public Library, Data and Society Research Institute
Poster List:
- Self-Critical Sequence Training for Image Captioning using Bayesian “baseline”
  Shashank Bujimalla*, Mahesh Subedar*, Omesh Tickoo
  paper | video
- Uncertainty quantification in image captioning models
  Shashank Bujimalla*, Mahesh Subedar*, Omesh Tickoo
  paper | video
- Hybrid Information of Transformer for Image Captioning
  Yuchen Ren, Ziqiang Chen, Jinyu Hu, Lei Chen
  paper | video
- Japanese Coins and Banknotes Recognition for Visually Impaired People
  Huyen T. T. Bui, Man M. Ho, Xiao Peng, Jinjia Zhou
  paper | video
- Vizwiz Image Captioning based on AoANet with Scene Graph
  Suwon Kim, HongYong Choi, JoongWon Hwang, JangYoung Song, SangRok Lee, TaeKang Woo
  paper | video
- On the use of human reference data for evaluating automatic image descriptions
  Emiel van Miltenburg
  paper | video
- Alleviating Noisy Data in Image Captioning with Cooperative Distillation
  Pierre Dognin, Igor Melnyk, Youssef Mroueh, Inkit Padhi, Mattia Rigotti, Jarret Ross, Yair Schiff
  paper | video
- Exploring Weaknesses of VQA Models through Attribution Driven Insights
  Shaunak Halbe
  paper | video
Organizers
Danna Gurari
University of Texas at Austin
Jeffrey Bigham
Carnegie Mellon University, Apple
Meredith Morris
Microsoft
Ed Cutrell
Microsoft
Abigale Stangl
University of Texas at Austin
Yinan Zhao
University of Texas at Austin
Contact Us
For questions, comments, or feedback, please contact Danna Gurari at danna.gurari@ischool.utexas.edu.