Overview
Our goal for this workshop is to educate researchers about the technological needs of people with vision impairments while empowering researchers to improve algorithms to meet these needs. A key component of this event will be to track progress on four dataset challenges, where the tasks are to locate objects in few-shot learning scenarios, ground all answers, locate all plausible region of focus, and locate and track object and part instances. The second key component of this event will be a discussion about current research and application issues, including invited speakers from both academia and industry who will share their experiences in building today’s state-of-the-art assistive technologies as well as designing next-generation tools.

Important Dates
- Challenges go live: Friday, Feb 27 (11:59pm AoE)
- Challenge submissions due: Friday, May 1 (11:59pm AoE)
- Abstract submissions due: Friday, May 8 (11:59pm AoE)
- Abstract acceptance notifications: Friday, May 15 (11:59pm AoE)
- Half-day Workshop: Thursday, June 4
Submissions
We invite two types of submissions:
Challenge Submissions
We invite submissions about algorithms for the following four challenge tasks: locate objects in few-shot learning scenarios, ground all answers, locate all plausible region of focus, and locate and track object and part instances. We accept submissions for algorithms that are not published, currently under review, and already published.
The teams with the top-performing submissions will be invited to give short talks during the workshop.
Extended Abstracts
We invite submissions of extended abstracts on topics related to all challenge tasks as well as assistive technologies for people with visual impairments. Papers must be at most two pages (with references) and follow the CVPR format using the provided author kit. Reviewing will be single-blind and accepted papers will be presented as posters. We will accept submissions on work that is not published, currently under review, and already published. There will be no proceedings. Please send your extended abstracts to workshop@vizwiz.org.
Please note that we will require all camera-ready content to be accessible via a screen reader. Given that making accessible PDFs and presentations may be a new process for some authors, we will host training sessions beforehand to both educate and assist all authors to succeed in making their content accessible.
Challenge Results
- Few-Shot Localization of Private Object
1st place: 72.98: Ping-Lun Lee, Yun-Ching Kao, Cheng-Kuan Lin, Yu-Chee Tseng (National Yang Ming Chiao Tung University)
2nd place: 63.82: Heeseung Cho, Yuna Park, Esther Kim, Jiho Kim, Hyoju Kim, Christian Wallraven, Junhyoung Oh (Korea University & Seoul Women’s University)
3rd place: 0.52: Zihan Zhai, Tingting Li, Zhenyu Zhao, Xu Liu, Shuo Li, Fang Liu (Xidian University) - Grounding All Valid Answers
1st place: 86.32: Sicong Li, Qianqian Xu, Zhiyong Yang, Zitai Wang, Qingming Huang (CAS & University of Chinese Academy of Sciences)
2nd place: 85.12: Hao Liang, Yuanhang Tao, Meina Kan, Shiguang Shan, Xilin Chen (Chinese Academy of Sciences)Grounding All Focus Regions for Visual Questions - 1st place: 5.92: Hao Liang, Yuanhang Tao, Meina Kan, Shiguang Shan, Xilin Chen (Chinese Academy of Sciences)
Program
Location:
Room 709, Colorado Convention Center [map]
Address: 700 14th St, Denver, CO 80202
Schedule:
- 8:45-9:00am: Opening remarks and winner announcements for three challenges (Few-Shot Localization of Private Object, Answer Grounding, and Grounding Focus Regions)
- 9:00-9:30am: Invited talk and Q&A with Kate Saenko, AI research scientist at Meta
- Talk title: SAM 3: Segment Anything Model
- 9:30-10:00am: Invited talk and Q&A with Shaun Kane, research scientist in responsible AI at Google Research
- Talk title: Disabled People & Disabled Data
- 10:00-10:15am: Poster spotlight talks
- 10:15-10:30am: Break
- 10:30-11:00am: Invited talk and Q&A with Cordelia Schmid, research director at INRIA
- Talk title: Grounded and Efficient Video Understanding
- 11-11:30am: Invited talk and Q&A with Ramin Ayanzadeh, assistant professor in quantum computing and machine intelligence at University of Colorado Boulder
- Talk title: Parallel Lives Through the Same Eyes
- 11:30-12:15pm: Open Q&A panel with four invited speakers
- 12:15-12:20pm: Closing remarks
- 12:20-1:00pm: Poster session
Poster List:
- Agent2Seg: Agentic VLM-Guided Few-Shot Object Localization and Segmentation for Accessibility-Oriented Images
Ping-Lun Lee*, Yun-Ching Kao*, Cheng-Kuan Lin, Yu-Chee Tseng
Paper
- Resolving Perceptual Divergence via Multi-Output Formatting
Hao Liang, Yuanhang Tao, Meina Kan, Shiguang Shan, Xilin Chen
Paper
- Region-Enhanced Single-Grounding Prediction for VizWiz VQA
Sicong Li, Qianqian Xu, Zhiyong Yang, Zitai Wang, Qingming Huang
Paper
- VLM-Guided Detection and Rematching for Private Object Localization
Heeseung Cho, Yuna Park, Esther Kim, Jiho Kim, Hyojoo Kim
Paper
- Zero-Shot Focus Ambiguity Grounding via Multi-Output Formatting
Hao Liang, Yuanhang Tao, Meina Kan, Shiguang Shan, Xilin Chen
Paper
Invited Speakers:
Organizers

Danna Gurari
University of Colorado Boulder

Jeffrey Bigham
Carnegie Mellon University, Apple

Ed Cutrell
Microsoft

Neelima Prasad
University of Colorado Boulder

Zhuoheng Li
University of Colorado Boulder
Contact Us
For questions, comments, or feedback, please send them to Danna Gurari at danna.gurari@colorado.edu.



