2026 VizWiz Grand Challenge Workshop

Overview

Our goal for this workshop is to educate researchers about the technological needs of people with vision impairments while empowering researchers to improve algorithms to meet these needs. One key component of this event will be to track progress on four dataset challenges, where the tasks are to locate objects in few-shot learning scenarios, ground all answers, locate all plausible regions of focus, and locate and track object and part instances. The second key component will be a discussion about current research and application issues, featuring invited speakers from both academia and industry who will share their experiences in building today’s state-of-the-art assistive technologies as well as designing next-generation tools.

Banner illustrating VizWiz Dataset Challenge tasks in four columns: (1) Few-shot localization of private objects, with examples like an empty pill bottle and a bank statement. (2) Grounding all answers, showing questions with multiple valid answers (e.g., identifying sugar type or airplane type). (3) Grounding all focus regions for visual questions, highlighting multiple regions linked to different answers (e.g., clock times, cleaning products, bird location). (4) Hierarchical instance tracking, showing sequential images tracking a pill bottle across views.

Important Dates

  • Challenges go live: Friday, Feb 27 (11:59pm AoE)
  • Challenge submissions due: Friday, May 1 (11:59pm AoE)
  • Abstract submissions due: Friday, May 8 (11:59pm AoE)
  • Abstract acceptance notifications: Friday, May 15 (11:59pm AoE)
  • Half-day Workshop: Thursday, June 4

Submissions

We invite two types of submissions:

Challenge Submissions

We invite submissions about algorithms for the following four challenge tasks: locate objects in few-shot learning scenarios, ground all answers, locate all plausible regions of focus, and locate and track object and part instances. We accept submissions for algorithms that are unpublished, currently under review, or already published.

The teams with the top-performing submissions will be invited to give short talks during the workshop.

Extended Abstracts

We invite submissions of extended abstracts on topics related to all challenge tasks as well as assistive technologies for people with visual impairments. Papers must be at most two pages (with references) and follow the CVPR format using the provided author kit. Reviewing will be single-blind, and accepted papers will be presented as posters. We will accept submissions on work that is unpublished, currently under review, or already published. There will be no proceedings. Please send your extended abstracts to workshop@vizwiz.org.

Please note that we will require all camera-ready content to be accessible via a screen reader. Given that making accessible PDFs and presentations may be a new process for some authors, we will host training sessions beforehand to educate and assist all authors in making their content accessible.

Challenge Results

  • Few-Shot Localization of Private Object
    1st place: 72.98: Ping-Lun Lee, Yun-Ching Kao, Cheng-Kuan Lin, Yu-Chee Tseng (National Yang Ming Chiao Tung University)
    2nd place: 63.82: Heeseung Cho, Yuna Park, Esther Kim, Jiho Kim, Hyoju Kim, Christian Wallraven, Junhyoung Oh (Korea University & Seoul Women’s University)
    3rd place: 0.52: Zihan Zhai, Tingting Li, Zhenyu Zhao, Xu Liu, Shuo Li, Fang Liu (Xidian University)
  • Grounding All Valid Answers
    1st place: 86.32: Sicong Li, Qianqian Xu, Zhiyong Yang, Zitai Wang, Qingming Huang (CAS & University of Chinese Academy of Sciences)
    2nd place: 85.12: Hao Liang, Yuanhang Tao, Meina Kan, Shiguang Shan, Xilin Chen (Chinese Academy of Sciences)
  • Grounding All Focus Regions for Visual Questions
    1st place: 5.92: Hao Liang, Yuanhang Tao, Meina Kan, Shiguang Shan, Xilin Chen (Chinese Academy of Sciences)

Program

Location:

Room 709, Colorado Convention Center
Address: 700 14th St, Denver, CO 80202

Schedule:

  • 8:45-9:00am: Opening remarks and winner announcements for three challenges (Few-Shot Localization of Private Object, Answer Grounding, and Grounding Focus Regions)
  • 9:00-9:30am: Invited talk and Q&A with Kate Saenko, AI research scientist at Meta
    • Talk title: SAM 3: Segment Anything Model
  • 9:30-10:00am: Invited talk and Q&A with Shaun Kane, research scientist in responsible AI at Google Research
    • Talk title: Disabled People & Disabled Data
  • 10:00-10:15am: Poster spotlight talks
  • 10:15-10:30am: Break
  • 10:30-11:00am: Invited talk and Q&A with Cordelia Schmid, research director at INRIA
    • Talk title: Grounded and Efficient Video Understanding
  • 11:00-11:30am: Invited talk and Q&A with Ramin Ayanzadeh, assistant professor in quantum computing and machine intelligence at University of Colorado Boulder
    • Talk title: Parallel Lives Through the Same Eyes
  • 11:30am-12:15pm: Open Q&A panel with the four invited speakers
  • 12:15-12:20pm: Closing remarks
  • 12:20-1:00pm: Poster session

Poster List:

  • Agent2Seg: Agentic VLM-Guided Few-Shot Object Localization and Segmentation for Accessibility-Oriented Images
    Ping-Lun Lee*, Yun-Ching Kao*, Cheng-Kuan Lin, Yu-Chee Tseng
    Paper
  • Resolving Perceptual Divergence via Multi-Output Formatting
    Hao Liang, Yuanhang Tao, Meina Kan, Shiguang Shan, Xilin Chen
    Paper
  • Region-Enhanced Single-Grounding Prediction for VizWiz VQA
    Sicong Li, Qianqian Xu, Zhiyong Yang, Zitai Wang, Qingming Huang
    Paper
  • VLM-Guided Detection and Rematching for Private Object Localization
    Heeseung Cho, Yuna Park, Esther Kim, Jiho Kim, Hyojoo Kim
    Paper
  • Zero-Shot Focus Ambiguity Grounding via Multi-Output Formatting
    Hao Liang, Yuanhang Tao, Meina Kan, Shiguang Shan, Xilin Chen
    Paper

Invited Speakers:

Cordelia Schmid
Research Director
INRIA

Kate Saenko
Research Scientist/Professor
Meta/Boston University

Shaun Kane
Research Scientist
Google

Ramin Ayanzadeh
Assistant Professor
University of Colorado Boulder

Organizers

Danna Gurari
University of Colorado Boulder

Jeffrey Bigham
Carnegie Mellon University, Apple

Ed Cutrell
Microsoft

Neelima Prasad
University of Colorado Boulder

Zhuoheng Li
University of Colorado Boulder

Contact Us

For questions, comments, or feedback, please contact Danna Gurari at danna.gurari@colorado.edu.