Offline Benchmarking Service

Announcements

[Aug 15, 2025] Our EvalAI evaluation servers are currently under maintenance. We apologize for the inconvenience.

[Sep 10, 2025] While the servers are down, we’re offering an offline evaluation service for benchmarking. Please read the instructions below carefully, and check our website announcements for updates about the online evaluation server.

Supported Dataset Benchmarks

We accept requests for the following challenges:

  1. Answer visual questions
  2. Ground answers
  3. Recognize visual questions with multiple answer groundings
  4. Locate objects in few-shot learning scenarios
  5. Classify images in a zero-shot setting

Submit A Request

To submit a request, email vizwiz.ivc@gmail.com with:

  • The name of the challenge you’d like to benchmark
  • Your prediction file(s), attached in the exact format specified on the challenge webpage
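
Because formatting problems mean no metrics are returned, it can help to sanity-check a prediction file before emailing it. The following is a minimal sketch, not the official validator: it assumes the file is a JSON list of objects, and the function name check_predictions and the key names "image" and "answer" are hypothetical placeholders. The required format is the one given on each challenge webpage.

```python
import json


def check_predictions(text, required_keys):
    """Return a list of problems found in the text of a JSON prediction file.

    Assumption: predictions are a JSON list of objects. The keys to require
    are supplied by the caller and must be taken from the challenge webpage.
    """
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(data, list):
        return ["top level must be a list of prediction entries"]

    problems = []
    for i, entry in enumerate(data):
        if not isinstance(entry, dict):
            problems.append(f"entry {i}: expected an object")
            continue
        # Report any required key that this entry lacks.
        missing = sorted(set(required_keys) - set(entry))
        if missing:
            problems.append(f"entry {i}: missing keys {missing}")
    return problems
```

To use it, read your file's contents and pass the keys your challenge requires, for example check_predictions(open("predictions.json").read(), ["image", "answer"]). An empty list means no problems were found by this check. It does not guarantee that the submission will pass the actual evaluation.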

Benchmarking Results

  • We’ll reply to your original email with the results.
  • We will not post results publicly, but you may cite them in academic papers or technical reports.
  • Reported metrics will be the same ones shown for the test-standard phase on the leaderboard.
    • Example (illustrative format only): Accuracy: 0.91, Recall: 0.72, F1: 0.85
  • If your submission has formatting or other issues, we will return the error messages in a format similar to the stdout files on EvalAI. We cannot provide metric results when errors occur.
  • This offline service is not for workshop competitions.

Turnaround: Please allow up to one week for processing. Thank you for your patience.

For any questions regarding the offline benchmarking service, please email us at vizwiz.ivc@gmail.com. Thank you for your understanding!