Add OVD local inference example by Travor278 · Pull Request #359 · om-ai-lab/VLM-R1

Travor278 · 2026-05-12T01:24:50Z

Description

Adds a small examples/ovd gallery for running the released VLM-R1 OVD checkpoint on local images.

The example includes:

a CLI inference script for omlab/VLM-R1-Qwen2.5VL-3B-OVD-0321
the OVD prompt template used by the public demo flow
robust parsing of <answer>-wrapped JSON and Markdown-fenced JSON, with optional json_repair support
annotated image output plus detections.json, raw_output.txt, prompt.txt, and a single-case HTML report
a small gallery builder for comparing multiple local inference outputs
three bundled smoke-test images covering single-class, multi-class, and multi-object outputs
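
For reviewers who want a feel for the parsing step listed above, here is a minimal sketch of the idea. It is not the code in examples/ovd/infer_ovd.py: the helper name extract_detections and the assumed detection shape (a list of objects with "bbox_2d" and "label" keys) are illustrative assumptions. The only third-party call is repair_json from the json_repair package, which is treated as optional.

```python
import json
import re

FENCE = "`" * 3  # Markdown code-fence marker
ANSWER_RE = re.compile(r"<answer>(.*?)</answer>", re.DOTALL)
FENCE_RE = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)


def extract_detections(raw_output):
    """Return a list of detection dicts parsed from raw model text, or []."""
    text = raw_output
    # 1. Prefer the content of an <answer>...</answer> block when present.
    match = ANSWER_RE.search(text)
    if match:
        text = match.group(1)
    # 2. Unwrap a Markdown-fenced JSON block if the model emitted one.
    match = FENCE_RE.search(text)
    if match:
        text = match.group(1)
    text = text.strip()
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        # 3. Optional fallback: json_repair is third-party and may be absent.
        try:
            from json_repair import repair_json
        except ImportError:
            return []
        try:
            data = json.loads(repair_json(text))
        except json.JSONDecodeError:
            return []
    # Accept a single object as a one-element list (an assumption of this sketch).
    if isinstance(data, dict):
        data = [data]
    if not isinstance(data, list):
        return []
    return [item for item in data if isinstance(item, dict)]
```

Trying strict json.loads first means the repair step only runs on output that is actually malformed, so well-formed answers are never altered.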

This is additive only and does not change training, evaluation, or model-loading defaults elsewhere in the repository.

Gallery Preview

The gallery below was generated locally with examples/ovd/build_gallery.py from multiple OVD output directories. The checked-in builder supports the same switchable layout for any number of --case entries.

Related Issue

Closes #200.

This PR is also related to #232 and #306, because the example supports multiple bounding boxes and comma-separated target labels. It may help users debug OVD demo setup issues such as #297, but it does not change the hosted demo or add OVDEval evaluation templates.

Motivation and Context

Several users asked for a runnable OVD inference path and the exact prompt shape needed to get bounding-box JSON from the released OVD model. Pointing users only to the hosted Space makes local debugging harder, especially when they need to inspect raw model output, parsed boxes, and the rendered result.

This example keeps the surface small: one local image in, annotated output and JSON artifacts out.

How Has This Been Tested?

Local environment:

Python 3.11
PyTorch 2.7.1+cu128
Transformers 5.8.0
NVIDIA GeForce RTX 5070 Laptop GPU, 8GB VRAM

Commands run:

python -m py_compile examples/ovd/infer_ovd.py examples/ovd/build_gallery.py

python examples/ovd/infer_ovd.py \
  --image examples/ovd/assets/person.jpg \
  --labels person \
  --output-dir outputs/ovd_person \
  --max-memory "cuda:7GiB,cpu:24GiB" \
  --local-files-only

Result: parsed 4 detections.
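
The --max-memory value used above is a comma-separated list of DEVICE:SIZE pairs. One plausible way to turn it into the max_memory dict that Transformers accepts in from_pretrained (integer GPU index or "cpu" as keys, size strings as values) is sketched below. The function name parse_max_memory and the rule that a bare "cuda" means GPU 0 are assumptions for illustration, not a description of what infer_ovd.py actually does.

```python
def parse_max_memory(spec):
    """Turn 'cuda:7GiB,cpu:24GiB' into a max_memory-style dict."""
    limits = {}
    for entry in spec.split(","):
        entry = entry.strip()
        if not entry:
            continue
        device, sep, amount = entry.partition(":")
        device, amount = device.strip().lower(), amount.strip()
        if not sep or not device or not amount:
            raise ValueError(f"expected DEVICE:SIZE, got {entry!r}")
        if device == "cuda":
            key = 0  # assumption: bare "cuda" refers to the first GPU
        elif device.isdigit():
            key = int(device)  # explicit GPU index such as "1:10GiB"
        else:
            key = device  # e.g. "cpu"
        limits[key] = amount
    return limits
```

With the flag value from the commands in this section, the sketch yields {0: "7GiB", "cpu": "24GiB"}, which could then be passed as max_memory together with device_map="auto".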

python examples/ovd/infer_ovd.py \
  --image examples/ovd/assets/drinks_fruit.jpg \
  --labels "drink,fruit" \
  --output-dir outputs/ovd_drinks_fruit \
  --max-memory "cuda:7GiB,cpu:24GiB" \
  --local-files-only

Result: parsed 3 detections.

python examples/ovd/infer_ovd.py \
  --image examples/ovd/assets/desk.png \
  --labels "keyboard,white cup,laptop" \
  --output-dir outputs/ovd_desk \
  --max-memory "cuda:7GiB,cpu:24GiB" \
  --local-files-only

Result: parsed 3 detections.

python examples/ovd/build_gallery.py \
  --case "Person=outputs/ovd_person" \
  --case "Drinks/Fruit=outputs/ovd_drinks_fruit" \
  --case "Desk=outputs/ovd_desk"

Result: wrote outputs/ovd_gallery/index.html.
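
The repeated --case "Name=dir" arguments can be handled with standard argparse features. The following is a minimal sketch under that assumption, not the contents of build_gallery.py; parse_case and build_parser are hypothetical names.

```python
import argparse
from pathlib import Path


def parse_case(value):
    """Split 'Title=path/to/output_dir' into (title, Path)."""
    # Split on the first "=" only, so titles such as "Drinks/Fruit" survive.
    title, sep, path = value.partition("=")
    if not sep or not title.strip() or not path.strip():
        raise argparse.ArgumentTypeError(f"expected NAME=DIR, got {value!r}")
    return title.strip(), Path(path.strip())


def build_parser():
    parser = argparse.ArgumentParser(description="Sketch of a gallery CLI")
    parser.add_argument(
        "--case",
        action="append",  # collect every occurrence into a list
        type=parse_case,  # convert each occurrence before appending
        required=True,
        metavar="NAME=DIR",
        help="gallery entry title and the inference output directory",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    for title, directory in args.case:
        print(f"{title}: {directory}")
```

Using action="append" keeps the order in which cases were given on the command line, which is a natural order for the gallery entries.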

Checklist

Added an additive example without changing training, evaluation, or model-loading defaults.
Included local inference artifacts for raw output, parsed detections, annotated image, and HTML report generation.
Tested single-label, multi-label, and multi-object OVD examples locally.
Tested the gallery builder on multiple generated output directories.
Kept generated outputs/ files out of the committed changes.

Add OVD local inference example

bc96924

Travor278 marked this pull request as ready for review May 12, 2026 01:32

Travor278 wants to merge 1 commit into om-ai-lab:main from Travor278:add-ovd-local-inference-example
