Add OVD local inference example #359
Open
Travor278 wants to merge 1 commit into
Description
Adds a small `examples/ovd` gallery for running the released VLM-R1 OVD checkpoint on local images.
The example includes:
- `omlab/VLM-R1-Qwen2.5VL-3B-OVD-0321`
- `<answer>` JSON, Markdown-fenced JSON, and optional `json_repair`
- `detections.json`, `raw_output.txt`, `prompt.txt`, and a single-case HTML report
This is additive only and does not change training, evaluation, or model-loading defaults elsewhere in the repository.
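The three output shapes named above (`<answer>` JSON, Markdown-fenced JSON, and an optional `json_repair` fallback) suggest a layered parser. The following is a minimal sketch of that idea, not the code in this PR: `parse_detections` is a hypothetical name, and the `bbox_2d`/`label` keys in the usage note are an assumed detection format.

```python
import json
import re

# Build the Markdown fence marker programmatically so this snippet can
# itself be embedded in Markdown without breaking the surrounding fence.
FENCE = "`" * 3
ANSWER_RE = re.compile(r"<answer>(.*?)</answer>", re.DOTALL)
FENCE_RE = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)


def parse_detections(raw_output):
    """Hypothetical helper: return parsed JSON from raw model text, or None."""
    text = raw_output

    # 1. Prefer the content of an <answer>...</answer> block when present.
    match = ANSWER_RE.search(text)
    if match:
        text = match.group(1)

    # 2. Unwrap a Markdown code fence if the JSON is wrapped in one.
    match = FENCE_RE.search(text)
    if match:
        text = match.group(1)

    text = text.strip()
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass

    # 3. Optional fallback: only used when json_repair is installed.
    try:
        from json_repair import repair_json
    except ImportError:
        return None
    try:
        return json.loads(repair_json(text))
    except json.JSONDecodeError:
        return None
```

For example, `parse_detections('<answer>[{"bbox_2d": [1, 2, 3, 4], "label": "person"}]</answer>')` returns a one-element list of dicts. Trying strict `json.loads` first keeps well-formed output untouched; the repair step runs only on text that fails to parse.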
Gallery Preview
The gallery below was generated locally with `examples/ovd/build_gallery.py` from multiple OVD output directories. The checked-in builder supports the same switchable layout for any number of `--case` entries.
Related Issue
Closes #200.
Related to #232 and #306 because the example supports multiple bounding boxes and comma-separated target labels. It may also help users debugging OVD demo setup issues such as #297, but it does not change the hosted demo or add OVDEval evaluation templates.
Motivation and Context
Several users asked for a runnable OVD inference path and the exact prompt shape needed to get bounding-box JSON from the released OVD model. Pointing users only to the hosted Space makes local debugging harder, especially when they need to inspect raw model output, parsed boxes, and the rendered result.
This example keeps the surface small: one local image in, annotated output and JSON artifacts out.
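As a rough illustration of the "JSON artifacts out" half of that contract, the sketch below writes the three text artifacts named in the description into an output directory. `write_artifacts` and its signature are hypothetical and are not taken from `infer_ovd.py`; the file names come from the description, and the detection dict shape is an assumption.

```python
import json
from pathlib import Path


def write_artifacts(output_dir, prompt, raw_output, detections):
    """Hypothetical helper: persist the prompt, raw text, and parsed boxes."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)

    # Keep the exact prompt and the unparsed model text for debugging.
    (out / "prompt.txt").write_text(prompt, encoding="utf-8")
    (out / "raw_output.txt").write_text(raw_output, encoding="utf-8")

    # Store parsed detections as pretty-printed JSON.
    (out / "detections.json").write_text(
        json.dumps(detections, indent=2), encoding="utf-8"
    )
    return out
```

Keeping the raw output next to the parsed JSON makes it possible to tell a model failure from a parsing failure after the fact.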
How Has This Been Tested?
Local environment:
Commands run:
    python examples/ovd/infer_ovd.py \
        --image examples/ovd/assets/person.jpg \
        --labels person \
        --output-dir outputs/ovd_person \
        --max-memory "cuda:7GiB,cpu:24GiB" \
        --local-files-only
Result: parsed 4 detections.
Result: parsed 3 detections.
Result: parsed 3 detections.
Result: wrote `outputs/ovd_gallery/index.html`.
Checklist
`outputs/` files out of the committed changes.