
[Bug] Bounding box coordinates returned by the pose tracker API are not scaled correctly #2805

Open · 3 tasks done
wbudd opened this issue Aug 1, 2024 · 1 comment

wbudd commented Aug 1, 2024

Checklist

  • I have searched related issues but cannot get the expected help.
  • I have read the FAQ documentation but cannot get the expected help.
  • The bug has not been fixed in the latest version.

Describe the bug

Pose tracking results returned include both an array of pose detections and an array of the bounding boxes within which those poses were detected.

However, while the returned pose coordinates are scaled correctly relative to the original input image, the returned bounding boxes are not.

Furthermore, this is difficult for an API consumer to work around, because the intermediate image sizes and/or scale factors do not seem to be exposed by the API. In other words, the API consumer is given no clue what factor the bounding box coordinates must be multiplied by to get dimensions that make sense relative to the original input image.
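To illustrate what is missing: if the API did expose the per-axis scale factors applied before pose estimation, undoing them would be trivial. A minimal sketch (the `scale_x`/`scale_y` parameters are hypothetical — the current API does not expose them, which is the point of this report):

```python
def rescale_bbox(bbox, scale_x, scale_y):
    """Map a bbox (x1, y1, x2, y2) from model-input space back to
    original-image space, given hypothetical per-axis scale factors."""
    x1, y1, x2, y2 = bbox
    return (x1 / scale_x, y1 / scale_y, x2 / scale_x, y2 / scale_y)
```

Without those factors (or the intermediate image size they derive from), no such mapping can be computed on the consumer side.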

Reproduction

Simply run the mmdeploy-provided pose tracker demo for Python, or for any other supported language. As input, use one or more small images (640x480 or smaller) that clearly depict a person. The returned bounding box's bottom-right x/y coordinates will exceed the width/height of the entire input image.
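The symptom can be checked mechanically on the demo's output. A small helper (names are my own, not part of the mmdeploy API) that flags the impossible case described above:

```python
def bbox_exceeds_image(bbox, img_w, img_h):
    """Return True if the bbox's bottom-right corner lies outside the
    image, i.e. the coordinates cannot be in original-image space."""
    x1, y1, x2, y2 = bbox
    return x2 > img_w or y2 > img_h

# With a 640x480 input, a returned bbox like (10, 20, 800, 900)
# triggers this check -- which is exactly what the demo produces.
```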

Environment

The official Docker image provided by mmdeploy.

mmdet model used is `rtmdet-nano-ort`.
mmpose model used is `rtmw-m-trt-fp16`.

Both with default configurations, as generated by mmdeploy in accordance with the official docs.

(I suspect this bug applies to all or most model combinations, but I've only tested the above configuration.)

Error traceback

No response

@wbudd
Copy link
Author

wbudd commented Aug 2, 2024

Looking a bit more at the mmdeploy code, I see that the bounding boxes the API returns are the result of the affine transforms specified in the pose model's pipeline.json (under `tasks` → `transforms`, `"type": "TopDownAffine"`), with this being the corresponding code: https://github.com/open-mmlab/mmdeploy/blob/main/csrc/mmdeploy/codebase/mmpose/topdown_affine.cpp

The problem is that those resized bounding boxes are only relevant as input preparation for the pose model; they are not useful to an API consumer, who only knows the original image size and should instead receive the bounding box dimensions produced as output by the object detector model. The pose tracker API should preserve this information in the final result, but fails to do so.
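For context, the TopDownAffine step warps each detector bbox to the pose model's fixed input size; recovering original-image coordinates means inverting that warp. A minimal numpy sketch, under the simplifying assumption that the crop is a plain axis-aligned scale-and-translate (the real transform can also rotate and pad the crop to a fixed aspect ratio, which would need the full 2x3 matrix):

```python
import numpy as np

def invert_topdown_affine(points, bbox_xyxy, input_w, input_h):
    """Map points from the pose model's input space (input_w x input_h)
    back into original-image space, assuming the crop was an axis-aligned
    scale-and-translate of the detector bbox `bbox_xyxy`."""
    x1, y1, x2, y2 = bbox_xyxy
    sx = (x2 - x1) / input_w   # model-input pixel -> image pixel, x
    sy = (y2 - y1) / input_h   # model-input pixel -> image pixel, y
    pts = np.asarray(points, dtype=float)
    return pts * np.array([sx, sy]) + np.array([x1, y1])
```

This is exactly the inverse mapping the API consumer cannot perform today, because neither the crop bbox nor the model input size used for the warp is surfaced in the pose tracker's results.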
