Hi, thank you for this amazing work!
I have a question about the object transformation from canonical to camera space. I have an indoor scene image (from HyperSim) which looks like this:

The result I got from the online demo is pretty good (shown in the image below).

However, when I tested the released code and followed demo_multi_object.ipynb, the result (rendered GIF) looks very different from the online demo.

Also, when I transform the output glb mesh with the estimated scale, rotation, and translation, the estimated pose doesn't seem to be accurate (visualized in Blender, figure below).
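
For reference, here is a minimal sketch of the kind of canonical-to-camera transform described above. It assumes the order scale, then rotate, then translate, and a rotation given as a (w, x, y, z) quaternion; both are assumptions, and the function and argument names are hypothetical rather than taken from the released code.

```python
import numpy as np


def quat_to_matrix(q):
    """Convert a quaternion in (w, x, y, z) order to a 3x3 rotation matrix."""
    q = np.asarray(q, dtype=float)
    w, x, y, z = q / np.linalg.norm(q)  # normalize so the result is a pure rotation
    return np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ])


def canonical_to_camera(vertices, scale, quat_wxyz, translation):
    """Map (N, 3) canonical-space vertices to camera space.

    Assumed convention (not confirmed by the source): p_cam = R @ (s * p) + t.
    """
    vertices = np.asarray(vertices, dtype=float)
    rotation = quat_to_matrix(quat_wxyz)
    scaled = vertices * np.asarray(scale, dtype=float)  # per-axis or uniform scale
    # Row-vector form of R @ p: multiply by the transpose on the right.
    return scaled @ rotation.T + np.asarray(translation, dtype=float)
```

If the released code uses a different composition order or quaternion layout (for example (x, y, z, w)), or if the camera frame uses a different axis convention than the glb file (glTF is Y-up, and Blender converts to Z-up on import), that mismatch alone could produce misplaced objects like the ones in my visualization.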

I think the error is understandable when the object is cropped (e.g., the towel near the camera, the metal ring), but I don't understand why objects that are fully visible (e.g., the brown box, duck, red bottle, bathtub) have such odd spatial relationships. Is there anything that I'm doing wrong?
Again, thanks for your work, and I'll be grateful if you can provide any suggestions.