Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prediction-level metadata in sv.Detections #1226

Closed
2 tasks done
PawelPeczek-Roboflow opened this issue May 24, 2024 · 21 comments · Fixed by roboflow/inference#574
Closed
2 tasks done

Prediction-level metadata in sv.Detections #1226

PawelPeczek-Roboflow opened this issue May 24, 2024 · 21 comments · Fixed by roboflow/inference#574
Assignees
Labels
enhancement New feature or request

Comments

@PawelPeczek-Roboflow
Copy link
Contributor

PawelPeczek-Roboflow commented May 24, 2024

Search before asking

  • I have searched the Supervision issues and found no similar feature requests.

Description

sv.Detections supports data aligned detection-major (one element for one detection), guarded by

def validate_data(data: Dict[str, Any], n: int) -> None:
    for key, value in data.items():
        if isinstance(value, list):
            if len(value) != n:
                raise ValueError(f"Length of list for key '{key}' must be {n}")
        elif isinstance(value, np.ndarray):
            if value.ndim == 1 and value.shape[0] != n:
                raise ValueError(f"Shape of np.ndarray for key '{key}' must be ({n},)")
            elif value.ndim > 1 and value.shape[0] != n:
                raise ValueError(
                    f"First dimension of np.ndarray for key '{key}' must have size {n}"
                )
        else:
            raise ValueError(f"Value for key '{key}' must be a list or np.ndarray")

Putting prediction in broader context sometimes requires information about the whole prediction. Instances when those would be needed:

  • one makes prediction on image previously cropped and wants to retain coordinates system conversion to transform bboxes and masks. To do it one only needs parent dimensions and shift in parent coordinates system. Currently if we wanted to put that into sv.Detections, we would need to broadcast that information into all detections - and (in absence of detections) this may not be possible (yet, would let more graceful sv.Detections processing downstream)

The solution that probably does not introduce breaking change would be:

@dataclass
class Detections:
    xyxy: np.ndarray
    mask: Optional[np.ndarray] = None
    confidence: Optional[np.ndarray] = None
    class_id: Optional[np.ndarray] = None
    tracker_id: Optional[np.ndarray] = None
    data: Dict[str, Union[np.ndarray, List]] = field(default_factory=dict)
    prediction_data: Dict[str, Any] = field(default_factory=dict)

# not including `prediction_data` into `__getitem__(...)` and iterations,
# but assuming direct access `detections.prediction_data[...]`

Use case

No response

Additional

No response

Are you willing to submit a PR?

  • Yes I'd like to help by submitting a PR!
@LinasKo
Copy link
Collaborator

LinasKo commented May 24, 2024

What we'd need to figure out too, is how two Detections with different prediction_data should interact.

Right now the interactions happens in:

  1. Detections.merge - aimed to stack contents of 2x Detections and produce 1x Detections.
  2. Detections._merge_inner_detections_objects in Add Non-Maximum Merging (NMM) to Detections #500, aimed to merge everything inside - e.g. boxes of N Detections into one Detections with one large box.

For each respective case, we can follow similar implementations we have for other variables:

  1. raise when different keys are defined, merge successfully when some are missing.
    • How to do the merge itself? I don't know.
  2. Keep the prediction_data of the 'winning' detection (highest confidence detection, or if both have None - first detection)

@PawelPeczek-Roboflow
Copy link
Contributor Author

👍 yeah, haven't thought about that
I guess that keys collisions may occur, in that particular case I would just create list of values, that should be quite predictable and easy to explain default behaviour. We may also expose additional parameter - on_prediction_data_merge which would be callable given list of metadata supposed to return merged dict. That way we let people decide

@LinasKo
Copy link
Collaborator

LinasKo commented May 24, 2024

Suppose it's metadata about source image. Would merging and keeping a list of [image_1.png, image_2.png] be useful? Also, do you have some specifics in mind on what you'd use it for?

@PawelPeczek-Roboflow
Copy link
Contributor Author

I would see this issue useful in broader context, not even specifically regarding image, just regarding anything providing context information about the prediction.
Merging makes sense (for instance if u keep track of "parent_ids" - then list of them is useful after merge).

Specific use-cases:
https://github.com/roboflow/inference/blob/d6d38f18d7a1209f9e5d176b1362650fa4999cd1/inference/core/workflows/core_steps/common/utils.py

@SkalskiP
Copy link
Collaborator

Hi @LinasKo and @PawelPeczek-Roboflow 👋🏻

  • do we need a separate field?
data: Dict[str, Union[np.ndarray, List]] = field(default_factory=dict)
prediction_data: Dict[str, Any] = field(default_factory=dict)
  • mechanisms that this change may impact:
    • sv.Detections.__getitem__
    • sv.Detections.__setitem__
    • sv.Detections.merge - I honestly don't like the idea of creating List[T] while we merge two sv.Detections with T extra data. The end user may expect that field to be of T type. Especially since there are a lot of API functions that internally run merge and users don't even know about it.
    • upcoming Add Non-Maximum Merging (NMM) to Detections #500 - what do we do when we literally merge two sv.Detections objects? @LinasKo
    • sv.CSVSink and sv.JSONSink. will those new values be saved into a CSV or JSON file?

@PawelPeczek-Roboflow
Copy link
Contributor Author

prediction_data could serve the role of container for data describing all detections, when data will remain detection-level.

Initially I thought we should not touch sv.Detections.__getitem__ and sv.Detections.__setitem__, as that would be very natural to operate on prediction_data directly, and that would probably not have a chance to introduce ambiguity in operations.

sv.Detections.merge - ok, so let's require resolver function whenever prediction_data is not empty

regarding sinks - as I believe they are supposed to dump detections one-by-one - we may let people decide if they want to broadcast metadata, conflicts with data keys may be resolved by adding prefix. This is the problem (just the opposite-way-around) that we are facing while having prediction-level metadata and broadcasting them into detection-level

@LinasKo
Copy link
Collaborator

LinasKo commented May 24, 2024

Re: literal merging & data: we previously spoke that users may add anything they wish into data, and that we should drop non-winning data. E.g. we can't easily merge two distinct class names into one. Same approach would apply here - we'd keep prediction_data of winning Detections.

@PawelPeczek-Roboflow
Copy link
Contributor Author

happy for that to be the case, as long as user can inject callable changing that behaviour

@SkalskiP
Copy link
Collaborator

sv.DetectionsSmoother.update_with_detections is example of call that will silently convert T type metadata into List[T]

@PawelPeczek-Roboflow
Copy link
Contributor Author

for this case more reasonable thing to do is probably override, or letting people decide what to do

@SkalskiP
Copy link
Collaborator

I think in places like sv.DetectionsSmoother.update_with_detections, we just need to come up with reasonable behavior. I don't want to introduce more parameters into that API.

@PawelPeczek-Roboflow
Copy link
Contributor Author

defaults should be there of-course, but I feel like not providing a way to alter the default behaviour as limiter

@SkalskiP
Copy link
Collaborator

If someone has a sv.Detections object and passes it through DetectionsSmoother.update_with_detections, they don't want that data to be mutated. They want their boxes to stop flickering.

@SkalskiP
Copy link
Collaborator

@PawelPeczek-Roboflow, how urgent is it? Do you want us to implement is? Or you or Grzegorz will take care of creating PR?

@PawelPeczek-Roboflow
Copy link
Contributor Author

PawelPeczek-Roboflow commented May 24, 2024

well, yes but we cannot predict what the use-cases would be for metadata - hence advocating for exposing a knob for the user
I am not particularly concern about this particular method, we can add knob later if that's needed there.
This is not urgent, we've solved this problem in some way that is fine for the time being - would just be good to have this supported in sv some time in the future.
We can take that, letting it to decide who would implement changes later, just want to reach conclusion if that change is ok and have high-level plan of implementation

@SkalskiP
Copy link
Collaborator

It is 100% okay. Most of my concerns come down to implementation details.

@PawelPeczek-Roboflow
Copy link
Contributor Author

ok, so lets resolve as we go

@LinasKo
Copy link
Collaborator

LinasKo commented Aug 15, 2024

Hey @PawelPeczek-Roboflow, what was the outcome? Are you using Detections.data and storing a copy of the metadata for each detected object? Or do you no longer need it?

@PawelPeczek-Roboflow
Copy link
Contributor Author

not sure why it got closed tbh

@LinasKo
Copy link
Collaborator

LinasKo commented Nov 4, 2024

Implemented via #1589

@PawelPeczek-Roboflow, you still need it, you may use it like so:
detections.metadata["camera_id"] = 42.

Caveats:

  • Merging 2x empty detections with metadata does not preserve it.
  • metadata is not yet saved in CSVSink and JSONSink

@LinasKo LinasKo closed this as completed Nov 4, 2024
@PawelPeczek-Roboflow
Copy link
Contributor Author

cool thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants