RSDK-6381 - Add optional label renaming to detection transform camera #3538

Open: wants to merge 2 commits into base: main
12 changes: 9 additions & 3 deletions components/camera/transformpipeline/detector.go
@@ -20,9 +20,10 @@ import (

// detectorConfig is the attribute struct for detectors (their name as found in the vision service).
type detectorConfig struct {
DetectorName string `json:"detector_name"`
ConfidenceThreshold float64 `json:"confidence_threshold"`
ValidLabels []string `json:"valid_labels"`
DetectorName string `json:"detector_name"`
ConfidenceThreshold float64 `json:"confidence_threshold"`
ValidLabels []string `json:"valid_labels"`
LabelRenamer map[string]string `json:"label_renamer,omitempty"`
Member

Making the field an agent noun makes it seem to me as though the config requires a function. I would prefer a name like "rename_labels".

Member Author

Okay done.
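
For reference, a minimal sketch of how the renamed attribute might decode from JSON. The Go field name `RenameLabels`, the `rename_labels` tag, and the helper `parseConfig` are assumptions based on the suggestion above, not the merged code; the detector name in the example is made up.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// detectorConfig is a sketch of the attribute struct after the rename.
// ASSUMPTION: the field is called RenameLabels with tag "rename_labels".
type detectorConfig struct {
	DetectorName        string            `json:"detector_name"`
	ConfidenceThreshold float64           `json:"confidence_threshold"`
	ValidLabels         []string          `json:"valid_labels"`
	RenameLabels        map[string]string `json:"rename_labels,omitempty"`
}

// parseConfig is a hypothetical helper that decodes raw JSON attributes.
func parseConfig(raw string) (detectorConfig, error) {
	var conf detectorConfig
	err := json.Unmarshal([]byte(raw), &conf)
	return conf, err
}

func main() {
	conf, err := parseConfig(`{
		"detector_name": "my_detector",
		"rename_labels": {"car": "vehicle"}
	}`)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	// When the key is omitted, RenameLabels stays nil, which is safe to read from.
	fmt.Println(conf.DetectorName, conf.RenameLabels)
}
```

With a plain key-to-value map like this, the attribute reads as data ("rename these labels") rather than as something that needs a function.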

}

// detectorSource takes an image from the camera, and overlays the detections from the detector.
@@ -31,6 +32,7 @@ type detectorSource struct {
detectorName string
labelFilter objectdetection.Postprocessor // must build from ValidLabels
confFilter objectdetection.Postprocessor
labelRenamer objectdetection.Postprocessor
r robot.Robot
}

@@ -61,11 +63,14 @@ func newDetectionsTransform(
validLabels[strings.ToLower(l)] = struct{}{}
}
labelFilter := objectdetection.NewLabelFilter(validLabels)
labelRenamer := objectdetection.NewLabelRenamer(conf.LabelRenamer)

detector := &detectorSource{
gostream.NewEmbeddedVideoStream(source),
conf.DetectorName,
labelFilter,
confFilter,
labelRenamer,
r,
}
src, err := camera.NewVideoSourceFromReader(ctx, detector, &cameraModel, camera.ColorStream)
@@ -96,6 +101,7 @@ func (ds *detectorSource) Read(ctx context.Context) (image.Image, func(), error)
// overlay detections of the source image
dets = ds.confFilter(dets)
dets = ds.labelFilter(dets)
dets = ds.labelRenamer(dets)

res, err := objectdetection.Overlay(img, dets)
if err != nil {
19 changes: 19 additions & 0 deletions vision/objectdetection/postprocessor.go
@@ -51,6 +51,25 @@ func NewLabelFilter(labels map[string]interface{}) Postprocessor {
}
}

// NewLabelRenamer renames the labels in the input map from the key to the value.
func NewLabelRenamer(labels map[string]string) Postprocessor {
return func(in []Detection) []Detection {
if len(labels) < 1 {
Member

I learned today that you can call len() on a nil map and it will not panic - thanks for this
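
A small illustration of the nil-map behavior mentioned above, in case it helps other readers. The helper names `renameCount` and `lookup` are made up for this sketch; the language behavior itself (length, reads, and ranging are safe on a nil map, writes are not) is standard Go.

```go
package main

import "fmt"

// renameCount reports how many rename rules a map holds.
// It is safe to call with a nil map: len of a nil map is 0.
func renameCount(labels map[string]string) int {
	return len(labels)
}

// lookup reads from a possibly-nil map; a read on a nil map
// returns the zero value and ok == false.
func lookup(labels map[string]string, key string) (string, bool) {
	v, ok := labels[key]
	return v, ok
}

func main() {
	var labels map[string]string // nil, as when the config omits the map

	fmt.Println(renameCount(labels))
	_, ok := lookup(labels, "car")
	fmt.Println(ok)

	// Ranging over a nil map runs zero iterations.
	for range labels {
		fmt.Println("never reached")
	}

	// Only writes panic: labels["car"] = "vehicle" would panic on a nil map.
}
```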

return in
}

for oldL, newL := range labels {
for i, d := range in {
Comment on lines +61 to +62
Member

I dislike that you have to iterate over the entire map for every detection, though I understand that prefix matching requires it. I suggest an alternative, because I think ALWAYS matching on a prefix can cause problems (e.g. if my detector knows about "car", "carrot", and "carnations", the rule will catch all of them even if I only wanted to replace "car").

The alternative:

  • if there is a * at the end of the key in the label map, then you do a prefix match
  • if there is no * at the end of the key in the label map, you do an exact match

When first defining NewLabelRenamer, you split the map into two maps, the exact matcher map, and the prefix matcher map. Then you don't have to iterate over all of the labels all the time.
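
The split described above could be sketched roughly as follows. This is only an illustration under stated assumptions: `newLabelMatcher` and `prefixRule` are hypothetical names, and the "longest prefix wins" ordering is an added choice (not part of the suggestion) so that results do not depend on map iteration order.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// prefixRule is one "key*" entry from the label map (hypothetical type).
type prefixRule struct {
	prefix   string
	newLabel string
}

// newLabelMatcher splits the configured rules once, at construction time,
// and returns a lookup that reports the replacement label, if any.
func newLabelMatcher(labels map[string]string) func(label string) (string, bool) {
	exact := make(map[string]string, len(labels))
	var prefixes []prefixRule
	for oldL, newL := range labels {
		key := strings.ToLower(oldL)
		if strings.HasSuffix(key, "*") {
			// A trailing * marks a prefix rule; store the key without it.
			prefixes = append(prefixes, prefixRule{strings.TrimSuffix(key, "*"), newL})
			continue
		}
		exact[key] = newL
	}
	// ASSUMPTION: longer prefixes are tried first.
	sort.Slice(prefixes, func(i, j int) bool {
		return len(prefixes[i].prefix) > len(prefixes[j].prefix)
	})

	return func(label string) (string, bool) {
		l := strings.ToLower(label)
		// Exact rules are a single map lookup.
		if newL, ok := exact[l]; ok {
			return newL, true
		}
		// Only the prefix rules need a scan.
		for _, r := range prefixes {
			if strings.HasPrefix(l, r.prefix) {
				return r.newLabel, true
			}
		}
		return "", false
	}
}

func main() {
	match := newLabelMatcher(map[string]string{
		"car":  "vehicle",
		"dog*": "animal",
	})
	for _, l := range []string{"Car", "carrot", "dog_husky", "cat"} {
		newL, ok := match(l)
		fmt.Printf("%q -> %q (renamed: %v)\n", l, newL, ok)
	}
}
```

With these rules, "Car" is renamed by the exact rule, "carrot" is left alone, and "dog_husky" is caught by the prefix rule. Exact keys cost one lookup per detection; only the keys ending in * are scanned.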

if strings.HasPrefix(strings.ToLower(d.Label()), strings.ToLower(oldL)) {
in[i] = NewDetection(*d.BoundingBox(), d.Score(), newL)
Member

@bhaney, Feb 12, 2024

This also modifies the input slice that was fed into the function, and I don't think you want to do that. You should make a new output slice and put the NewDetections in that, and then return "out".
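
A rough sketch of the non-mutating shape. The `Detection` struct here is a simplified stand-in for the real `objectdetection.Detection` interface, and `renameCopy` is a hypothetical name; it uses exact matching only to keep the example short. In the PR itself, each output element would presumably still be built with `NewDetection(*d.BoundingBox(), d.Score(), newL)`.

```go
package main

import "fmt"

// Detection is a simplified stand-in for objectdetection.Detection.
type Detection struct {
	Label string
	Score float64
}

// renameCopy returns a new slice with matching labels replaced.
// The input slice and its elements are left untouched.
func renameCopy(in []Detection, labels map[string]string) []Detection {
	out := make([]Detection, 0, len(in))
	for _, d := range in {
		if newL, ok := labels[d.Label]; ok {
			d.Label = newL // d is a copy of the element, so in is not modified
		}
		out = append(out, d)
	}
	return out
}

func main() {
	in := []Detection{
		{Label: "car", Score: 0.9},
		{Label: "dog", Score: 0.8},
	}
	out := renameCopy(in, map[string]string{"car": "vehicle"})
	fmt.Println(in[0].Label, out[0].Label) // prints: car vehicle
}
```

Callers that still hold the original slice (for example, to log the raw detector output) then keep seeing the original labels.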

}
}
}

return in
}
}

// SortByArea returns a function that sorts the list of detections by area (largest first).
func SortByArea() Postprocessor {
return func(in []Detection) []Detection {