Create new cookbook for utilizing supervision methods to easily create YOLO datasets for training #1388
Comments
Hi @xaristeidou 👋🏻 To be honest, I would really prefer not to treat YOLO differently from other data formats. The Supervision API aims to provide reusable building blocks like
@SkalskiP That is true; I was also concerned that it might be "too much" automation. Maybe I could create a cookbook similar to 'Serialise Detections to a CSV File' and 'Serialise Detections to a JSON File', guiding the reader through combining the aforementioned methods to construct a YOLO dataset easily.
I think the cookbook makes a lot more sense. We also released this how-to guide last week. Maybe you could reuse some of those code snippets in your cookbook?
Yeah, sure!
@xaristeidou should I expect a cookbook PR? ;)
@SkalskiP Yes, but not immediately; I don't have it ready yet. I will work on it mostly this weekend.
Search before asking
Description
I repeatedly find myself creating the same dataset structure: train, valid, and test splits, each with images and labels folders. The whole process could easily be automated.
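The folder structure described above can be sketched with the standard library alone. This is a minimal illustration, not part of supervision: the helper name `create_yolo_layout`, the `data.yaml` keys (`train`, `val`, `test`, `nc`, `names`), and the class names are assumptions that follow a common Ultralytics-style convention.

```python
from __future__ import annotations

import tempfile
from pathlib import Path

SPLITS = ("train", "valid", "test")
SUBFOLDERS = ("images", "labels")


def create_yolo_layout(root: Path, class_names: list[str]) -> Path:
    """Create <root>/<split>/<images|labels> and write a minimal data.yaml.

    Hypothetical helper for illustration; the data.yaml keys are an
    assumed convention, not something taken from the issue.
    """
    for split in SPLITS:
        for sub in SUBFOLDERS:
            (root / split / sub).mkdir(parents=True, exist_ok=True)

    yaml_lines = [
        "train: train/images",
        "val: valid/images",
        "test: test/images",
        f"nc: {len(class_names)}",
        f"names: {class_names!r}",
    ]
    data_yaml = root / "data.yaml"
    data_yaml.write_text("\n".join(yaml_lines) + "\n")
    return data_yaml


if __name__ == "__main__":
    # Build the layout in a throwaway directory and list what was created.
    root = Path(tempfile.mkdtemp()) / "dataset"
    create_yolo_layout(root, ["cat", "dog"])
    for path in sorted(root.rglob("*")):
        print(path.relative_to(root))
```

The helper only creates empty folders and the YAML file; copying images and writing label files is left to whatever export step is used.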
Use case
There are existing methods to load a dataset, sv.DetectionDataset.from_yolo(), split it by a chosen ratio, sv.DetectionDataset.split(), and export it to YOLO format, sv.DetectionDataset.as_yolo().

Nevertheless, to create a YOLO training dataset structure, one must write a custom train/valid/test split (because split() can only split into two parts) and also manually create the train/valid/test folders needed for the dataset. (To my knowledge, Ultralytics YOLO models by default require train/valid folders that contain valid, non-empty annotations; the test folder can be empty.)

For that reason I propose adding a new method to sv.DetectionDataset that combines the arguments of from_yolo(), split(), and as_yolo() and handles the whole backend: creating the train/valid/test folders and the images/labels subfolders, along with the data.yaml file.

I have already developed an implementation of such a method, which lets the user create a YOLO dataset structure with a single line of code. An example of such a call can be seen in the following example:
Additional
test_ratio is automatically calculated based on train_ratio and valid_ratio.
from_yolo(), split(), as_yolo().
Let me know if you like this idea and whether you would like me to submit a PR with the initial implementation.
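For readers who want to see the ratio arithmetic spelled out, here is a rough sketch of how a three-way split can be built from two two-way splits, with the test share derived from train_ratio and valid_ratio. It is a pure-Python stand-in that works on any list; `split_two` and `split_three` are hypothetical names and do not reproduce supervision's split() or the implementation mentioned in the issue.

```python
import random
from typing import List, Optional, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_two(
    items: Sequence[T], ratio: float, seed: Optional[int] = None
) -> Tuple[List[T], List[T]]:
    """Shuffle a copy of items and cut it into two parts at the given ratio."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * ratio)
    return shuffled[:cut], shuffled[cut:]


def split_three(
    items: Sequence[T],
    train_ratio: float,
    valid_ratio: float,
    seed: Optional[int] = None,
) -> Tuple[List[T], List[T], List[T]]:
    """Split into train/valid/test; the test share is whatever remains."""
    if not 0 < train_ratio < 1 or valid_ratio < 0:
        raise ValueError("expected 0 < train_ratio < 1 and valid_ratio >= 0")
    if train_ratio + valid_ratio > 1 + 1e-9:
        raise ValueError("train_ratio + valid_ratio must not exceed 1")

    # First cut: train versus everything else.
    train, rest = split_two(items, train_ratio, seed)

    # Second cut: the share of the remainder that goes to valid.
    # The implied test_ratio is remainder - valid_ratio.
    remainder = 1.0 - train_ratio
    rest_ratio = min(1.0, valid_ratio / remainder)
    valid, test = split_two(rest, rest_ratio, seed)
    return train, valid, test


if __name__ == "__main__":
    train, valid, test = split_three(list(range(100)), 0.7, 0.2, seed=0)
    print(len(train), len(valid), len(test))
```

The key detail is that the second cut uses valid_ratio divided by the remaining share, not valid_ratio itself, because it operates on the leftover items rather than on the full dataset.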
Are you willing to submit a PR?