multilabel_split

Sample algorithm for stratified train/test split in multi-label problems

Mostly for educational purposes, as some parts of the algorithm would need to be replaced in a real-world setting.

There are several papers online describing similar approaches but, to the best of my knowledge, no free implementation for large-scale datasets.

This specific implementation was (mostly) inspired by the following work:

Sechidis, K., Tsoumakas, G., Vlahavas, I.: On the stratification of multi-label data. Machine Learning and Knowledge Discovery in Databases, pp. 145-158 (2011). Available: http://lpis.csd.auth.gr/publications/sechidis-ecmlpkdd-2011.pdf
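
For readers who want a feel for the idea, here is a minimal sketch of the iterative stratification scheme described in the paper above. It is not the code in multilabel_split.py: the function name `iterative_split` and its parameters are hypothetical, it handles only a two-way train/test split, and it breaks ties deterministically (in favour of the training set) where the paper breaks them at random. It also rescans the remaining examples on every pass, so it is only suitable for small datasets.

```python
from collections import defaultdict


def iterative_split(label_sets, test_ratio=0.25):
    """Return (train_indices, test_indices) with roughly proportional label counts.

    label_sets[i] is the set of labels attached to example i (hypothetical input format).
    """
    ratios = [1.0 - test_ratio, test_ratio]  # subset 0 = train, subset 1 = test
    n = len(label_sets)

    # Desired number of examples per subset, overall and per label.
    want_total = [n * r for r in ratios]
    label_count = defaultdict(int)
    for labels in label_sets:
        for lab in labels:
            label_count[lab] += 1
    want_label = [{lab: c * r for lab, c in label_count.items()} for r in ratios]

    remaining = set(range(n))
    subsets = [[], []]

    def assign(idx, subset):
        # Place the example and lower that subset's remaining demand.
        subsets[subset].append(idx)
        remaining.discard(idx)
        want_total[subset] -= 1
        for lab in label_sets[idx]:
            want_label[subset][lab] -= 1

    while remaining:
        # Group the still-unassigned examples by label.
        left = defaultdict(list)
        for idx in remaining:
            for lab in label_sets[idx]:
                left[lab].append(idx)

        if not left:
            # Only unlabeled examples remain: fill by overall demand.
            for idx in sorted(remaining):
                assign(idx, max(range(2), key=lambda j: want_total[j]))
            break

        # Handle the rarest remaining label first, since it is the hardest to balance.
        rare = min(left, key=lambda lab: (len(left[lab]), str(lab)))
        for idx in sorted(left[rare]):
            # Prefer the subset that most needs this label, then the emptier subset.
            best = max(range(2), key=lambda j: (want_label[j][rare], want_total[j]))
            assign(idx, best)

    return sorted(subsets[0]), sorted(subsets[1])
```

A small usage example, with made-up labels:

```python
labels = [{"a"}] * 8 + [{"a", "b"}] * 4 + [{"c"}] * 4
train, test = iterative_split(labels, test_ratio=0.25)
```

Processing the rarest label first is the key design choice: frequent labels leave plenty of room to correct imbalances later, while rare ones do not.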
