GitHub - eri24816/music-data-analysis

This package contains a pipeline for analyzing and processing a set of MIDI files and building a dataset from the results.

It also provides a compact interface for accessing such a dataset.

Analysis pipeline

The analysis pipeline requires a source directory containing MIDI files as input and outputs a dataset directory containing analysis results.

Before running the pipeline, you need to:

Install this package (for local installation, pip install -e .)
Prepare a soundfont file on your disk
Prepare the source folder containing the MIDI files

Run the pipeline:

python run.py --path <path_to_dataset_dir> --src <path_to_src_dir> --verbose\
 -sape\
 --soundfont <path_to_soundfont>
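
To launch the same command from a Python script, one option is to assemble the argument list and pass it to subprocess. This is a minimal sketch that assumes run.py is in the current working directory. The helper name build_command and its parameters are hypothetical; only the flags shown in the command above are used.

```python
import subprocess
import sys


def build_command(dataset_dir, src_dir, soundfont, actions="sape", verbose=True):
    """Assemble the run.py command line shown above (hypothetical helper)."""
    cmd = [sys.executable, "run.py", "--path", str(dataset_dir), "--src", str(src_dir)]
    if verbose:
        cmd.append("--verbose")
    cmd.append("-" + actions)  # combined short flags, e.g. -sape
    cmd += ["--soundfont", str(soundfont)]
    return cmd


# Usage (the three paths are placeholders):
# cmd = build_command("path_to_dataset_dir", "path_to_src_dir", "path_to_soundfont")
# subprocess.run(cmd, check=True)  # raises CalledProcessError if run.py fails
```

Passing the arguments as a list avoids shell quoting problems when paths contain spaces.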

The pipeline provides 4 actions:

--sync, -s: Sync the notes to the beat
- Input: midi
- Output: synced_midi
--align, -a: Align the MIDI files to C major/A minor
- Input: midi
- Output: synced_midi (the same directory as -s; when -a and -s are used together, both effects are applied to synced_midi)
--to_pianoroll, -p: Convert the processed MIDI files to pianoroll format
- Input: synced_midi
- Output: pianoroll
--extract_features, -e: Extract features from the pianorolls
- Input: pianoroll
- Output: multiple feature attributes, each in its own subdirectory

Combinations of actions:

-sape: full pipeline
-spe: full pipeline without alignment
-sa: outputs synced and aligned MIDI, with no conversion to pianorolls
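
The inputs and outputs listed for each action determine what a flag combination reads and writes. The sketch below works this out for any combination. It is illustrative only, not the package's own code: the execution order s, a, p, e is an assumption taken from the full-pipeline flag string, and "features" is a stand-in name for the several outputs of -e.

```python
# Input and output of each action, as listed above.
ACTIONS = {
    "s": ("midi", "synced_midi"),
    "a": ("midi", "synced_midi"),
    "p": ("synced_midi", "pianoroll"),
    "e": ("pianoroll", "features"),  # placeholder for the many feature outputs
}


def data_flow(flags):
    """Return (reads, writes) for a flag combination.

    reads: attributes that must already exist before the run.
    writes: attributes produced during the run.
    """
    reads, writes = [], []
    for flag in "sape":  # assumed execution order
        if flag not in flags:
            continue
        src, dst = ACTIONS[flag]
        if src not in writes and src not in reads:
            reads.append(src)
        if dst not in writes:
            writes.append(dst)
    return reads, writes


print(data_flow("sape"))
print(data_flow("pe"))
```

For example, the combination pe only needs synced_midi to exist already, which is the case after an earlier -s or -sa run.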

Dataset interface

In the dataset directory, each subdirectory contains one attribute of all pieces.

dataset_dir/
    midi/
        piece1.mid
        piece2.mid
    pianoroll/
        piece1.npz
        piece2.npz
    key/
        piece1.json
        piece2.json
    density/
        piece1.json
        piece2.json
    ...
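
Because the layout is just one subdirectory per attribute and one file per piece, it can also be inspected with the standard library alone. The following is a minimal sketch that assumes only the directory structure shown above; the function names are hypothetical and are not part of the package.

```python
import json
from pathlib import Path


def list_attributes(dataset_dir):
    """Map each attribute subdirectory to the sorted piece names found in it."""
    layout = {}
    for attr_dir in sorted(Path(dataset_dir).iterdir()):
        if attr_dir.is_dir():
            layout[attr_dir.name] = sorted(
                p.stem for p in attr_dir.iterdir() if p.is_file()
            )
    return layout


def read_json_attribute(dataset_dir, attribute, piece):
    """Load <dataset_dir>/<attribute>/<piece>.json."""
    path = Path(dataset_dir) / attribute / f"{piece}.json"
    with path.open() as f:
        return json.load(f)
```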

The music_data_analysis.Dataset class provides an interface for accessing attributes of the pieces.

from music_data_analysis import Dataset

dataset = Dataset('path_to_dataset')
songs = dataset.songs() # list[music_data_analysis.Song]

# Get midi of the first song
midi = songs[0].read_midi("synced_midi") # miditoolkit.MidiFile

# Get pianoroll of the first song
pianoroll = songs[0].read_pianoroll("pianoroll") # music_data_analysis.Pianoroll

# Get key of the first song
key = songs[0].read_json("key") # dict

# Get note density of the first song
density = songs[0].read_json("density") # list[float], one value per bar

# Get song by name
song = dataset.get_song("piece1") # music_data_analysis.Song
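
The per-bar density lists can be aggregated across a dataset. The sketch below relies only on the read_json("density") call shown above. The helper names are hypothetical, and songs are identified by list index because no name attribute of Song is documented here.

```python
from statistics import mean


def mean_density(song):
    """Average note density over all bars of one song (0.0 if it has no bars)."""
    density = song.read_json("density")  # list[float], one value per bar
    return mean(density) if density else 0.0


def densest_song_index(songs):
    """Index of the song with the highest average note density."""
    if not songs:
        raise ValueError("songs is empty")
    return max(range(len(songs)), key=lambda i: mean_density(songs[i]))


# Usage with a real dataset (requires the package and an analyzed dataset):
# from music_data_analysis import Dataset
# songs = Dataset("path_to_dataset").songs()
# print(densest_song_index(songs))
```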
