Caution
You probably SHOULD NOT use this repository directly for the EEG2025 Challenge. All available datasets have already been processed using this pipeline. This repository is provided for transparency and reproducibility. If you want to process your own BIDS datasets, feel free to use it.
A comprehensive pipeline for processing, resampling, and converting EEG-BIDS datasets. Designed for the HBN (Healthy Brain Network) EEG datasets and the EEG2025 NeurIPS Challenge.
resample_dataset/
│
├── resampling/                      # EEG resampling and filtering tools
│   ├── process_eeg_data.m           # MATLAB script for filtering & resampling
│   └── process_metadata.py          # Update BIDS metadata after resampling
│
├── format_conversion/               # Format conversion tools
│   ├── convert_set_to_bdf.py        # Convert EEGLAB SET → BDF format
│   ├── convert_set_to_edf.py        # Convert EEGLAB SET → EDF format
│   └── complete_bids_datasets.py    # Maintain BIDS structure after conversion
│
├── utils/                           # Utility and validation tools
│   ├── compare_signal_formats.py    # Compare signals between formats
│   ├── fix_events_duration.py       # Fix event duration issues
│   ├── signal_comparison_results/   # Comparison visualizations
│   └── SIGNAL_COMPARISON_SUMMARY.md # Detailed comparison report
│
├── run_eeg_processing.sh            # Main script for resampling pipeline
├── run_bdf_conversion.sh            # Main script for BDF conversion
├── LICENSE
└── README.md                        # This file
./run_eeg_processing.sh /path/to/input/bids /path/to/output/resampled
./run_bdf_conversion.sh /path/to/input/bids /path/to/output/bdf
- Downsamples EEG data from 500 Hz to 100 Hz
- Applies a 0.5-50 Hz bandpass filter to attenuate slow drift and high-frequency noise
- Validates and cleans event markers
- Updates BIDS metadata accordingly
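The filtering and downsampling steps above can be sketched in Python for illustration. The pipeline itself performs them in MATLAB with EEGLAB (process_eeg_data.m); the SciPy version below is an assumed stand-in, not the repository's implementation. The function name, the filter order, and the zero-phase Butterworth design are all illustrative choices.

```python
from math import gcd

import numpy as np
from scipy.signal import butter, resample_poly, sosfiltfilt


def filter_and_resample(data, fs_in=500, fs_out=100, band=(0.5, 50.0), order=4):
    """Bandpass-filter, then downsample, an array shaped (channels, samples).

    Hypothetical helper: the filter design is an assumption and may differ
    from what EEGLAB applies in process_eeg_data.m.
    """
    data = np.asarray(data, dtype=float)
    # Zero-phase Butterworth bandpass, applied at the original sampling rate.
    sos = butter(order, band, btype="bandpass", fs=fs_in, output="sos")
    filtered = sosfiltfilt(sos, data, axis=-1)
    # Polyphase resampling by the rational factor fs_out / fs_in (1/5 here);
    # resample_poly applies its own anti-aliasing FIR filter as well.
    g = gcd(fs_in, fs_out)
    return resample_poly(filtered, fs_out // g, fs_in // g, axis=-1)
```

Calling `filter_and_resample(eeg)` on a 500 Hz recording returns the same channels with one fifth as many samples. Filtering before resampling keeps content above the new Nyquist frequency from aliasing into the output.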
- Converts EEGLAB SET files to:
  - BDF (BioSemi Data Format)
  - EDF (European Data Format)
- Preserves all signal information and metadata
- Maintains complete BIDS structure
- Compare signals between different formats
- Generate correlation plots and difference visualizations
- Produce detailed comparison reports
- MATLAB with EEGLAB toolbox installed
- Python 3.8+
- Bash shell (macOS/Linux)
- Python 3.8+
git clone https://github.com/eeg2025/downsample-datasets.git
cd downsample-datasets
pip install -r requirements.txt
This will install:
- pandas - Data manipulation
- numpy - Numerical computing
- emgio - EEG format conversion (from GitHub)
- matplotlib - Visualization
- scipy - Signal processing
The resampling pipeline processes an EEG-BIDS dataset to reduce sampling rate and apply filters:
./run_eeg_processing.sh <INPUT_DIR> <OUTPUT_DIR>
What it does:
- EEG Data (.set files):
  - Applies bandpass filter: 0.5-50 Hz
  - Resamples: 500 Hz → 100 Hz
  - Validates and cleans events
- JSON Metadata:
  - Updates SamplingFrequency from 500 to 100
- Events Files:
  - Removes sample column (tied to original 500 Hz sampling)
- Other Files:
  - Copies unchanged to maintain BIDS structure
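The two metadata steps listed above (rewriting SamplingFrequency and dropping the sample column) are small enough to sketch with the Python standard library. This is a minimal illustration under stated assumptions, not the code in process_metadata.py: both function names are hypothetical, and the files are assumed to be copies that already sit in the output directory, since the functions rewrite them in place.

```python
import csv
import json
from pathlib import Path


def update_sampling_frequency(json_path, new_fs=100):
    """Rewrite SamplingFrequency in a BIDS *_eeg.json sidecar (hypothetical helper)."""
    path = Path(json_path)
    meta = json.loads(path.read_text(encoding="utf-8"))
    meta["SamplingFrequency"] = new_fs
    path.write_text(json.dumps(meta, indent=2) + "\n", encoding="utf-8")


def drop_sample_column(tsv_path, column="sample"):
    """Remove the sample-index column from a BIDS *_events.tsv (hypothetical helper)."""
    path = Path(tsv_path)
    with path.open(newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f, delimiter="\t")
        # Keep every column except the one tied to the old sampling rate.
        fields = [name for name in (reader.fieldnames or []) if name != column]
        rows = [{name: row[name] for name in fields} for row in reader]
    with path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fields, delimiter="\t",
                                lineterminator="\n")
        writer.writeheader()
        writer.writerows(rows)
```

The onset and duration columns are expressed in seconds, so they stay valid after resampling; only the sample index depends on the original 500 Hz rate.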
Convert EEGLAB SET files to BDF or EDF format:
# For BDF conversion
./run_bdf_conversion.sh <INPUT_DIR> <OUTPUT_DIR>
# For EDF conversion (create your own script using convert_set_to_edf.py)
python3 format_conversion/convert_set_to_edf.py <INPUT_DIR> <OUTPUT_DIR>
What it does:
- Converts all SET files to the specified format
- Maintains complete BIDS directory structure
- Updates metadata to reflect format change
- Generates conversion reports
Validate conversions by comparing signals:
python3 utils/compare_signal_formats.py <ORIGINAL_DIR> <CONVERTED_DIR>
This generates:
- Correlation plots for each file
- Signal difference visualizations
- Summary statistics report
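For readers who want to see what such a comparison involves, here is a minimal NumPy sketch of the per-channel statistics. It is not taken from compare_signal_formats.py: the function name is hypothetical, and it assumes both recordings have already been loaded as arrays of the same shape and units.

```python
import numpy as np


def compare_signals(original, converted):
    """Return per-channel Pearson correlation and maximum absolute difference.

    Hypothetical helper; both inputs are arrays shaped (channels, samples).
    """
    original = np.asarray(original, dtype=float)
    converted = np.asarray(converted, dtype=float)
    if original.shape != converted.shape:
        raise ValueError(f"shape mismatch: {original.shape} vs {converted.shape}")
    # Center each channel, then compute the normalized dot product.
    a = original - original.mean(axis=1, keepdims=True)
    b = converted - converted.mean(axis=1, keepdims=True)
    denom = np.sqrt((a * a).sum(axis=1) * (b * b).sum(axis=1))
    # Flat channels have zero variance; report NaN instead of dividing by zero.
    corr = np.full(a.shape[0], np.nan)
    np.divide((a * b).sum(axis=1), denom, out=corr, where=denom > 0)
    max_abs_diff = np.abs(original - converted).max(axis=1)
    return corr, max_abs_diff
```

Reporting the maximum absolute difference next to the correlation is a deliberate choice: a constant offset or a scaling error leaves the correlation at 1.0 but still shows up in the difference.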
/path/to/input/
├── dataset_description.json
├── participants.tsv
├── sub-*/
│   └── eeg/
│       ├── *.set (EEG data files)
│       ├── *.fdt (EEGLAB data files)
│       ├── *_eeg.json (metadata)
│       ├── *_events.tsv (event files)
│       └── other BIDS files
/path/to/output/
├── dataset_description.json (updated)
├── participants.tsv (copied)
├── sub-*/
│   └── eeg/
│       ├── *.set/.bdf/.edf (processed/converted)
│       ├── *.fdt (if SET format)
│       ├── *_eeg.json (updated metadata)
│       ├── *_events.tsv (processed)
│       └── other BIDS files (copied)
- Resampling: ~1-2 minutes per subject
- Format Conversion: ~30 seconds per file
- Resume Capability: Skips already processed files
- Error Handling: Individual file errors don't stop the pipeline
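The resume and error-handling behavior described above follows a common batch-processing pattern, sketched here in Python. This is an assumed illustration rather than the logic of the shipped shell scripts: `process_all` and its `process_file` callback are hypothetical names, and the glob pattern assumes the sub-*/eeg/ layout shown earlier.

```python
from pathlib import Path


def process_all(input_dir, output_dir, process_file, pattern="sub-*/eeg/*.set"):
    """Apply process_file(src, dst) to each match, skipping finished outputs.

    Hypothetical driver: returns (done, skipped, failed) so the caller can
    report which files need attention.
    """
    input_dir, output_dir = Path(input_dir), Path(output_dir)
    done, skipped, failed = [], [], []
    for src in sorted(input_dir.glob(pattern)):
        # Mirror the input layout under the output directory.
        dst = output_dir / src.relative_to(input_dir)
        if dst.exists():
            skipped.append(src)  # resume: output already produced
            continue
        dst.parent.mkdir(parents=True, exist_ok=True)
        try:
            process_file(src, dst)
            done.append(src)
        except Exception as exc:
            # One bad file is recorded and reported, but does not stop the run.
            failed.append((src, exc))
            print(f"ERROR {src}: {exc}")
    return done, skipped, failed
```

One caveat of this existence check: a callback that crashes after writing a partial output file would cause that file to be skipped on the next run, so writing to a temporary name and renaming on success is a safer variant.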
Error: EEGLAB not found. Please add EEGLAB to your MATLAB path.
Solution: In MATLAB, run:
addpath('/path/to/eeglab')
eeglab % Initialize EEGLAB
Warning: Required Python packages not found.
Solution: Install dependencies:
pip install -r requirements.txt
Error: MATLAB not found. Please ensure MATLAB is in your PATH.
Solution: Add MATLAB to PATH or use full path:
export PATH="/Applications/MATLAB_R2023a.app/bin:$PATH"
- Original data is never modified
- Output preserves complete BIDS structure
- Processing can be resumed if interrupted
- Events are validated and cleaned during resampling
- Conversions are designed to maintain signal fidelity; use utils/compare_signal_formats.py to verify a converted dataset
Feel free to submit issues, fork the repository, and create pull requests for any improvements.
See LICENSE file for details.
Developed for processing HBN-EEG datasets for the EEG2025 NeurIPS Challenge.
For more information about BIDS format: https://bids.neuroimaging.io/
For EEGLAB documentation: https://eeglab.org/