laksh2005/BERT-Clinical-Multiclassifier

BERT Clinical Notes Multiclassification

This project fine-tunes Bio_ClinicalBERT, a pretrained transformer model, for multi-class classification of clinical notes.

Features

  • Full pipeline: JSON data loading, preprocessing, label encoding, and a stratified train–validation split
  • Custom PyTorch Dataset class for integration with HuggingFace Trainer
  • Fine-tunes pretrained Bio_ClinicalBERT for 22 clinical note categories
  • Model evaluation with a classification report, macro F1 and accuracy, and an annotated confusion matrix
  • Visualization of per-class distribution and prediction results
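
The label encoding, stratified split, and metric steps listed above could look roughly like the sketch below. It is an illustration only, not the repository's code: the `records` list, its `text` and `label` keys, the three category names, the 80/20 split, and the random seed are all assumptions, since the README does not show the JSON schema or the split settings. Only scikit-learn is used; tokenization and the Trainer loop are left out.

```python
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder

# Hypothetical toy records; the real JSON schema is not shown in this README.
records = [
    {"text": f"note {i}", "label": label}
    for i, label in enumerate(["cardiology", "neurology", "radiology"] * 10)
]

texts = [r["text"] for r in records]
raw_labels = [r["label"] for r in records]

# Map string categories to integer ids, as a classification head expects.
encoder = LabelEncoder()
labels = encoder.fit_transform(raw_labels)

# Stratify on the labels so each class keeps its share in both splits.
# test_size and random_state are assumed values, not taken from the project.
train_texts, val_texts, train_labels, val_labels = train_test_split(
    texts, labels, test_size=0.2, stratify=labels, random_state=42
)

# Placeholder predictions (a copy of the true labels) just to show the
# metric calls; in practice these would come from the fine-tuned model.
val_preds = val_labels.copy()
accuracy = accuracy_score(val_labels, val_preds)
macro_f1 = f1_score(val_labels, val_preds, average="macro")
```

Macro F1 averages the per-class F1 scores with equal weight, so it is more sensitive than accuracy to poor performance on rare categories.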

Checking Class Imbalance
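
A class imbalance check of this kind can be sketched with the standard library alone. The label list below is made up for illustration, and the ratio of largest to smallest class is just one simple summary, not necessarily the one the project plots.

```python
from collections import Counter

# Hypothetical labels; the real dataset has 22 categories.
labels = ["cardiology"] * 6 + ["neurology"] * 3 + ["radiology"] * 1

counts = Counter(labels)
total = sum(counts.values())

# Print each class with its count and share, most frequent first.
for name, n in counts.most_common():
    print(f"{name:<12} {n:>3}  {n / total:.1%}")

# Ratio of the largest class to the smallest; 1.0 means perfectly balanced.
imbalance_ratio = max(counts.values()) / min(counts.values())
```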

Accuracy vs Validation

Stack Used

  • Python, pandas, numpy
  • PyTorch, HuggingFace Transformers, Datasets
  • scikit-learn (evaluation, splitting)
  • matplotlib, seaborn
