This repository predicts Remaining Useful Life (RUL) for bearings using classic machine learning and deep learning workflows on the IMS Rexnord dataset. It includes notebooks for data import, EDA, feature engineering and RUL labeling, plus a Streamlit app for interactive training/evaluation and a BentoML server for API inference.
- Classic ML: SVR, RandomForest, LightGBM trained on engineered time-domain features.
- Deep Learning: 1D-CNN on raw waveforms and 2D-CNN on spectrograms (prepared in notebooks).
- Apps: Streamlit UI for training/evaluation and a BentoML service endpoint for RUL predictions.
- `dataset_import.ipynb`: Imports raw ASCII files from `data/` into structured frames.
- `notebooks/EDA.ipynb`: Exploratory Data Analysis of the bearing signals and features.
- `notebooks/RUL.ipynb`: Time-feature extraction, RUL label creation, cleaning, splits; optional CNN dataset prep.
- `notebooks/classic_ML.ipynb`: SVR and tree baselines on preprocessed features with diagnostics and comparisons.
- `data/`: Raw IMS bearing test data (ASCII snapshots).
- `processed_data/`: Time-feature CSVs.
- `1D_CNN_arrays/` and `2D_CNN_arrays/`: Optional `.npy` artifacts for CNNs.
- `classic_ml_datasets/`: Train/Val/Test CSV splits per set for classic ML.
- `app/streamlit_app.py`: Interactive app to train/evaluate SVR vs RF and visualize predictions.
- `app/bento_service.py`: BentoML service definition for API-based RUL inference.
- `models/`: Saved models (Joblib/Pickle) and BentoML model tags.
- Open `dataset_import.ipynb` and run the cells to parse raw files under `data/`.
- Raw files are timestamped (e.g., `2003.10.22.12.06.24`) with tab-separated channels.
- Outputs: cleaned frames saved under `processed_data/` for downstream notebooks.
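The parsing step can be sketched as below. This is a minimal illustration, not the notebook's actual code: it assumes one tab-separated, header-less snapshot file per acquisition, with the timestamp encoded in the filename as shown above.

```python
from pathlib import Path

import pandas as pd


def load_snapshot(path: Path) -> pd.DataFrame:
    """Read one tab-separated IMS snapshot: one row per sample, one column per channel."""
    df = pd.read_csv(path, sep="\t", header=None)
    df.columns = [f"ch{i + 1}" for i in range(df.shape[1])]
    return df


def snapshot_timestamp(path: Path) -> pd.Timestamp:
    """The filename itself encodes the acquisition time, e.g. 2003.10.22.12.06.24."""
    return pd.to_datetime(path.name, format="%Y.%m.%d.%H.%M.%S")
```

Iterating `load_snapshot` over a sorted `Path("data").iterdir()` then yields one frame per snapshot in chronological order.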
- Open `notebooks/EDA.ipynb` to:
  - Inspect distributions, correlations, and per-bearing trends.
  - Visualize time-series snapshots and channel-specific behavior.
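The kind of checks the EDA notebook runs can be sketched on synthetic data (the real frames come from `processed_data/`; the feature names and trend shape here are purely illustrative):

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for one processed-feature frame.
rng = np.random.default_rng(0)
n = 300
df = pd.DataFrame({
    "B1rms": 0.1 + 0.0005 * np.arange(n) + rng.normal(0, 0.01, n),  # upward degradation trend
    "B1kurtosis": rng.normal(3.0, 0.5, n),
})

print(df.describe())                             # per-feature distributions
print(df.corr())                                 # pairwise correlations
print(df["B1rms"].rolling(20).mean().iloc[-1])   # smoothed end-of-run level
```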
- Open `notebooks/RUL.ipynb` to:
  - Compute time-domain features: mean, std, skew, kurtosis, entropy, RMS, max, peak-to-peak.
  - Create `RUL_minutes` labels using the timestamp distance to the last sample (failure time), with optional capping/normalization.
  - Handle missing values and build chronological Train/Val/Test splits.
  - Save outputs to `processed_data/` and `SVR_datasets/`.
  - Optionally prepare 1D-CNN waveform arrays `(B, C, N)` and 2D spectrogram arrays `(B, C, H, W)` and save them as `.npy`.
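The feature extraction and labeling steps above can be sketched as follows. These are illustrative helpers, not the notebook's exact code; the histogram bin count, column prefix, and timestamp column name are assumptions.

```python
import numpy as np
import pandas as pd
from scipy import stats


def time_features(x: np.ndarray, prefix: str) -> dict:
    """Time-domain features for one channel of one snapshot."""
    hist, _ = np.histogram(x, bins=32)
    p = hist / hist.sum()
    return {
        f"{prefix}mean": x.mean(),
        f"{prefix}std": x.std(),
        f"{prefix}skew": stats.skew(x),
        f"{prefix}kurtosis": stats.kurtosis(x),
        f"{prefix}entropy": stats.entropy(p),        # entropy of the amplitude histogram
        f"{prefix}rms": np.sqrt(np.mean(x ** 2)),
        f"{prefix}max": np.abs(x).max(),
        f"{prefix}p2p": x.max() - x.min(),
    }


def add_rul_minutes(df: pd.DataFrame, ts_col: str = "timestamp") -> pd.DataFrame:
    """RUL label = minutes from each snapshot to the last (failure) snapshot of the run."""
    failure_time = df[ts_col].max()
    df = df.copy()
    df["RUL_minutes"] = (failure_time - df[ts_col]).dt.total_seconds() / 60.0
    return df
```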
- Open `notebooks/classic_ML.ipynb` and run it to:
  - Load and concatenate `set{1,2,3}_timefeatures_{train,val,test}.csv` from `SVR_datasets/`.
  - Train SVR (with scaling) and RandomForest baselines; optionally LightGBM.
  - Print RMSE/MAE on validation and test; compare against a naive-mean baseline.
  - Report diagnostics: RMSE as % of mean RUL, improvement vs. naive, generalization gap.
  - Save the best model under `models/` (Joblib/Pickle) and optionally export BentoML models.
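A condensed sketch of this train/compare loop, using synthetic data in place of the concatenated CSV splits (hyperparameters and array shapes are illustrative, not the notebook's settings):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in: in the notebook, X/y come from the CSV splits.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 8))
y = X[:, 0] * 50 + rng.normal(scale=5, size=300) + 100  # "RUL_minutes"-like target

X_tr, X_val, y_tr, y_val = X[:240], X[240:], y[:240], y[240:]

models = {
    "SVR": make_pipeline(StandardScaler(), SVR(C=10.0)),  # SVR needs scaled inputs
    "RF": RandomForestRegressor(n_estimators=200, random_state=0),
}
# Naive baseline: always predict the training-set mean RUL.
naive_rmse = np.sqrt(mean_squared_error(y_val, np.full_like(y_val, y_tr.mean())))

for name, model in models.items():
    model.fit(X_tr, y_tr)
    pred = model.predict(X_val)
    rmse = np.sqrt(mean_squared_error(y_val, pred))
    mae = mean_absolute_error(y_val, pred)
    print(f"{name}: RMSE={rmse:.1f} MAE={mae:.1f} "
          f"({100 * rmse / y_val.mean():.1f}% of mean RUL; naive RMSE={naive_rmse:.1f})")
```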
Interactive training and evaluation (SVR vs RF), plots, and model saving.

- Start the app: `streamlit run app/streamlit_app.py`
- The app loads the combined Train/Val/Test splits from `classic_ml_datasets/` or `SVR_datasets/` (configured inside the app code).
- Tune hyperparameters, compare metrics, visualize predictions vs. ground truth, and save the best model.
A service for programmatic RUL predictions with the saved model.

- Check `app/bento_service.py` for the service definition and model loading.
- Import the bento models before serving:

```bash
bentoml models import bentomodels/rul_predictor_1d_cnn.bentomodel
bentoml models import bentomodels/rul_predictor_2d_cnn.bentomodel
```

- Start the service (example syntax; confirm the service name in the file):

```bash
bentoml serve app.bento_service:IndustrialService --reload
```

- Example request (from the Streamlit app or curl):

```bash
curl -X POST http://localhost:3000/predict \
  -H 'Content-Type: application/json' \
  -d '{"instances": [{"B1mean": 0.12, "B1rms": 0.10, "B1skew": 0.0, "B1kurtosis": 1.2}]}'
```

Set up a Python environment and install dependencies:
```bash
python -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
pip install numpy pandas scipy scikit-learn matplotlib plotly streamlit bentoml joblib
```

- Run `notebooks/RUL.ipynb` to generate features and RUL labels and create Train/Val/Test splits.
- Open `notebooks/classic_ML.ipynb` to train and compare models; save the best model.
- Launch Streamlit (`streamlit run app/streamlit_app.py`) to interactively evaluate and save models.
- Optionally start the BentoML service to serve predictions via REST.
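For programmatic access to the running BentoML service, here is a stdlib-only Python equivalent of the curl example above. The endpoint path and feature names are taken from that example; the service must actually be running for the (commented) call to succeed.

```python
import json
from urllib import request

# Same payload as the curl example; feature names must match the model's training columns.
payload = {"instances": [{"B1mean": 0.12, "B1rms": 0.10, "B1skew": 0.0, "B1kurtosis": 1.2}]}

req = request.Request(
    "http://localhost:3000/predict",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
# Uncomment once the BentoML service is up:
# with request.urlopen(req) as resp:
#     print(json.loads(resp.read()))
```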
- Use chronological splits or leave-one-run-out to avoid leakage.
- Monitor RMSE/MAE in minutes and interpret relative to mean RUL (≤15–20% is typically good).
- Prefer tree ensembles for tabular features; explore CNNs for raw/spectrogram signals.
- Save scalers/normalizers with your model for consistent inference.
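The last tip can be sketched with a scikit-learn `Pipeline`, which bundles the scaler and the model into a single Joblib artifact so the exact training-time normalization is reapplied at inference (synthetic data; file name illustrative):

```python
import tempfile
from pathlib import Path

import joblib
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Toy data standing in for the engineered feature matrix and RUL targets.
X = np.random.default_rng(0).normal(size=(100, 4))
y = X @ np.array([1.0, -2.0, 0.5, 0.0])

# Scaler + model travel together: saving the pipeline saves both.
model = make_pipeline(StandardScaler(), SVR()).fit(X, y)

path = Path(tempfile.mkdtemp()) / "svr_rul.joblib"
joblib.dump(model, path)
restored = joblib.load(path)  # ready for inference with identical preprocessing
```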
This project references the IMS bearing data for research/educational use. Respect dataset licensing and attribution guidelines.