StruSel is a computational pipeline for identifying pathogen-exclusive drug target candidates through proteome-scale structural comparison. It goes beyond traditional sequence-based homology exclusion by using AI-predicted 3D structures to find pathogen proteins that are structurally divergent from the human proteome, a largely unexplored space for selective antimicrobial drug discovery.
Given a target pathogen, StruSel:
- Downloads AlphaFold2-predicted structures for the pathogen proteome
- Runs a structurome-wide structural similarity search against the human structurome using FoldSeek (3Di-based)
- Runs a parallel sequence-based search with BLASTp as a benchmark
- Identifies proteins that structural analysis, but not sequence analysis, classifies as non-homologous to human proteins
- Cross-references candidates with essential gene databases (DEG / OGEE) to retain essential proteins
- Maps candidates to the Comprehensive Antibiotic Resistance Database (CARD) to assess antimicrobial resistance (AMR) relevance
- Outputs a ranked, annotated list of pathogen-exclusive target candidates
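The comparison of structural and sequence results can be sketched as a set operation over the two hit tables. This is a minimal illustration, not the notebook's implementation. It assumes both searches were written in 12-column tabular form (BLAST+ `-outfmt 6` and FoldSeek's default `easy-search` output, where the e-value is the 11th column), that query IDs have already been harmonized between the two tables, and that a single e-value cutoff defines a "hit". The function names, labels, and any cutoff value are hypothetical.

```python
import csv
import io

# Column 11 (index 10) holds the e-value in both BLAST+ tabular output
# (-outfmt 6) and FoldSeek's default easy-search .m8 output.
EVALUE_COL = 10


def queries_with_hits(tsv_text, max_evalue):
    """Return query IDs with at least one hit at or below max_evalue."""
    hits = set()
    for row in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        if len(row) <= EVALUE_COL:
            continue  # skip blank or malformed lines
        if float(row[EVALUE_COL]) <= max_evalue:
            hits.add(row[0])
    return hits


def classify(protein_ids, seq_hits, struct_hits):
    """Label each pathogen protein by which search found a human match."""
    labels = {}
    for pid in protein_ids:
        in_seq, in_struct = pid in seq_hits, pid in struct_hits
        if in_seq and in_struct:
            labels[pid] = "hit_both"
        elif in_seq:
            labels[pid] = "sequence_hit_only"
        elif in_struct:
            labels[pid] = "structure_hit_only"
        else:
            labels[pid] = "no_hit_either"
    return labels
```

Under one reading of the step above, proteins labeled `sequence_hit_only` (a sequence match but no structural match in human) are the ones classified as non-homologous by structure alone. Which labels are carried forward is a pipeline decision that this sketch does not settle.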
Current implementation: Klebsiella pneumoniae (WHO critical priority pathogen)
Planned extension: Pan-ESKAPE (E. faecium, S. aureus, K. pneumoniae, A. baumannii, P. aeruginosa, Enterobacter spp.)
Pathogen proteome (AlphaFold DB)
        │
        ▼
FoldSeek structural search ──────────────────┐
        │                                    │
BLASTp sequence search                       │
        │                                    │
        ▼                                    ▼
Structure-sequence decoupling     Non-homologs (structural)
        │
        ▼
Essential gene filtering (DEG / OGEE)
        │
        ▼
AMR mapping (CARD)
        │
        ▼
StruSel target candidates
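The last three stages of the workflow (essential gene filtering, AMR mapping, final candidate list) could look roughly like the sketch below. It assumes the essential gene and CARD lookups have already been reduced to plain sets of protein IDs. The `Candidate` type, the function name, and the ranking rule (AMR-associated proteins first, then by ID) are hypothetical, since the actual ranking criteria are not described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    protein_id: str
    essential: bool
    amr_associated: bool


def annotate_and_rank(non_homolog_ids, essential_ids, card_ids):
    """Keep essential non-homologs, flag CARD matches, and sort them."""
    essential_ids = set(essential_ids)
    card_ids = set(card_ids)
    kept = [
        Candidate(pid, True, pid in card_ids)
        for pid in non_homolog_ids
        if pid in essential_ids  # essential gene filter
    ]
    # Hypothetical ranking: AMR-associated candidates first, then by ID.
    return sorted(kept, key=lambda c: (not c.amr_associated, c.protein_id))
```

Here the essential gene step acts as a filter while the CARD step only annotates, which matches "filtering" and "mapping" in the diagram. A real ranking would likely weigh additional evidence.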
strusel/
├── README.md
├── notebooks/
│   └── strusel_pipeline.ipynb    # Main Colab notebook
├── data/
│   └── .gitkeep                  # Input data (not tracked)
├── results/
│   └── .gitkeep                  # Output files (not tracked)
└── scripts/
    └── .gitkeep                  # Modular scripts (coming soon)
- Python 3.8+
- FoldSeek
- BLAST+
- Biopython
- pandas, numpy
- requests (AlphaFold DB API)
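A quick way to confirm the requirements above are available before running the notebook is sketched below. It is not part of the repository. It assumes the command-line tools are on `PATH` under their usual names (`foldseek` for FoldSeek, `blastp` and `makeblastdb` for BLAST+) and that Biopython is importable as `Bio`. The function name is hypothetical.

```python
import importlib.util
import shutil
import sys


def missing_requirements(
    executables=("foldseek", "blastp", "makeblastdb"),
    modules=("Bio", "pandas", "numpy", "requests"),
    min_python=(3, 8),
):
    """Return a list of human-readable descriptions of missing requirements."""
    missing = []
    if sys.version_info < min_python:
        wanted = ".".join(str(part) for part in min_python)
        missing.append(f"python: need {wanted}+")
    for exe in executables:
        if shutil.which(exe) is None:  # not found on PATH
            missing.append(f"executable: {exe}")
    for mod in modules:
        if importlib.util.find_spec(mod) is None:  # not importable
            missing.append(f"module: {mod}")
    return missing
```

Calling `missing_requirements()` with no arguments returns an empty list when everything is in place. Any entries it returns name what still needs to be installed.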
If you use StruSel, please cite:
Pranavathiyani G. StruSel: Structurome-wide Selectivity of pathogen-exclusive targets. GitHub. https://github.com/pranavathiyani/strusel
Pranavathiyani G
Assistant Professor (Research), Division of Bioinformatics
SASTRA Deemed University, Thanjavur
pranavathiyani@scbt.sastra.edu
Google Scholar | ORCID
MIT License. See LICENSE for details.