leon07c/projec

CORD-19 Data Explorer

A beginner-friendly project for exploring the CORD-19 metadata.csv dataset and building an interactive Streamlit app.


📂 Project Structure

Framework_Assignment/
├─ metadata.csv           # dataset (downloaded separately)
├─ notebook.ipynb         # Jupyter notebook with step-by-step analysis
├─ app.py                 # Streamlit app
├─ requirements.txt       # dependencies
└─ README.md              # this file

⚙️ Installation

  1. Clone the repository or download it:
git clone <your-repo-url>
cd Framework_Assignment
  2. (Optional) Create and activate a virtual environment:
python -m venv venv
source venv/bin/activate   # macOS/Linux
venv\Scripts\activate      # Windows
  3. Install the required packages:
pip install -r requirements.txt

📊 Jupyter Notebook

  1. Start Jupyter Notebook:
jupyter notebook
  2. Open notebook.ipynb and run it step by step to:
    • Load and clean the metadata
    • Explore basic statistics
    • Create visualizations
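
The loading and cleaning steps listed above might look roughly like the sketch below. It is not the notebook's actual code: the column names (title, publish_time, journal) are assumed from the CORD-19 metadata schema, the helper load_and_clean is hypothetical, and a small inline sample stands in for metadata.csv so the example runs without the real file.

```python
import io

import pandas as pd

# Tiny inline stand-in for metadata.csv (assumed column names).
SAMPLE_CSV = """cord_uid,title,abstract,publish_time,journal
a1,Coronavirus spread in cities,Some abstract text,2020-03-15,The Lancet
a2,,Missing title here,2020-05-01,Nature
a3,Vaccine trial results,,2021,Nature
a4,Early SARS study,An older abstract,not-a-date,
"""


def load_and_clean(source):
    """Load the metadata and apply a few basic cleaning steps (hypothetical helper)."""
    df = pd.read_csv(source)
    # Drop rows with no title, since later steps rely on it.
    df = df.dropna(subset=["title"]).copy()
    # publish_time may be a full date or just a year, so take the first
    # four characters and coerce anything non-numeric to a missing value.
    year_text = df["publish_time"].astype(str).str[:4]
    df["year"] = pd.to_numeric(year_text, errors="coerce").astype("Int64")
    # Label missing journals explicitly so they still appear in counts.
    df["journal"] = df["journal"].fillna("Unknown")
    return df


df = load_and_clean(io.StringIO(SAMPLE_CSV))
print(df.shape)         # rows and columns left after cleaning
print(df.isna().sum())  # remaining missing values per column
```

With the real dataset, the same function can be called as load_and_clean("metadata.csv").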

🌐 Streamlit App

  1. Make sure you have metadata.csv in the same folder as app.py.

  2. Run the app:

streamlit run app.py
  3. The app opens in your browser. Use the sidebar to filter by year range and to set the number of rows loaded.
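
For orientation, a minimal app with these two sidebar controls could be sketched as follows. This is an illustration, not the repository's app.py: the function names, widget labels, slider bounds, and default values are all assumptions, and the publish_time, title, and journal columns are assumed from the CORD-19 metadata schema.

```python
import pandas as pd


def filter_by_year(df, start_year, end_year):
    """Keep rows whose 'year' falls in the inclusive range; missing years are dropped."""
    mask = df["year"].between(start_year, end_year)
    return df[mask]


def main():
    # Imported here so filter_by_year can be used and tested without Streamlit.
    import streamlit as st

    st.title("CORD-19 Data Explorer")

    # Sidebar control: how many rows of metadata.csv to read (assumed defaults).
    n_rows = st.sidebar.number_input(
        "Rows to load", min_value=1000, max_value=100000, value=10000, step=1000
    )
    df = pd.read_csv("metadata.csv", nrows=int(n_rows))
    # Derive a numeric year from the first four characters of publish_time.
    df["year"] = pd.to_numeric(df["publish_time"].astype(str).str[:4], errors="coerce")

    # Sidebar control: inclusive year range (bounds are placeholder values).
    start, end = st.sidebar.slider("Year range", 2000, 2022, (2019, 2021))

    filtered = filter_by_year(df, start, end)
    st.write(f"{len(filtered)} papers between {start} and {end}")
    st.bar_chart(filtered["year"].value_counts().sort_index())
    st.dataframe(filtered[["title", "journal", "year"]].head(50))


# When saving this as an app script, add a final call to main() here.
```

Keeping the filter in its own function makes the logic easy to check in the notebook before wiring it to the widgets.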

✅ Expected Outcomes

  • A Jupyter Notebook showing:
    • Basic exploration of the dataset
    • Data cleaning steps
    • Visualizations (publications by year, top journals, word frequencies)
  • A working Streamlit app to interactively explore results
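
The three summaries named above (publications by year, top journals, word frequencies) reduce to simple counting. The sketch below uses only the Python standard library and a small inline sample; the column names, the stop-word list, and the summarize helper are assumptions for illustration, and the notebook itself may compute these differently.

```python
import csv
import io
import re
from collections import Counter

# Tiny inline stand-in for metadata.csv (assumed column names).
SAMPLE_CSV = """title,publish_time,journal
Coronavirus spread in cities,2020-03-15,The Lancet
Vaccine trial results for coronavirus,2021,Nature
Modelling the spread of infection,2020-07-02,Nature
"""

# A deliberately short stop-word list; extend it as needed.
STOP_WORDS = {"the", "of", "in", "for", "and", "a", "an", "to", "on"}


def summarize(source):
    """Count papers per year, papers per journal, and words in titles."""
    by_year, by_journal, words = Counter(), Counter(), Counter()
    for row in csv.DictReader(source):
        # The year is the first four characters of publish_time, if numeric.
        year = (row.get("publish_time") or "")[:4]
        if year.isdigit():
            by_year[int(year)] += 1
        journal = row.get("journal") or ""
        if journal:
            by_journal[journal] += 1
        # Lowercase the title, split it into alphabetic tokens, skip stop words.
        tokens = re.findall(r"[a-z]+", (row.get("title") or "").lower())
        words.update(t for t in tokens if t not in STOP_WORDS)
    return by_year, by_journal, words


by_year, by_journal, words = summarize(io.StringIO(SAMPLE_CSV))
print(sorted(by_year.items()))    # publications by year
print(by_journal.most_common(5))  # top journals
print(words.most_common(10))      # most frequent title words
```

Each Counter can be passed to a plotting library to draw the corresponding bar chart.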

📝 Notes

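  • If the dataset is too large to load comfortably, read only the first 10,000 rows by passing nrows=10000 when reading the CSV (for example, with pandas.read_csv).
  • The word cloud is optional and requires the wordcloud package.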
  • Push this repo to GitHub as Framework_Assignment and submit the repo URL for your assignment.
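
To illustrate the first note: in pandas, nrows reads only the first N data rows of the file, so the result is the head of the file rather than a random sample. The sketch below assumes pandas is the CSV reader in use and runs on a small inline sample in place of metadata.csv; the usecols argument is an optional extra that is not mentioned in the notes.

```python
import io

import pandas as pd

# Tiny inline stand-in for metadata.csv (assumed column names).
SAMPLE_CSV = """cord_uid,title,publish_time,journal
a1,First paper,2020-01-01,Nature
a2,Second paper,2020-02-01,Nature
a3,Third paper,2020-03-01,The Lancet
a4,Fourth paper,2020-04-01,BMJ
"""

# nrows stops reading after the first N data rows (the header is not counted).
head = pd.read_csv(io.StringIO(SAMPLE_CSV), nrows=2)

# usecols keeps only the listed columns, which further reduces memory use.
slim = pd.read_csv(io.StringIO(SAMPLE_CSV), nrows=2, usecols=["title", "journal"])

print(len(head), list(slim.columns))
```

For the real file, the equivalent call is pd.read_csv("metadata.csv", nrows=10000).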
