A full-stack web scraper for Flipkart products with Python FastAPI backend and Next.js frontend. Features real-time product extraction, advanced search/filtering, interactive UI, and MySQL database integration.

zaidkx7/flipkart-scraper

🛒 Flipkart Product Scraper

A full-stack web application that scrapes product data from Flipkart and presents it through a modern, responsive web interface. Built with Python FastAPI backend and Next.js React frontend.

✨ Features

πŸ•·οΈ Web Scraping

  • Automated Flipkart scraping with pagination support
  • Real-time product data extraction (title, price, rating, specifications)
  • Intelligent retry mechanisms for robust data collection
  • Duplicate detection to prevent data redundancy
  • Session management with cookie handling
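
As an illustration of the retry and duplicate-detection ideas listed above, here is a minimal sketch. It is not the project's actual implementation: the function names `with_retries` and `dedupe`, the backoff schedule, and the `url` key used to identify a product are all assumptions made for this example.

```python
import time
from typing import Callable, Dict, Iterable, List, TypeVar

T = TypeVar("T")


def with_retries(fetch: Callable[[], T], attempts: int = 3, base_delay: float = 1.0) -> T:
    """Call fetch(), retrying with exponential backoff when it raises."""
    for attempt in range(attempts):
        try:
            return fetch()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            # Wait base_delay, then 2x, 4x, ... before the next try.
            time.sleep(base_delay * (2 ** attempt))
    raise ValueError("attempts must be at least 1")


def dedupe(products: Iterable[Dict], key: str = "url") -> List[Dict]:
    """Keep only the first product seen for each value of `key` (hypothetical field)."""
    seen = set()
    unique = []
    for product in products:
        if product[key] not in seen:
            seen.add(product[key])
            unique.append(product)
    return unique
```

A scraper loop could wrap each page request in `with_retries` and pass the collected rows through `dedupe` before writing them to the database.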

πŸ—„οΈ Backend API

  • RESTful API built with FastAPI
  • MySQL database integration with SQLAlchemy ORM
  • Advanced search functionality with fuzzy matching
  • Multiple filtering options (price, rating, category, brand)
  • Statistical analytics and trending algorithms
  • Comprehensive error handling and logging
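
The README does not say how fuzzy matching is implemented. One simple way to get typo-tolerant search, shown here purely as a sketch, is the standard library's `difflib.SequenceMatcher`. The function name `fuzzy_search` and the 0.6 threshold are assumptions for this example.

```python
from difflib import SequenceMatcher
from typing import List


def fuzzy_search(query: str, titles: List[str], threshold: float = 0.6) -> List[str]:
    """Return titles that contain the query or have a word similar to it, best first."""
    query = query.lower().strip()
    scored = []
    for title in titles:
        lowered = title.lower()
        # Best similarity between the query and any single word of the title.
        best = max(
            (SequenceMatcher(None, query, word).ratio() for word in lowered.split()),
            default=0.0,
        )
        if query in lowered:
            best = 1.0  # an exact substring hit always ranks at the top
        if best >= threshold:
            scored.append((best, title))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [title for _, title in scored]
```

With this sketch, a misspelled query such as `samsng` still matches a title containing `Samsung`, while unrelated titles fall below the threshold.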

🎨 Frontend Interface

  • Modern React 19 UI with Next.js 15
  • Responsive design with Tailwind CSS
  • Advanced product browsing with filters and search
  • Interactive image galleries with navigation
  • Shopping cart functionality with local storage
  • Real-time data caching for optimal performance

πŸ› οΈ Tech Stack

Backend

  • Python 3.8+ - Core language
  • FastAPI - Modern web framework
  • SQLAlchemy - Database ORM
  • MySQL - Primary database
  • BeautifulSoup4 - HTML parsing
  • curl-cffi - HTTP requests with browser impersonation (Cloudflare bypass)
  • Pydantic - Data validation

Frontend

  • Next.js 15 - React framework
  • React 19 - UI library
  • TypeScript - Type safety
  • Tailwind CSS - Styling framework
  • Radix UI - Component primitives
  • Framer Motion - Animations
  • Axios - HTTP client
  • Lucide React - Icon library

Development Tools

  • ESLint - Code linting
  • PostCSS - CSS processing
  • Git - Version control

πŸ“ Project Structure

Flipkart/
β”œβ”€β”€ backend/                    # Python FastAPI backend
β”‚   β”œβ”€β”€ alchemy/               # Database models and connections
β”‚   β”‚   β”œβ”€β”€ create_tables.py   # Database schema setup
β”‚   β”‚   β”œβ”€β”€ database.py        # Database connection and queries
β”‚   β”‚   β”œβ”€β”€ models.py          # SQLAlchemy models
β”‚   β”‚   └── schemas.py         # Pydantic schemas
β”‚   β”œβ”€β”€ api/                   # FastAPI application
β”‚   β”‚   β”œβ”€β”€ main.py           # Application entry point
β”‚   β”‚   β”œβ”€β”€ routers/          # API route handlers
β”‚   β”œβ”€β”€ modules/              # Scraping modules
β”‚   β”‚   └── flipkart/        # Flipkart-specific scraper
β”‚   β”œβ”€β”€ settings/            # Configuration files
β”‚   β”œβ”€β”€ utils/              # Utility functions
β”‚   └── requirements.txt    # Python dependencies
β”‚
β”œβ”€β”€ frontend/                 # Next.js React frontend
β”‚   β”œβ”€β”€ src/
β”‚   β”‚   β”œβ”€β”€ app/             # Next.js app directory
β”‚   β”‚   β”œβ”€β”€ api/             # API integration layer
β”‚   β”‚   β”œβ”€β”€ components/      # React components
β”‚   β”‚   β”‚   β”œβ”€β”€ ui/         # Reusable UI components
β”‚   β”‚   β”‚   └── Storefront.tsx # Main product interface
β”‚   β”‚   β”œβ”€β”€ hooks/          # Custom React hooks
β”‚   β”‚   └── lib/           # Utility functions
β”‚   β”œβ”€β”€ public/            # Static assets
β”‚   β”œβ”€β”€ package.json       # Node.js dependencies
β”‚   └── tailwind.config.js # Styling configuration
β”‚
β”œβ”€β”€ .gitignore            # Git ignore rules
└── README.md           # Project documentation

🚀 Quick Start Guide

Follow these steps to get the project running on your local machine.

1. Prerequisites

Ensure you have the following installed:

  • Python 3.8+
  • Node.js with npm
  • MySQL server
  • Git

2. Clone the Repository

git clone https://github.com/zaidkx7/flipkart-scraper.git
cd flipkart-scraper

3. Backend Setup (FastAPI)

  1. Navigate to the backend directory:

    cd backend
  2. Create and Activate Virtual Environment:

    # Windows
    python -m venv venv
    venv\Scripts\activate
    
    # macOS/Linux
    python3 -m venv venv
    source venv/bin/activate
  3. Install Dependencies:

    pip install -r requirements.txt
  4. Configure Environment Variables: Create a file named .env in the backend folder and add your database credentials:

    # backend/.env
    MYSQL_HOST=localhost
    MYSQL_USER=root
    MYSQL_PASSWORD=your_password
    MYSQL_DB=flipkart
  5. Database Setup: First, make sure your MySQL server is running and create the database:

    -- Run this in your MySQL client
    CREATE DATABASE flipkart;

    Then, create the tables:

    # From the backend directory
    python alchemy/create_tables.py
  6. Start the Backend Server:

    python -m uvicorn api.main:app --host 0.0.0.0 --port 8000 --reload
    
    # or
    
    python api/main.py

    Server will start at http://localhost:8000.
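
To show how the `MYSQL_*` values from the `.env` step could feed the database connection, here is a small sketch that composes a SQLAlchemy-style connection URL. It assumes the variables are already present in the process environment (for example, loaded by a dotenv loader) and assumes the PyMySQL driver. The project's own `database.py` may build its connection differently, and `build_mysql_url` is a hypothetical helper name.

```python
import os
from typing import Mapping
from urllib.parse import quote_plus


def build_mysql_url(env: Mapping[str, str] = os.environ) -> str:
    """Compose a SQLAlchemy-style MySQL URL from the MYSQL_* variables."""
    host = env.get("MYSQL_HOST", "localhost")
    user = env.get("MYSQL_USER", "root")
    # Escape characters such as '@' or ':' that would otherwise break the URL.
    password = quote_plus(env.get("MYSQL_PASSWORD", ""))
    database = env.get("MYSQL_DB", "flipkart")
    # Assumption: the PyMySQL driver; the project may use a different one.
    return f"mysql+pymysql://{user}:{password}@{host}/{database}"
```

Escaping the password matters in practice: a password containing `@` would otherwise be read as the start of the host name.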

4. Frontend Setup (Next.js)

  1. Navigate to the frontend directory: Open a new terminal and run:

    cd frontend
  2. Install Dependencies:

    npm install
    
    # If the install fails (for example, on dependency conflicts), retry with:
    npm install --force
  3. Configure Environment Variables: Create a file named .env.local in the frontend folder:

    # frontend/.env.local
    NEXT_PUBLIC_API_BASE_URL=http://localhost:8000
  4. Start the Frontend Development Server:

    npm run dev

    The app will be available at http://localhost:3000.

With the backend running, the API is available at http://localhost:8000:

  • API Documentation: http://localhost:8000/docs
  • Alternative docs: http://localhost:8000/redoc
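
This README does not list the individual endpoints, but FastAPI serves a machine-readable schema at `/openapi.json` by default, so the routes can be discovered from a running server. The sketch below assumes the project has not changed that default. `list_routes` and `fetch_schema` are hypothetical helper names.

```python
import json
from typing import Dict, List
from urllib.request import urlopen

HTTP_METHODS = {"get", "post", "put", "patch", "delete", "head", "options"}


def list_routes(schema: Dict) -> List[str]:
    """Return 'METHOD /path' strings from an OpenAPI schema dictionary."""
    routes = []
    for path, operations in sorted(schema.get("paths", {}).items()):
        for method in sorted(operations):
            if method in HTTP_METHODS:  # skip non-operation keys such as "parameters"
                routes.append(f"{method.upper()} {path}")
    return routes


def fetch_schema(base_url: str = "http://localhost:8000") -> Dict:
    """Download the schema; /openapi.json is FastAPI's default location."""
    with urlopen(f"{base_url}/openapi.json", timeout=5) as response:
        return json.load(response)
```

With the backend running, `list_routes(fetch_schema())` returns one entry per route and method, which is a quick way to see what the API offers without opening the browser docs.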

Run the Scraper

cd backend
python modules/flipkart/main.py

🎨 Frontend Features

πŸ” Advanced Search & Filtering

  • Real-time search with debouncing
  • Multi-criteria filtering (brand, price, rating, specs)
  • Dynamic price range sliders with Indian currency formatting
  • Category-based browsing
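
The frontend itself is written in TypeScript, but the filtering and Indian currency formatting logic is language-independent. The sketch below shows the idea in Python, the backend's language. The field names `brand`, `price` and `rating`, and both function names, are assumptions for illustration only.

```python
from typing import Dict, Iterable, List, Optional


def format_inr(amount: int) -> str:
    """Format whole rupees with Indian digit grouping (last three digits, then pairs)."""
    digits = str(abs(amount))
    head, tail = digits[:-3], digits[-3:]
    groups = []
    while head:
        groups.insert(0, head[-2:])  # peel off two digits at a time
        head = head[:-2]
    sign = "-" if amount < 0 else ""
    return f"{sign}₹{','.join(groups + [tail])}"


def filter_products(
    products: Iterable[Dict],
    brand: Optional[str] = None,
    max_price: Optional[int] = None,
    min_rating: Optional[float] = None,
) -> List[Dict]:
    """Keep products that satisfy every criterion that was supplied."""
    matches = []
    for product in products:
        if brand is not None and product["brand"].lower() != brand.lower():
            continue
        if max_price is not None and product["price"] > max_price:
            continue
        if min_rating is not None and product["rating"] < min_rating:
            continue
        matches.append(product)
    return matches
```

Indian grouping differs from the Western thousands convention: `format_inr(1234567)` gives `₹12,34,567` rather than `₹1,234,567`.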

πŸ–ΌοΈ Interactive Product Gallery

  • Scrollable image carousel with left/right navigation
  • Thumbnail gallery for quick image selection
  • Keyboard navigation support (arrow keys)
  • Image zoom and full-screen view

🛒 Shopping Cart

  • Add/remove products with quantity management
  • Persistent cart using localStorage
  • Price calculations with tax and shipping
  • Checkout flow with form validation

📱 Responsive Design

  • Mobile-first approach with Tailwind CSS
  • Adaptive layouts for all screen sizes
  • Touch-friendly interactions
  • Fast loading with optimized images

⚡ Performance Features

  • Intelligent caching with 5-minute TTL
  • Client-side data persistence
  • Optimized API calls with fallback mechanisms
  • Loading states and error boundaries
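
The 5-minute TTL cache lives in the TypeScript frontend. The mechanism can be sketched in a few lines of Python: store each value with its timestamp and treat it as missing once it is older than the TTL. The `TTLCache` class and its injectable clock are assumptions for this example, not the project's code.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLCache:
    """A tiny in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without waiting
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def set(self, key: str, value: Any) -> None:
        self._entries[key] = (self._clock(), value)

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self._clock() - stored_at >= self._ttl:
            del self._entries[key]  # expired: drop it so the caller refetches
            return None
        return value
```

A caller checks `get` first and only hits the API (then calls `set`) when it returns `None`, so repeated page views within five minutes reuse the cached response.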

🤝 Contributing

We welcome contributions! Please follow these steps:

  1. Fork the repository
  2. Create a feature branch: git checkout -b feature/amazing-feature
  3. Commit your changes: git commit -m 'Add amazing feature'
  4. Push to the branch: git push origin feature/amazing-feature
  5. Open a Pull Request

Contribution Guidelines

  • Follow existing code style and formatting
  • Add tests for new features
  • Update documentation as needed
  • Ensure all tests pass before submitting

⚠️ Disclaimer

This project is for educational purposes only. Please ensure you comply with:

  • Flipkart's Terms of Service
  • Robots.txt guidelines
  • Rate limiting best practices
  • Local laws and regulations

Always respect website policies and implement appropriate delays between requests.
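
As a sketch of what honoring robots.txt and spacing out requests can look like, the example below uses the standard library's `urllib.robotparser`. The two-second delay, the helper names, and the `example.com` URLs are assumptions for illustration. Appropriate delays depend on the target site's policies.

```python
import time
from typing import Callable, Iterable, Iterator
from urllib.robotparser import RobotFileParser


def make_robots(robots_txt: str) -> RobotFileParser:
    """Build a parser from the text of a robots.txt file."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def polite_urls(
    urls: Iterable[str],
    parser: RobotFileParser,
    user_agent: str = "*",
    delay_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[str]:
    """Yield only the URLs robots.txt allows, pausing between consecutive ones."""
    first = True
    for url in urls:
        if not parser.can_fetch(user_agent, url):
            continue  # disallowed by robots.txt: skip it entirely
        if not first:
            sleep(delay_seconds)
        first = False
        yield url
```

A scraper would iterate over `polite_urls(...)` and request each yielded URL, which keeps disallowed paths out of the crawl and enforces a pause between requests.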

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🆘 Support

If you encounter any issues or have questions:

  1. Check the Issues page
  2. Create a new issue with detailed information
  3. Join our discussions in the repository

πŸ™ Acknowledgments

  • FastAPI for the excellent Python web framework
  • Next.js for the powerful React framework
  • Tailwind CSS for the utility-first CSS framework
  • Radix UI for accessible component primitives
  • Flipkart for providing the data source

⭐ Star this repository if you find it helpful!

Made with ❤️ by Muhammad Zaid
