This repository contains a reinforcement-learning (Q-learning) simulation for modeling surge pricing and driver allocation in a multi-route transportation network. The project simulates how drivers (agents) choose routes, how surge pricing adjusts dynamically, and how drivers learn over time to maximize their earnings.
The key objective is to analyze the emergent pricing patterns, driver distribution, and reward optimization in a dynamic transportation system.
- Multi-Agent Reinforcement Learning: Each driver (agent) uses Q-learning to optimize their route selection based on historical earnings.
- Dynamic Surge Pricing: Fares adjust to supply-demand imbalances, simulating real-world surge pricing.
- Exploration-Exploitation Tradeoff: Uses an epsilon-greedy policy to balance exploration (trying new routes) against exploitation (choosing the most profitable known routes); both mechanics are sketched in the code example after this list.
- Data Visualization: Generates insights through plots of:
  - Driver distribution across routes.
  - Evolution of Q-values (profitability estimates).
  - Surge pricing trends.
  - Average earnings per route.
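
A minimal sketch of the surge-pricing and epsilon-greedy mechanics described above. The helper names and the linear surge form are illustrative assumptions, not the repository's exact code:

```python
import numpy as np

rng = np.random.default_rng(0)

def choose_route(q_values: np.ndarray, epsilon: float) -> int:
    """Epsilon-greedy: with probability epsilon pick a random route
    (exploration); otherwise pick the highest-valued route (exploitation)."""
    if rng.random() < epsilon:
        return int(rng.integers(len(q_values)))
    return int(np.argmax(q_values))

def surge_price(base_price: float, demand: int, supply: int, k: float) -> float:
    """One plausible surge rule: the fare grows linearly with the
    demand/supply imbalance, scaled by the sensitivity parameter k."""
    imbalance = max(0, demand - supply) / max(supply, 1)
    return base_price * (1 + k * imbalance)
```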
The following parameters can be adjusted to explore different scenarios:
| Parameter | Description | Default Value |
|---|---|---|
| `num_routes` | Number of available routes | 10 |
| `num_drivers` | Number of drivers (RL agents) | 20,000 |
| `total_riders` | Total number of riders across all routes | 300 |
| `iterations` | Number of simulation iterations | 100 |
| `base_price` | Base fare for a route | 10.0 |
| `k` | Surge sensitivity parameter | 0.1 |
| `alpha` | Learning rate for Q-value updates | 0.1 |
| `epsilon` | Exploration rate (epsilon-greedy policy) | 0.1 |
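
As a reference point, the defaults above could be collected in the script roughly like this (variable names follow the table; the grouping is illustrative):

```python
# Simulation scale
num_routes = 10        # available routes
num_drivers = 20_000   # drivers (RL agents)
total_riders = 300     # riders across all routes
iterations = 100       # simulation iterations

# Pricing
base_price = 10.0      # base fare per route
k = 0.1                # surge sensitivity

# Q-learning
alpha = 0.1            # learning rate
epsilon = 0.1          # exploration rate (epsilon-greedy)
```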
1. Initialize Q-values: Each driver starts with zero knowledge about route profitability.
2. Driver Route Selection: Each driver selects a route using an epsilon-greedy strategy.
3. Supply and Demand Calculation: The simulation counts the number of drivers on each route.
4. Surge Pricing Calculation: If demand exceeds supply on a route, its fare increases by a surge multiplier.
5. Reward Calculation: Driver earnings are computed from the route price and the level of competition on that route.
6. Q-Learning Update: Drivers update their Q-values using the reward feedback.
7. Iteration & Learning: The process repeats, allowing drivers to learn profitable strategies over time (a runnable sketch of this loop follows the list).
8. Data Visualization: Key metrics are plotted to understand the learning dynamics.
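
A compact, runnable sketch of one plausible version of steps 1-6, using the default parameters. The uniform demand pattern, reward split, and linear surge form are assumptions, not the repository's exact formulas:

```python
import numpy as np

# Defaults from the parameter table above.
num_routes, num_drivers, total_riders, iterations = 10, 20_000, 300, 100
base_price, k, alpha, epsilon = 10.0, 0.1, 0.1, 0.1

rng = np.random.default_rng(0)
riders_per_route = np.full(num_routes, total_riders // num_routes)  # assumed uniform demand
q_values = np.zeros((num_drivers, num_routes))  # step 1: zero initial knowledge

for t in range(iterations):
    # Step 2: epsilon-greedy route choice for every driver (vectorized).
    explore = rng.random(num_drivers) < epsilon
    choices = np.where(explore,
                       rng.integers(num_routes, size=num_drivers),
                       q_values.argmax(axis=1))

    # Step 3: supply on a route = number of drivers that chose it.
    supply = np.bincount(choices, minlength=num_routes)

    # Step 4: surge multiplier when demand exceeds supply (assumed linear form).
    imbalance = np.maximum(0, riders_per_route - supply) / np.maximum(supply, 1)
    price = base_price * (1 + k * imbalance)

    # Step 5: route revenue is split among the drivers competing on that route.
    served = np.minimum(riders_per_route, supply)
    rewards = (price * served / np.maximum(supply, 1))[choices]

    # Step 6: tabular Q-update without discounting, since each iteration
    # is an independent one-step (bandit-style) decision.
    rows = np.arange(num_drivers)
    q_values[rows, choices] += alpha * (rewards - q_values[rows, choices])
```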
- Python 3.x
- NumPy
- Matplotlib
Clone the repository:
```bash
git clone https://github.com/viznuv/Pricing_models_dynamic.git
cd Pricing_models_dynamic
```
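
Then install the two dependencies listed above (assuming the repository does not ship a requirements file):

```bash
pip install numpy matplotlib
```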