Intelligent pathfinding system using RRT* algorithms enhanced with Q-learning.
An intelligent pathfinding system that combines the RRT* (Rapidly-exploring Random Tree Star) algorithm with Q-learning reinforcement learning to navigate maze environments. The agent learns optimal sampling strategies through experience, evolving from random exploration to intelligent, goal-directed movement.
- Hybrid Algorithm: Combines traditional RRT* pathfinding with Q-learning reinforcement learning
- Adaptive Learning: Agent learns from experience across multiple episodes with epsilon-decay exploration
- Multiple Sampling Strategies:
- Random sampling for exploration
- Goal-biased sampling for directed progress
- Exploration-focused sampling for unvisited areas
- Path-optimised sampling near promising routes
- Real-time Visualisation: Live turtle graphics animation showing tree growth and pathfinding process
- Model Persistence: Save and load trained Q-tables for continuous learning
- Path Optimisation: RRT* rewiring for shorter, more efficient routes
- Performance Tracking: Training statistics including success rates and convergence metrics
When you run the program, you'll be presented with three options:
Choose an option:
1. Train new RRT* model (100 episodes)
2. Load existing RRT* model and demonstrate
3. Train new RRT* model and demonstrate
- Trains the Q-learning agent for 100 episodes
- Displays training progress with success rates and rewards
- Saves the trained model as `rrt_star_q_model.pkl`
- Loads a pre-trained model (if available)
- Visualises the pathfinding process with turtle graphics
- Shows the learned optimal path in orange
- Tree Growth: Builds a tree of connected nodes from start toward goal
- Rewiring: Continuously optimises paths for shorter routes
- Collision Detection: Ensures paths avoid obstacles
- State Representation: Distance to goal, direction, and nearest node proximity
- Action Space: Four sampling strategies (random, goal-biased, exploration, path-optimised)
- Reward System:
- Progress toward goal: +10 × distance_improvement
- Exploration bonus: +1 for distant nodes
- Time penalty: -1 per iteration
- Success reward: +10 for reaching goal
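The reward terms above could be combined in a single helper along these lines (a sketch; `compute_reward` and its argument names are hypothetical, not the project's actual API):

```python
def compute_reward(distance_improvement: float,
                   is_distant_node: bool,
                   reached_goal: bool) -> float:
    """Combine the reward terms listed above (hypothetical helper)."""
    reward = -1.0                           # time penalty per iteration
    reward += 10.0 * distance_improvement  # progress toward the goal
    if is_distant_node:
        reward += 1.0                       # exploration bonus
    if reached_goal:
        reward += 10.0                      # success reward
    return reward
```

For example, a step that improves the goal distance by 0.5 and adds a distant node earns -1 + 5 + 1 = 5.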
- Exploration Phase: High epsilon (ε=0.3) for random action selection
- Exploitation Phase: Gradual epsilon decay (0.995) toward greedy policy
- Experience Retention: Q-table persistence across training sessions
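A minimal sketch of how this epsilon-greedy schedule might look, assuming four discrete sampling actions and a dict-based Q-table (the names here are illustrative):

```python
import random

ACTIONS = ["random", "goal_biased", "exploration", "path_optimised"]

def select_action(q_table, state, epsilon):
    """Epsilon-greedy: explore with probability epsilon, otherwise act greedily."""
    if random.random() < epsilon:
        return random.choice(ACTIONS)
    q_values = q_table.get(state, {a: 0.0 for a in ACTIONS})
    return max(q_values, key=q_values.get)

# Decay the exploration rate after each training episode:
epsilon = 0.3
for episode in range(100):
    epsilon *= 0.995  # after 100 episodes: 0.3 * 0.995**100, roughly 0.18
```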
- Implements Q-learning with epsilon-greedy policy
- State feature extraction and action selection
- Q-value updates using temporal difference learning
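The temporal-difference update is the standard Q-learning rule. A sketch using the defaults listed under the configuration section (α = 0.1, γ = 0.95), with actions indexed 0–3; the helper name and Q-table layout are assumptions:

```python
def update_q(q_table, state, action, reward, next_state,
             learning_rate=0.1, discount_factor=0.95, n_actions=4):
    """TD update: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    current = q_table.get((state, action), 0.0)
    next_best = max(q_table.get((next_state, a), 0.0) for a in range(n_actions))
    td_target = reward + discount_factor * next_best
    q_table[(state, action)] = current + learning_rate * (td_target - current)
```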
- 20×20 grid environment with walls and free spaces
- Turtle graphics visualisation
- Collision detection and coordinate transformations
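Index arithmetic on a flat tile array might look like the following sketch (hypothetical helpers; the 0 = wall / 1 = free encoding matches the TILES convention used by the project):

```python
GRID_SIZE = 20  # the environment is a 20x20 grid stored as a flat list

def is_free(tiles, x, y):
    """True if (x, y) is inside the grid and not a wall (1 = free, 0 = wall)."""
    if not (0 <= x < GRID_SIZE and 0 <= y < GRID_SIZE):
        return False
    return tiles[y * GRID_SIZE + x] == 1

def edge_is_free(tiles, x0, y0, x1, y1, samples=10):
    """Collision-check an RRT* edge by sampling points along the segment."""
    for i in range(samples + 1):
        t = i / samples
        x = round(x0 + t * (x1 - x0))
        y = round(y0 + t * (y1 - y0))
        if not is_free(tiles, x, y):
            return False
    return True
```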
- Enhanced RRT* with Q-learning integration
- Adaptive sampling based on learned policies
- Tree rewiring for path optimisation
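In outline, rewiring re-parents nearby nodes through a newly added node whenever that shortens their path from the start. A simplified sketch (the `Node` structure and `rewire` helper are illustrative, not the project's exact code):

```python
import math
from dataclasses import dataclass

@dataclass
class Node:
    x: float
    y: float
    cost: float = 0.0             # path length from the start node
    parent: "Node | None" = None

def rewire(new_node, neighbours, search_radius=3.0):
    """Re-parent nearby nodes through new_node when that lowers their cost."""
    for node in neighbours:
        d = math.dist((node.x, node.y), (new_node.x, new_node.y))
        if d <= search_radius and new_node.cost + d < node.cost:
            node.parent = new_node
            node.cost = new_node.cost + d
```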
- Training orchestration and statistics tracking
- Model persistence (save/load functionality)
- Demonstration mode with visualisation
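Saving and loading the Q-table can be as simple as pickling it to `rrt_star_q_model.pkl`, the file name mentioned above; a sketch with hypothetical helper names:

```python
import pickle

MODEL_PATH = "rrt_star_q_model.pkl"

def save_q_table(q_table, path=MODEL_PATH):
    with open(path, "wb") as f:
        pickle.dump(q_table, f)

def load_q_table(path=MODEL_PATH):
    """Return the saved Q-table, or an empty one if no model exists yet."""
    try:
        with open(path, "rb") as f:
            return pickle.load(f)
    except FileNotFoundError:
        return {}
```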
- Green: Start position
- Red: Goal position
- Black: Walls/obstacles
- Blue: RRT* tree edges
- Yellow: Current path to goal
- Orange: Final learned optimal path
Modify the TILES array to create custom maze layouts:
```python
TILES = [
    0, 0, 0, 0, 0,  # 0 = wall
    0, 1, 1, 1, 0,  # 1 = free space
    0, 1, 0, 1, 0,
    0, 1, 1, 1, 0,
    0, 0, 0, 0, 0,
]
```

Adjust learning parameters in `QLearningAgent`:
- `learning_rate`: Q-value update rate (default: 0.1)
- `discount_factor`: Future reward importance (default: 0.95)
- `epsilon`: Exploration rate (default: 0.3)
- `epsilon_decay`: Exploration decay (default: 0.995)
Modify RRT* behaviour in `SmartRRTStar`:
- `search_radius`: Rewiring neighbourhood size (default: 3)
- `max_iterations`: Maximum tree-growth steps (default: 1000)
- `step_size`: Node expansion distance (default: 1)
- Large mazes may require an increased `max_iterations`
- Turtle graphics can be slow for real-time visualisation
- Model convergence depends on maze complexity
- RRT* Algorithm - Karaman & Frazzoli
- Q-Learning - Watkins & Dayan
- Reinforcement Learning: An Introduction - Sutton & Barto
This project is licensed under the MIT License.
- Multi-goal pathfinding
- Dynamic obstacle environments
- Deep Q-Networks (DQN) implementation
- 3D maze environments
- Comparative algorithm benchmarks
- Web-based visualisation interface