Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
22 changes: 22 additions & 0 deletions .github/workflows/lint.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,22 @@
name: Lint

on:
push:
branches: [main]
pull_request:
branches: [main]

jobs:
lint:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: astral-sh/setup-uv@v4
- name: Check formatting
run: uvx ruff format --check chessmate
- name: Check linting
run: uvx ruff check chessmate
- name: Install dependencies
run: uv sync --project chessmate --extra dev
- name: Type check
run: uv run --project chessmate mypy chessmate/src
18 changes: 18 additions & 0 deletions .github/workflows/test.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,18 @@
name: Tests

on:
push:
branches: [main]
pull_request:
branches: [main]

jobs:
test:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: astral-sh/setup-uv@v4
- name: Install dependencies
run: uv sync --project chessmate --extra dev
- name: Run tests
run: uv run --project chessmate pytest chessmate/tests
11 changes: 11 additions & 0 deletions chessmate/.pre-commit-config.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,11 @@
repos:
- repo: https://github.com/astral-sh/ruff-pre-commit
rev: v0.11.2
hooks:
- id: ruff
args: [--fix]
- id: ruff-format
- repo: https://github.com/pre-commit/mirrors-mypy
rev: v1.14.1
hooks:
- id: mypy
19 changes: 19 additions & 0 deletions chessmate/Makefile
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
.PHONY: lint format typecheck test pre-commit install-hooks

lint:
uv run ruff check .

format:
uv run ruff format .

typecheck:
uv run mypy src

test:
uv run pytest tests

pre-commit:
uv run pre-commit run --all-files

install-hooks:
uv run pre-commit install
230 changes: 230 additions & 0 deletions chessmate/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,230 @@
# Project: Train a Neural Network to Play Chess (AlphaZero-lite)

## Overview

In this project, you will build a simplified version of an AlphaZero-style system that learns to play chess through a combination of supervised learning and self-play.

The goal is **not** to build a strong chess engine, but to:
- Develop intuition for deep learning models in structured domains
- Understand training dynamics and failure modes
- Learn the foundations of reinforcement learning via self-play
- Gain experience building end-to-end learning systems

---

## Learning Objectives

By the end of this project, you should have a working understanding of:

- Designing input representations for structured data
- Training multi-head neural networks (policy + value)
- Debugging training instability
- Monte Carlo Tree Search (MCTS)
- Self-play as a data generation mechanism
- The interaction between models and search

---

## Constraints

- ~5 hours per week
- Local GPU (e.g., RTX 3070)
- Keep implementations **simple and interpretable**
- Avoid heavy abstractions or full frameworks

---

## Recommended Tools

### Core Libraries
- `python-chess` → board representation, legal moves
- `PyTorch` → model and training

### Supporting Tools
- NumPy
- matplotlib (for tracking training)
- tqdm (for progress tracking)
- `MLflow` → experiment tracking, metric logging, model versioning

---

## Key Resources

### Papers
- *AlphaZero (DeepMind, 2017)*
Focus on:
- Training loop
- Loss function
- Self-play structure

### Reference Implementations
- Search for:
- "minimal AlphaZero implementation"
- "AlphaZero from scratch PyTorch"

⚠️ Prefer small, readable repos over production-grade systems.

---

## Project Phases

---

### Phase 1: Chess Representation

**Goal:** Convert chess positions into model-friendly inputs.

Focus on:
- Representing board state as tensors
- Encoding piece types, colors, and basic game state
- Mapping moves into a fixed action space

Questions to think about:
- How should a board be represented numerically?
- How do you handle illegal moves?

---

### Phase 2: Model Architecture

**Goal:** Build a neural network that evaluates chess positions.

Requirements:
- Input: board representation
- Outputs:
- Policy head (probability distribution over moves)
- Value head (scalar evaluation of position)

Focus on:
- Designing a simple CNN-based architecture
- Understanding how shared representations feed multiple outputs

---

### Phase 3: Supervised Pretraining (Bootstrap)

**Goal:** Train the model on real chess games.

Steps:
- Load a dataset of games (e.g., PGN format)
- Train the model to:
- Predict the next move (policy)
- Predict game outcome (value)

Focus on:
- Loss functions (cross-entropy + value regression)
- Training stability
- Evaluating whether the model is learning meaningful patterns

---

### Phase 4: Monte Carlo Tree Search (MCTS)

**Goal:** Implement a search algorithm that improves model decisions.

Requirements:
- Tree structure over game states
- Selection using a UCB-style formula
- Expansion using model predictions
- Backpropagation of value estimates

Focus on:
- Understanding the exploration vs exploitation tradeoff
- How search refines model outputs

---

### Phase 5: Self-Play Loop

**Goal:** Generate training data by having the model play against itself.

Loop:
1. Use MCTS + model to play games
2. Store:
- states
- improved policies (from MCTS)
- final outcomes
3. Train model on this data

Focus on:
- Data generation quality
- Training feedback loops
- Instability and collapse modes

---

### Phase 6: Evaluation

**Goal:** Measure whether your system is improving.

Ideas:
- Play against:
- previous versions of the model
- random agent
- Track:
- win rates
- training loss trends

Optional:
- Compare against a chess engine (for sanity checks)

---

## Milestones

You should aim to reach:

- [ ] Board representation working
- [ ] Model produces valid outputs
- [ ] Supervised model predicts reasonable moves
- [ ] MCTS works independently
- [ ] Self-play generates valid games
- [ ] Training loop improves performance over time

---

## What to Pay Attention To

This project is less about finishing and more about observing:

- When does training fail?
- What does a "bad" model output look like?
- How sensitive is performance to:
- architecture?
- data quality?
- search depth?
- How does MCTS change behavior vs raw model predictions?

---

## Common Pitfalls

- Over-engineering early
- Using too large a model
- Skipping supervised pretraining
- Treating MCTS as a black box
- Ignoring debugging and logging

---

## Stretch Goals (Optional)

- Replace CNN with a transformer
- Experiment with different board encodings
- Add basic UI to play against your model
- Train on subsets of games (e.g., specific openings)
- Analyze learned strategies

---

## Final Notes

This project will likely:
- Break in unexpected ways
- Produce confusing results
- Require iteration and debugging

That's the point.

Focus on:
> Building something you deeply understand, even if it's imperfect.
35 changes: 35 additions & 0 deletions chessmate/pyproject.toml
Original file line number Diff line number Diff line change
@@ -0,0 +1,35 @@
[project]
name = "chessmate"
version = "0.1.0"
description = "AlphaZero-lite: Train a neural network to play chess"
requires-python = ">=3.10"
dependencies = [
"python-chess",
"torch",
"numpy",
"matplotlib",
"tqdm",
"mlflow",
]

[project.optional-dependencies]
dev = [
"ruff",
"pytest",
"pre-commit",
"mypy",
]

[tool.ruff]
line-length = 88

[tool.ruff.lint]
select = ["E", "F", "I"]

[tool.mypy]
python_version = "3.12"
warn_return_any = true
warn_unused_configs = true

[tool.pytest.ini_options]
testpaths = ["tests"]
6 changes: 6 additions & 0 deletions chessmate/src/main.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
def main():
print("chessmate")


if __name__ == "__main__":
main()
2 changes: 2 additions & 0 deletions chessmate/tests/test_placeholder.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
def test_placeholder():
assert True
Loading
Loading