
HearthGym: A Gymnasium-Based Hearthstone Environment

This repository contains HearthGym, a Gymnasium-based environment for the Hearthstone game. It is built on the fireplace package, a Python implementation of Hearthstone, and is designed for training reinforcement learning agents to play the game.

This repository requires Python 3.12.2 and has been tested on Windows 11.

More documentation can be found via the website: HearthGym.

Setup

Before you can run the Hearthstone environment, you need to install the fireplace package. Because fireplace is not available on PyPI, it must be installed from source; it is included in this repository as a git submodule (if the fireplace folder is empty, check it out first with git submodule update --init). To install the package, run the following commands:

cd fireplace
pip install .

Check whether the card data is included; it is normally tracked via Git LFS in fireplace. If the data is missing, you can download it from the fireplace repository. The data is stored in the fireplace/data/cards folder.

Then install the requirements for this repository. CAUTION: some requirements are commented out in the requirements.txt file and require specific manual installation steps. All remaining requirements can then be installed with the following command:

cd ..
pip install -r requirements.txt

Playing a Game

You can run a game in the Hearthstone environment with the run_game.py script, located in the src/ folder. An example command would be:

python run_game.py \
    --agent1=0 \
    --agent2=0 \
    --class1=2 \
    --class2=2 \
    --games=1 \
    --log_file="logs/random_vs_random.log" \
    --intern_logging=False \
    --render=False \
    --seed=0

The run_game.py script has the following arguments:

  • agent1 and agent2: The agents that will play the game. The agents are defined in the agents folder. (Default: 0 (RandomAgent))
  • class1 and class2: The ids of the classes of the agents. The classes are defined in the fireplace package and the options are listed below. (Default: 2 (Hunter))
  • games: The number of games to play. (Default: 1)
  • log_file: The file where the logs will be saved. (Default: logs/game.log)
  • intern_logging: Whether to display the fireplace internal logs in the console. (Default: False)
  • render: Whether to render observations during the game. (Default: False)
  • seed: The seed for the random number generator. (Default: 42)
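The documented arguments and defaults can be mirrored with an argparse parser. This is a hedged sketch of how such a parser might look; the actual parser in run_game.py may differ in names and types (the string-to-bool lambda is an assumption that matches the `--render=False` style used in the example command):

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    # Hypothetical reconstruction of run_game.py's CLI from the documented defaults.
    parser = argparse.ArgumentParser(description="Run HearthGym games.")
    str2bool = lambda s: s.lower() == "true"  # accepts --flag=True / --flag=False
    parser.add_argument("--agent1", type=int, default=0, help="Agent id for player 1 (0 = RandomAgent)")
    parser.add_argument("--agent2", type=int, default=0, help="Agent id for player 2 (0 = RandomAgent)")
    parser.add_argument("--class1", type=int, default=2, help="Class id for player 1")
    parser.add_argument("--class2", type=int, default=2, help="Class id for player 2")
    parser.add_argument("--games", type=int, default=1, help="Number of games to play")
    parser.add_argument("--log_file", type=str, default="logs/game.log", help="Log file path")
    parser.add_argument("--intern_logging", type=str2bool, default=False,
                        help="Show fireplace internal logs in the console")
    parser.add_argument("--render", type=str2bool, default=False,
                        help="Render observations during the game")
    parser.add_argument("--seed", type=int, default=42, help="Random number generator seed")
    return parser

args = build_parser().parse_args([])  # no CLI args: the documented defaults apply
```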

Running experiments

HearthGym includes an experiment workflow. The run_experiments.py script runs experiments with different agents. Documentation on how to run experiments can be found in the src/experiments/README.md file.

Analysis Dashboard

The src/analysis_dashboard folder contains a dashboard for analyzing the performance of Hearthstone agents. The dashboard is based on the Streamlit framework and is designed to be used with log files generated by the run_game.py script. The dashboard allows you to visualize the performance of different agents and compare their results. Documentation on how to run the dashboard can be found in the src/analysis_dashboard/README.md file.

Game Components

Agents

Currently, the following agents are implemented:

  • (0) RandomAgent: An agent that plays random actions.
  • (1) HumanAgent: An agent that takes actions based on your input.
  • (2) DynamicLookaheadAgent: An agent that uses a dynamic lookahead algorithm to choose the best action. (Based on the implementation by Nils Bohnhof & Jon-Mailes Graeffe)
  • (3) GretiveCompAgent: An agent that uses another scoring agent to choose the best action. (Based on the implementation by Antonio M. Mora García)
  • (4) WeightedScoreAgent: An agent that uses a weighted scoring algorithm to choose the best action. (Based on the implementation by Sebastian Miller)
  • (5) NaiveScoreLookaheadAgent: An agent that uses a naive scoring algorithm to choose the best action. (Based on the implementation by Sebastian Miller)
  • (6) PPOAgent: An agent that uses the PPO algorithm to choose the best action. (trained using the src/models/PPO framework, read the src/models/PPO/README.md file for more information)
  • (7) GreedyAgent: An agent that uses a greedy algorithm to choose the best action. (Based on the Hearthstone AI Competition implementation). Contains 3 heuristic scoring modes:
    • aggro: Aggressive mode, where the agent tries to deal as much damage as possible.
    • control: Control mode, where the agent tries to control the board and minimize damage taken.
    • ramp: Ramp mode, where the agent tries to ramp up its mana and play big minions as soon as possible.
  • (8) BaseDynamicLookaheadAgent: An agent that uses a dynamic lookahead algorithm to choose the best action. (Based on the Hearthstone AI Competition implementation). This agent is not used in the run_game.py script, but it is available for testing purposes.
  • (9) WorldModelAgent: An agent that uses a trained world model to choose the best action. (Trained using the src/models/WorldModel framework, read the src/models/WorldModel/README.md file for more information)
  • (10) EncodedPPOAgent: An agent that uses a trained auto encoder combined with the PPO algorithm to choose the best action. (Trained using the src/models/EncodedPPO framework, read the src/models/EncodedPPO/README.md file for more information)
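All of these agents share one pattern: given an observation and the set of currently valid actions, return one action. A minimal sketch of that interface follows; the class and method names here are illustrative assumptions, not the repository's actual API:

```python
import random
from abc import ABC, abstractmethod

class Agent(ABC):
    """Hypothetical base interface shared by HearthGym agents."""

    @abstractmethod
    def act(self, observation, valid_actions):
        """Return one action chosen from valid_actions given the observation."""

class RandomAgent(Agent):
    """Mirrors agent id 0: picks a uniformly random valid action."""

    def __init__(self, seed=None):
        self.rng = random.Random(seed)  # per-agent RNG for reproducibility

    def act(self, observation, valid_actions):
        return self.rng.choice(valid_actions)
```

Giving each agent its own seeded RNG keeps games reproducible, which matches the `--seed` argument of run_game.py.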

Classes

The following classes are present in the Hearthstone environment. Some of the classes are disabled because they are not viable for 1v1 games.

  1. DeathKnight (Disabled)
  2. Druid
  3. Hunter
  4. Mage
  5. Paladin
  6. Priest
  7. Rogue
  8. Shaman
  9. Warlock
  10. Warrior
  11. Dream (Disabled due to not being a real player class)
  12. Neutral (Disabled due to not being a real player class)
  13. Whizbang (Disabled due to not being a real player class)
  14. DemonHunter (Disabled)

Decks

Decks are stored and created in the src/data/ folder. A README file is present in the src/data/ folder that describes how to add new decks to the environment. HearthGym comes with eighteen pre-built decks, two for each enabled class. The metadata for the decks is stored in the src/data/decks_metadata.csv file and the actual cards in the decks are stored in the src/data/final_decks.csv file. The decks are based on the fireplace package and are designed to be used within HearthGym.

If an invalid class is provided, the game will choose a random class for the agent. (This is printed in the logs.)
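This fallback behaviour can be sketched as follows. The set of enabled class ids and the function name are illustrative assumptions; the real class mapping lives in the fireplace package:

```python
import random

# Illustrative set of enabled class ids; the authoritative mapping is in fireplace.
ENABLED_CLASS_IDS = {2, 3, 4, 5, 6, 7, 8, 9, 10}

def resolve_class(requested_id, rng=random):
    """Return requested_id if it is an enabled class, else pick a random enabled class."""
    if requested_id in ENABLED_CLASS_IDS:
        return requested_id
    chosen = rng.choice(sorted(ENABLED_CLASS_IDS))
    # Mirrors the behaviour described above: the fallback is printed in the logs.
    print(f"Invalid class {requested_id}; falling back to random class {chosen}")
    return chosen
```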

Environment

The Hearthstone environment is implemented in the src/env/hearthstone/HearthGym.py file. The environment follows the Gymnasium interface and uses various components from the fireplace package to simulate the Hearthstone game.

The environment is a turn-based game where two agents play against each other. The environment is fully observable, and the agents can take actions based on the current state of the game.
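Because the environment follows the Gymnasium interface, the usual reset/step loop applies. The sketch below uses a trivial stand-in class (not the real HearthGym) purely to show the shape of that loop; the real environment terminates when a game ends, not after a fixed number of steps:

```python
class StubHearthEnv:
    """Tiny stand-in with a Gymnasium-style API. The real environment lives in
    src/env/hearthstone/HearthGym.py and simulates full games via fireplace."""

    def reset(self, seed=None):
        self.turn = 0
        return {"turn": self.turn}, {}  # (observation, info)

    def step(self, action):
        self.turn += 1
        observation = {"turn": self.turn}
        reward = 0.0
        terminated = self.turn >= 3  # the real env terminates when a hero dies
        truncated = False
        return observation, reward, terminated, truncated, {}

env = StubHearthEnv()
obs, info = env.reset(seed=0)
done = False
while not done:
    action = 0  # a real agent would choose from the action space here
    obs, reward, terminated, truncated, info = env.step(action)
    done = terminated or truncated
```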

State space

Default state space

The state space is a large observation space that includes the following components:

  • Player's mana: The amount of mana that the player has available.
  • Player's max mana: The maximum amount of mana that the player can have.
  • Player's hero health: The health of the player's hero.
  • Player's hero attack: The attack of the player's hero.
  • Player's hero armor: The armor of the player's hero.
  • Player's choice: 0 if the player has no choice, 1 if the player has a choice.
  • Opponent's hero health: The health of the opponent's hero.
  • Opponent's mana: The amount of mana that the opponent has available.
  • Opponent's max mana: The maximum amount of mana that the opponent can have.
  • Opponent's hand size: The number of cards in the opponent's hand.
  • Turn number: The current turn number.
  • Hero power availability: 0 if the hero power is not available, 1 if the hero power is available.
  • Hero weapon attack: The attack of the player's hero weapon.
  • Hero weapon durability: The durability of the player's hero weapon.
  • Player board minion stats: The stats of the minions on the player's board. Including attack/health, cost, and other stats (e.g., charge, taunt, divine shield).
  • Opponent board minion stats: The stats of the minions on the opponent's board. Including attack/health, cost, and other stats (e.g., charge, taunt, divine shield).
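Conceptually, these components are concatenated into one flat, fixed-length vector, with zero-padded slots for up to seven minions per board (seven is the Hearthstone board limit; the per-minion feature count below is an illustrative assumption, not the environment's exact layout):

```python
MAX_BOARD = 7          # Hearthstone board size limit
MINION_FEATURES = 5    # illustrative: attack, health, cost, taunt, divine shield

def flatten_observation(scalars, player_board, opponent_board):
    """Concatenate scalar features and zero-padded board slots into one vector."""
    def pad_board(board):
        flat = []
        for minion in board[:MAX_BOARD]:
            flat.extend(minion)
        # Zero-pad empty slots so the vector length stays constant.
        flat.extend([0.0] * ((MAX_BOARD - len(board[:MAX_BOARD])) * MINION_FEATURES))
        return flat
    return list(scalars) + pad_board(player_board) + pad_board(opponent_board)
```

Fixed-length vectors are what most RL libraries expect, which is why empty board slots are padded rather than omitted.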

Optional - Embedded mode

The optional embedded mode adds three arrays to the state space:

  • Player's hand: The cards in the player's hand. This vector has 384 dimensions and is the average of the encodings of the card descriptions retrieved with the MiniLM-L6-v2 model using the sentence-transformers library. The encoding is done using the src/data/create_data.py file.

Optional - Deck Inclusion mode

The optional deck inclusion mode adds one array to the state space:

  • Player's deck: The cards in the player's deck. This vector has 384 dimensions and is the average of the encodings of the card descriptions retrieved with the MiniLM-L6-v2 model using the sentence-transformers library. The encoding is done using the src/data/create_data.py file. This mode adds the information of what cards the agent has in its deck.
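In both optional modes, a set of cards is reduced to a single 384-dimensional vector by averaging the per-card embeddings. Producing the embeddings requires the MiniLM-L6-v2 model via sentence-transformers; the averaging step itself is simple, as this pure-Python sketch shows (the function name is illustrative):

```python
EMBEDDING_DIM = 384  # output size of the MiniLM-L6-v2 sentence encoder

def average_embeddings(card_embeddings):
    """Element-wise mean of per-card embedding vectors. Pure-Python sketch;
    the real embeddings are produced by src/data/create_data.py."""
    if not card_embeddings:
        return [0.0] * EMBEDDING_DIM  # empty hand/deck maps to the zero vector
    n = len(card_embeddings)
    return [sum(vec[i] for vec in card_embeddings) / n for i in range(EMBEDDING_DIM)]
```

Averaging keeps the observation size constant regardless of how many cards are in the hand or deck.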

Action space

The action space consists of six discrete values that define the actions that the agent can take:

  1. Action type: The type of action that the agent wants to take (if no choice is required from the player). The options are:
    1. End turn
    2. Play card
    3. Use hero power
    4. Attack with minion
    5. Attack with hero
  2. Card index: Index of the card in the player's hand. If the action type is not Play card, this value is ignored.
  3. Attacker index: Index of the attacking minion in the player's field. If the action type is not Attack with minion, this value is ignored.
  4. Target index: Index of the target minion or hero. If the action type is not Attack with minion or Attack with hero, this value is ignored.
  5. Discover index: Index of the chosen card from a Discover effect. If there is no choice required from the player, this value is ignored.
  6. Choose index: Index of the chosen card from a choice effect (e.g., Discover, Adapt). If there is no choice required from the player, this value is ignored.
    • This differs from the Discover index in that it is only used when the choice is presented after playing a card.
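Since most components are ignored for any given action type, an action can be modelled as a six-field record whose irrelevant fields are zeroed. The sketch below illustrates that idea; the field names, constants, and zeroing convention are assumptions for illustration, not the environment's actual encoding:

```python
from dataclasses import dataclass

# Illustrative action-type constants matching the five options listed above.
END_TURN, PLAY_CARD, HERO_POWER, MINION_ATTACK, HERO_ATTACK = range(1, 6)

@dataclass
class Action:
    action_type: int
    card_index: int = 0
    attacker_index: int = 0
    target_index: int = 0
    discover_index: int = 0
    choose_index: int = 0

def normalize(action: Action) -> Action:
    """Zero out the components that are ignored for this action type."""
    a = Action(action_type=action.action_type)
    if action.action_type == PLAY_CARD:
        a.card_index = action.card_index
    if action.action_type == MINION_ATTACK:
        a.attacker_index = action.attacker_index
    if action.action_type in (MINION_ATTACK, HERO_ATTACK):
        a.target_index = action.target_index
    # discover_index / choose_index are only meaningful while a choice is pending.
    return a
```

Normalizing actions this way makes equivalent actions compare equal, which is useful when deduplicating or logging moves.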
