Please note that the dependencies docker image needs to be public and will be publicly visible on the platform. Make sure you do not include in it any file you want to keep private.
Currently, we can only process Docker images built for the amd64 CPU architecture. So, if you are using MacOS with M1 or M2 CPUs, you need to explicitly tell Docker to target amd64 at build time, as follows:

1. Open Docker Desktop Dashboard / Preferences (cog icon) / Turn "Experimental Features" on & apply
2. Create a new builder instance with docker buildx create --use
3. Run docker buildx build --platform linux/amd64 --push -t <image-tag> .

Note that:
- If you can't see an "Experimental Features" option, sign up for the Docker developer program
- You have to push directly to a repository instead of doing it after build
The basic process to submit an agent consists of the following steps:
In our DIAMBRA Agents repo we provide many examples, ranging from a trivial random agent to RL agents trained with state-of-the-art RL libraries.
In the subsections linked below, we guide you through this process, starting from the easiest use case and building upon it to show you how to leverage the most advanced features.
To get a feeling for how an agent submission works, you can leverage our pre-built agents. In the DIAMBRA Agents repo, together with different source code examples, we also provide pre-built docker images (packages) for some of them.
For example, here you can find the pre-built docker image for the random agent corresponding to this source code. As indicated by the python script settings, this random agent will play using a "Random" character in a random game.
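For reference, the random agent logic is essentially the following minimal gymnasium-style loop (a sketch assuming the diambra.arena Python API; the "doapp" game id is just an example, as the actual script picks a random one):

```python
import diambra.arena

# Create the environment and run one episode with random actions
env = diambra.arena.make("doapp", render_mode="human")
observation, info = env.reset(seed=42)

while True:
    # Sample a random action from the environment's action space
    action = env.action_space.sample()
    observation, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        break

env.close()
```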
Using this pre-built docker image, you can easily perform your very first submission on the DIAMBRA platform and appear on the official online leaderboard, simply by typing the following command in your preferred shell:
diambra agent submit diambra/agent-random-1:main
If you want to specify the game on which to run the random agent, use the --gameId command line argument that our pre-built image accepts when submitting the docker image, as follows: diambra agent submit diambra/agent-random-1:main --gameId tektagt. Additional similar use cases are covered in the "Arguments and Commands" page.
After running the command, you will receive a submission confirmation, its identification number, and the URL where you can see the results, similar to the following:
diambra agent submit diambra/agent-random-1:main
🖥️ (178) Agent submitted: https://diambra.ai/submission/178
By default, the submission will select the lowest difficulty level (Easy) of the three available (Easy, Medium, Hard). To change this, you can add the --submission.difficulty argument, as follows:

diambra agent submit --submission.difficulty Medium diambra/agent-random-1:main
As shown here, it is possible to embed your agent files (i.e. scripts and weights) in the dependencies docker image and submit only that. Keep in mind that this image needs to be public and will be visible on the platform, so every user will be able to use it for their own submissions.
Replace username and repository_name.git#ref=branch_name with the appropriate values and your_gh_token with the GitHub token you saved earlier.
Note that, in this case, the dependencies image and command fields we discussed above are merged together and provided as values to the last argument --submission.set-command. Use the same order and change their values according to your specific use case.
Our competition platform allows you to submit your agents and compete with other coders around the globe in epic video game tournaments!
It features a public global leaderboard where users are ranked by the best score achieved by their agents in our different environments.
It also offers you the possibility to unlock cool achievements depending on the performance of your agent.
Submitted agents are evaluated and their episodes are streamed on our Twitch channel.
We aimed to make the submission process as smooth as possible, so try it now! You can find all the details in the sub-pages linked below.
Each time you submit an agent, it is run for one episode to be evaluated. Every submission will generate a score, used for leaderboard positioning, and will unlock achievements.
The score is a function of both the total cumulative reward and the difficulty you selected at submission time, which can be either "Easy", "Medium" or "Hard". Every game has a different difficulty level scale, so a specific mapping is applied, represented by the following table:
The relation linking the score to the total cumulative reward and the difficulty is shown in the picture below. When "Easy" is selected, the score is exactly equal to the total cumulative reward. When "Medium" (or "Hard") is selected, the score is obtained by multiplying the total cumulative reward by a weighting value that varies linearly with the total cumulative reward obtained: it is equal to 1 if you obtain the lowest possible total cumulative reward (i.e. the same score as if "Easy" was selected), and it is equal to the ratio between the game difficulty level for "Medium" (or "Hard") and the game difficulty level for "Easy" if you obtain the highest possible total cumulative reward.
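In symbols, the relation described above can be sketched as follows (our notation, not the platform's: $R$ is the total cumulative reward, $R_{\min}$ and $R_{\max}$ its lowest and highest achievable values, and $d_D$, $d_E$ the game difficulty levels for the selected difficulty $D$ and for "Easy"):

$$ \begin{equation} \text{score} = k_D(R) \, R, \qquad k_D(R) = 1 + \left(\frac{d_D}{d_E} - 1\right) \frac{R - R_{\min}}{R_{\max} - R_{\min}} \end{equation} $$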
So, for example, for Dead or Alive ++, the weighting values for “Medium” and “Hard” vary linearly between
$$ \begin{equation} \begin{gathered} k_M = \left[1.0, \frac{3}{2} \right] = \left[1.0, 1.5 \right] \\ k_H = \left[1.0, \frac{4}{2} \right] = \left[1.0, 2.0 \right] \end{gathered} \end{equation} $$
If you want to test your agent locally before submitting it for evaluation on the platform, you can use the dedicated feature provided by our command line interface. The command pattern is the very same used for submission, except that instead of the submit option you use test.
It can be used to make sure the agent behaves as expected, and to debug it in case it fails, without waiting for the online evaluation pipeline.
It works with plain docker images as well as with submission manifests with privately hosted files and secret tokens, using, respectively, the following commands:
diambra agent test <docker image>
or
diambra agent test --submission.secret token=<my-secret token> --submission.manifest submission.yaml
- Game ID: doapp
- ROM file: doapp.zip
- SHA256 checksum: d95855c7d8596a90f0b8ca15725686567d767a9a3f93a8896b489a160e705c4e
- Title: DEAD OR ALIVE ++ [JAPAN]
- Search keywords: dead-or-alive-japan, 80781, wowroms
difficulty
None
int
characters*
str
tuple
outfits*
*: must be provided as tuples of two elements (for agent_0 and agent_1 respectively) when using the environments in two players mode.
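As an illustration of this tuple convention, a minimal sketch of two players settings (assuming the EnvironmentSettingsMultiAgent settings class from recent diambra.arena versions; the game and character names are just examples):

```python
import diambra.arena
from diambra.arena import EnvironmentSettingsMultiAgent

# Tuple-valued settings: first element for agent_0, second for agent_1
settings = EnvironmentSettingsMultiAgent()
settings.characters = ("Kasumi", "Gen-Fu")
settings.outfits = (1, 2)

env = diambra.arena.make("doapp", settings)
```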
frame
stage
timer
side
wins
character
health
... many more to come soon.
sfiii3n
tektagt
umk3
samsh5sp
kof98umh
mvsc
xmvsf
soulclbr
*Measured on Intel(R) Xeon(R) CPU E5-1650 v3 @ 3.50GHz, using step_ratio = 1 and frame_shape = (128, 128, 1)
Game-specific details provide useful information about each title. They are reported in every game-dedicated page and summarized in the table below.
Whenever possible, games are released with all hidden/bonus characters unlocked.
For every released title, extensive testing has been carried out to make sure the 1P mode is playable, with no bugs, up until the game end.
- ROM file: kof98umh.zip
- SHA256 checksum: beb7bdea87137832f5f6d731fd1abd0350c0cd6b6b2d57cab2bedbac24fe8d0a
- Title: The King Of Fighters '98: Ultimate Match HERO
- Search keywords: allmyroms
fighting_style*
ultimate_style*
character_1
character_2
character_3
power_bar
special_attacks
bar_type
- ROM file: mvsc.zip
- SHA256 checksum: 6f63627cc37c554f74e8bf07b21730fa7f85511c7d5d07449850be98dde91da8
- Title: marvel vs capcom clash of super heroes
- Search keywords: marvel-vs.-capcom-clash-of-super-heroes-euro-980123, 5511
health_1
health_2
active_character
super_bar
super_count
partner
partner_attacks
- ROM file: samsh5sp.zip
- SHA256 checksum: adf33d8a02f3d900b4aa95e62fb21d9278fb920b179665b12a489bd39a6c103d
- Title: SAMURAI SHODOWN V SPECIAL
- Search keywords: samurai-shodown-v-special, 100347
rage_on
rage_used
action_move
action_attack
weapon_lost
weapon_fight
rage_bar
weapon_bar
- ROM file: sfiii3n.zip
- SHA256 checksum: 7239b5eb005488db22ace477501c574e9420c0ab70aeeb0795dfeb474284d416
- Title: STREET FIGHTER III 3RD STRIKE: FIGHT FOR THE FUTUR [JAPAN] (CLONE)
- Search keywords: street-fighter-iii-3rd-strike-fight-for-the-futur-japan-clone, 106255
super_art*
stun_bar
stunned
super_type
super_max
- ROM file: soulclbr.zip
- SHA256 checksum: a07a1a19995d582b56f2865783c5d7adb7acb9a6ad995a26fc7c4cfecd821817
- Title: soul calibur
- Search keywords: soul-calibur, 106959
- ROM file: tektagtac.zip
- SHA256 checksum: 57be777eae0ee9e1c035a64da4c0e7cb7112259ccebe64e7e97029ac7f01b168
- Title: TEKKEN TAG TOURNAMENT [ASIA] (CLONE)
- Search keywords: tekken-tag-tournament-asia-clone, 108661
bar_status
- ROM file: umk3r10.zip
- SHA256 checksum: f48216ad82f78cb86e9c07d2507be347f904f4b5ae354a85ae7c34d969d265af
- Title: ULTIMATE MORTAL KOMBAT 3 (CLONE)
- Search keywords: ultimate-mortal-kombat-3-clone, 109574
tower
aggressor_bar
- ROM file: xmvsf.zip
- SHA256 checksum: 833aa46af63a3ad87f69ce2bacd85a4445f35a50e3aff4f793f069b205b51c60
- Title: x-men vs street fighter
- Search keywords: x-men-vs.-street-fighter-usa-961004, 8769
With the following additional term in the denominator:
The normalization term in the denominator ensures that a round won with a perfect (i.e. without losing any health) always generates the same maximum total cumulative reward (for the round) across all games, equal to $N_c/N_k$.
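As a consistency check (a sketch using our notation: $\Delta HB$ is the health bar range, $N_c$ the number of characters per side, and $N_k$ the number of rounds needed to win, with the assumption that the denominator contains the term $N_k \, \Delta HB$): a perfect round removes the opponent's full health across all characters, $N_c \, \Delta HB$, so the round's total cumulative reward is

$$ \frac{N_c \, \Delta HB}{N_k \, \Delta HB} = \frac{N_c}{N_k} $$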
This section presents a detailed description of the examples provided with the DIAMBRA Arena repository. They cover the most important use cases, and can be used as templates and starting points to explore all the features of the software package.
These examples show how to leverage both single and two players modes, how to set up environment wrappers with all their options, how to record human expert demonstrations and how to load them to apply imitation learning.
Every example has a dedicated page that can be reached via the sidebar menu or the list below.
Source code for examples described in what follows can be found in the code repository, here.
It is possible to activate emulator native rendering while running environments (i.e. bringing up the emulator graphics window). The CLI provides a specific flag for this purpose, but currently this is supported only on Linux, while Windows and MacOS users have to configure an XServer and link it to the environment container. The next tabs provide hints for each context.
On Linux, the CLI allows rendering the emulator natively on the host: the user only needs to add the -g flag to the run command, as follows:
diambra run -g python diambra_arena_gist.py
Activating emulator native rendering will open a GUI window where the game executes. Currently, this feature is affected by a known issue: the mouse cursor disappears and remains constrained inside that window. To re-acquire control of the OS XServer, you can cycle through the active windows using the key combination ALT+TAB and highlight a different one.
Running environments with native emulator GUI support on Windows currently requires the user to set up a virtual XServer and connect it to the container. We cannot provide support for this use case at the moment, but we plan to implement this feature in the near future.
A virtual XServer that in our experience proved to be effective is VcXsrv Windows X Server.
Running environments with native emulator GUI support on MacOS currently requires the user to set up a virtual XServer and connect it to the container. We cannot provide support for this use case at the moment, but we plan to implement this feature in the near future.
A virtual XServer that in our experience proved to be effective is XQuartz 2.7.8, coupled with socat, which can be installed via brew install socat.
More complex scripts can be built in similar ways, for example continuously performing user-defined combo moves, or adding more elaborate choice mechanics, as illustrated in the sketch below. This would still require deciding the tactics in advance, properly translating knowledge into code.
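As an illustration, here is a minimal sketch of such a scripted agent (assuming a discrete action space configured via the v2 settings API; the game id and the action indices making up the combo are purely hypothetical):

```python
import diambra.arena
from diambra.arena import EnvironmentSettings, SpaceTypes

# Hypothetical combo: a fixed, repeating sequence of discrete action indices
COMBO = [3, 3, 1, 5]

settings = EnvironmentSettings()
settings.action_space = SpaceTypes.DISCRETE

env = diambra.arena.make("doapp", settings)
observation, info = env.reset()

step = 0
while True:
    # Always play the next action of the pre-defined combo
    observation, reward, terminated, truncated, info = env.step(COMBO[step % len(COMBO)])
    step += 1
    if terminated or truncated:
        break

env.close()
```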
An alternative approach to scripted agents is adopting reinforcement learning, so that the agent improves by leveraging its own experience; the following sections provide examples of how to do that with the most important libraries in the domain.
DIAMBRA Arena natively provides interfaces to SheepRL, Stable Baselines 3, and Ray RLlib, making it easy to train models on our environments with these libraries. Each library-dedicated page presents both basic and advanced examples.
DIAMBRA Arena also provides a working interface with Stable Baselines 2, but it is deprecated and will be discontinued in the near future.
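For instance, a basic training run with the Stable Baselines 3 interface looks roughly like the following (a sketch based on the library-dedicated examples; import paths and signatures may differ slightly across versions):

```python
from diambra.arena import EnvironmentSettings, WrappersSettings
from diambra.arena.stable_baselines3.make_sb3_env import make_sb3_env
from stable_baselines3 import PPO

# Create the vectorized environment through the DIAMBRA SB3 interface
env, num_envs = make_sb3_env("doapp", EnvironmentSettings(), WrappersSettings())

# Train a PPO agent for a short demo run
agent = PPO("MultiInputPolicy", env, verbose=1)
agent.learn(total_timesteps=512)

env.close()
```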
How to run it locally:
diambra run python agent.py --trainedModel /absolute/path/to/checkpoint/ --envSpaces /absolute/path/to/environment/spaces/descriptor/
diambra run python agent.py --cfgFile /absolute/path/to/config.yaml --trainedModel "model_name"
and the configuration file to be used is the same one used for training, like the one reported in the previous paragraph.
Use of this functionality can be found in this example.
Implementation examples and templates can be found in the code repository, here.
By default, this wrapper acts before any other wrapper is applied. Thus, it will store the original, native, unprocessed observations (as well as rewards and actions) as generated by the base environment. This guarantees better generality and transferability of the generated dataset, but requires preprocessing at load time.
DIAMBRA Arena provides a simple dedicated class, DiambraDataLoader, demonstrating how to load and minimally process recorded episodes. It only requires the dataset folder path as input parameter, and can be customized by adding the additional processing operations required. The data loader class is created as follows:
```python
from diambra.arena.utils.diambra_data_loader import DiambraDataLoader

data_loader = DiambraDataLoader(dataset_path)
```
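A typical consumption loop could then look like the following (a sketch: the reset/step pattern and the returned tuple follow the repository example as far as we recall, so double-check them against the linked implementation):

```python
# Iterate over the recorded transitions until the dataset has been
# looped through once (reset() returns the number of completed loops)
n_loops = data_loader.reset()
while n_loops == 0:
    observation, action, reward, terminated, truncated, info = data_loader.step()
    # ... apply custom preprocessing / feed your imitation learning pipeline ...
    if terminated or truncated:
        n_loops = data_loader.reset()
```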
Implementation of this class can be found in the code repository, here.
The DIAMBRA Arena software package is subject to our Terms of Use. By using it, you accept them in full.
This project is an experiment that applies the style of famous paintings, in real time, to popular retro fighting games, which are provided as Reinforcement Learning environments by DIAMBRA.
This section contains a collection of projects that have been developed using DIAMBRA.
If you want to add yours, you can fork the docs repo and submit a Pull Request or get in touch on our Discord server and send us the material.
Evaluate the quality of Large Language Models (LLMs) by having them fight in real time in Street Fighter III. Who is the best, OpenAI or MistralAI? Let them fight! Open source code and ranking.
Making LLMs fight in real time assesses their speed and their reasoning abilities: LLMs have to quickly assess their environment and take actions based on it.
LLMs are different from Reinforcement Learning (RL) models, which are based on maximizing a reward function. LLMs have more prior knowledge about the concepts of fighting, video games, Street Fighter, available guides, etc. This is a different approach that can help advance how AIs understand and act within their environment.
This project was made by teams from phospho and Quivr.
This project is a proof of concept that customizes Tencent AI TLeague, a framework for Multi-Agent Reinforcement Learning based on distributed competitive self-play, and applies it to DIAMBRA Environments.
In 2021, we organized the very first AI Tournament leveraging DIAMBRA, in collaboration with Reinforcement Learning Zurich (RLZ), a community of researchers, data scientists and software engineers interested in applications of Reinforcement Learning and AI.
Participants trained an AI algorithm to effectively play Dead Or Alive ++. The three best algorithms participated in the final event and competed for the 1400 CHF prize pool!
Depending on the operating system used, specific permissions may be needed in order to read the keyboard inputs:
- On Windows, by default no specific permissions are needed. However, if you have some third-party security software, you may need to white-list Python.
- On Linux, you need to add the user to the input group: sudo usermod -aG input $USER
- On Mac, you may need to use the settings application to allow your program to access the input devices (see this reference).

The official inputs python package reference guide can be found at this link.
Custom wrappers are specified via the wrappers setting, a list in which every element contains the WrapperClass to be applied and the kwargs dictionary to be passed to it:

```python
wrappers_settings.wrappers = [
    [CustomWrapper1, {"setting1_1": value1_1, "setting1_2": value1_2}],
    [CustomWrapper2, {"setting2_1": value2_1, "setting2_2": value2_2}],
]
```
The custom wrappers are applied after all the activated built-in ones, if any.
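For completeness, a custom wrapper is simply a standard gymnasium wrapper class; here is a minimal sketch (the class and its settings are hypothetical, not part of DIAMBRA Arena):

```python
import gymnasium as gym

class RewardClippingWrapper(gym.Wrapper):
    """Hypothetical custom wrapper clipping rewards to a given range."""
    def __init__(self, env, min_reward=-1.0, max_reward=1.0):
        super().__init__(env)
        self.min_reward = min_reward
        self.max_reward = max_reward

    def step(self, action):
        observation, reward, terminated, truncated, info = self.env.step(action)
        # Clip the raw reward before it reaches the agent
        reward = max(self.min_reward, min(self.max_reward, reward))
        return observation, reward, terminated, truncated, info

wrappers_settings.wrappers = [[RewardClippingWrapper, {"min_reward": -1.0, "max_reward": 1.0}]]
```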