Use of custom environment and agents #10
I will upload a multi-agent PPO example; you can refer to it.
I have added a new example; you can find it in xingtian/examples/ma_cases/ppo_share_catch_pigs.yaml.
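For reference, a multi-agent config of this kind roughly follows the structure sketched below. The top-level keys mirror the single-agent examples, while agent_num, api_type, and the environment names used here are assumptions rather than the actual contents of that file:

```python
# Rough sketch (not the actual file contents) of a multi-agent PPO config.
# Top-level keys (alg_para, env_para, agent_para, model_para, env_num) follow
# the single-agent examples; agent_num, api_type, and the names below are
# assumptions for illustration only.
import yaml

config = {
    "alg_para": {"alg_name": "PPO"},
    "env_para": {
        "env_name": "CatchPigsEnv",                      # assumed environment name
        "env_info": {"name": "catch_pigs", "api_type": "unified"},  # assumed keys
    },
    "agent_para": {
        "agent_name": "PPO",
        "agent_num": 2,                                  # two agents sharing one policy
        "agent_config": {"max_steps": 200, "complete_step": 500000},
    },
    "model_para": {
        "actor": {"model_name": "PpoMlp", "state_dim": [4], "action_dim": 2},
    },
    "env_num": 1,
}

print(yaml.safe_dump(config, sort_keys=False))
```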
I have several questions as follows:

[1] Could you explain the differences between the "standalone" and "unified" settings of api_type?

[2] I tried using a custom environment and a custom agent in XingTian. There were 2 agents (i.e., multi-agent) training with 1 environment and 1 learner. The custom environment accepts agent actions of …

[2a] When the training was run with …

[2b] On the other hand, when the training was run with …

[2c] What should I do to achieve two agents (i.e., multi-agent) training with N environments and M learners, with the above custom environment and custom agent interfaces?

[3] Please refer to this portion of the code: xingtian/xt/framework/agent_group.py, lines 436 to 462 (commit 9dee512).
Assume we are training 2 agents (i.e., multi-agent) with 1 environment and 1 learner. When using api_type==standalone, each agent appears to be executed in the same environment instance for one full episode, using separate threads via self.bot.do_multi_job(job_funcs, _paras). Are the two agents stepping through the environment synchronously in this case?
[1] "standalone" means the simulator provides an independent interface for each agent, "unified" means all agents share one interface like smarts |
[2a] You should convert the observation to a NumPy array in your agent module.
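As a minimal sketch of that suggestion (the infer_action name and signature are assumptions about the custom agent, not XingTian's confirmed interface):

```python
# Minimal sketch: normalise observations to NumPy inside the custom agent.
# The method name and signature are assumptions, not XingTian's exact API.
import numpy as np

class MyCustomAgent:
    def infer_action(self, state, use_explore=True):
        # The custom environment may return lists, tuples or nested sequences;
        # convert to a float32 ndarray before feeding the policy network.
        obs = np.asarray(state, dtype=np.float32)
        if obs.ndim == 1:
            obs = obs[np.newaxis, :]     # add a batch dimension for the network
        # ... pass `obs` to the policy network here ...
        return obs.sum()                 # placeholder action
```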
[3] In the "standalone" mode, each agent is running in independent thread, whether they run synchronously depends on the environment, some environment will guarantee all agent running in the same time piont and some environments are completely asynchronous |
I am interested in using XingTian for multi-agent training with the PPO algorithm in the SMARTS environment. An example of using the SMARTS environment is available here.
Could you provide detailed step-by-step instructions and an example of how to use XingTian with our own custom environment for multi-agent training?
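For context, the custom multi-agent environment wrapper I have in mind looks roughly like the sketch below. The init_env/reset/step method names and the dict-per-agent-id convention are my assumptions, not XingTian's confirmed environment interface:

```python
# Rough sketch of a custom multi-agent environment wrapper.  The base-class
# contract (init_env/reset/step) is assumed, not confirmed from XingTian.
import numpy as np

class MyMultiAgentEnv:                     # would subclass XingTian's env base class
    def init_env(self, env_info):
        self.agent_num = env_info.get("agent_num", 2)

    def reset(self):
        # one observation per agent, keyed by agent id ("unified" style)
        return {aid: np.zeros(4, dtype=np.float32) for aid in range(self.agent_num)}

    def step(self, actions):
        obs = {aid: np.zeros(4, dtype=np.float32) for aid in actions}
        rewards = {aid: 0.0 for aid in actions}
        dones = {aid: False for aid in actions}
        return obs, rewards, dones, {}
```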