
Is there some limitation with the dimensions of actions and observations? #27

Open
paapu88 opened this issue Dec 21, 2023 · 5 comments

Labels: enhancement (New feature or request)


paapu88 commented Dec 21, 2023

Dear Developers,

I'm getting the following error when running the code below:

pearl/neural_networks/common/value_networks.py", line 262, in get_q_values
    x = torch.cat([state_batch, action_batch], dim=-1)
RuntimeError: Tensors must have same number of dimensions: got 4 and 2

Am I doing something stupid, or is there some limitation (for instance, that the action and observation spaces must have the same number of dimensions)?

Best regards, Markus

""" 
copy pasted from 
https://github.com/facebookresearch/Pearl?tab=readme-ov-file#quick-start

with small modifications for training,


"""


from pearl.pearl_agent import PearlAgent
from pearl.action_representation_modules.one_hot_action_representation_module import (
    OneHotActionTensorRepresentationModule,
)
from pearl.policy_learners.sequential_decision_making.deep_q_learning import (
    DeepQLearning,
)
from pearl.replay_buffers.sequential_decision_making.fifo_off_policy_replay_buffer import (
    FIFOOffPolicyReplayBuffer,
)
from pearl.utils.instantiations.environments.gym_environment import GymEnvironment
from pearl.action_representation_modules.identity_action_representation_module import (
    IdentityActionRepresentationModule,
)
from pearl.utils.functional_utils.train_and_eval.online_learning import online_learning

from time import sleep
import gym
from tqdm import tqdm
import torch
import matplotlib.pyplot as plt
import numpy as np

# env = GymEnvironment("highway-v0", render_mode="human")

# env = GymEnvironment("CartPole-v1", render_mode="human")
env = GymEnvironment("CarRacing-v2", render_mode="human", continuous=False)
observation, action_space = env.reset()
print(f"observation")
print(observation)
print(f"action_space")
attributes = dir(action_space)
print(attributes)
print(f"action dim: {action_space.action_dim}")
# print(f"actions: {action_space.actions}")

# sys.exit()

agent = PearlAgent(
    policy_learner=DeepQLearning(
        state_dim=9216,
        action_space=action_space,
        hidden_dims=[64, 64],
        training_rounds=20,
        action_representation_module=OneHotActionTensorRepresentationModule(
            max_number_actions=5
        ),
    ),
    replay_buffer=FIFOOffPolicyReplayBuffer(10_000),
)

# experiment code
number_of_steps = 10000
record_period = 1000

info = online_learning(
    agent=agent,
    env=env,
    number_of_steps=number_of_steps,
    print_every_x_steps=1000,
    record_period=record_period,
    learn_after_episode=True,
)
torch.save(info["return"], "CarRacing-DQN-return.pt")
plt.plot(record_period * np.arange(len(info["return"])), info["return"], label="DQN")
plt.legend()
plt.show()
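The failure in the traceback can be reproduced in isolation. This is a minimal sketch; the tensor shapes are assumptions based on CarRacing-v2's 96×96×3 image observations and the 5-action one-hot encoding above:

```python
import torch

# Image observation batch: (batch, height, width, channels), a 4-D tensor,
# as produced by CarRacing-v2's 96x96 RGB frames
state_batch = torch.zeros(32, 96, 96, 3)

# One-hot action batch: (batch, num_actions), a 2-D tensor
action_batch = torch.zeros(32, 5)

# torch.cat requires all input tensors to have the same number of
# dimensions, so concatenating a 4-D batch with a 2-D batch fails
try:
    torch.cat([state_batch, action_batch], dim=-1)
except RuntimeError as e:
    print(e)  # Tensors must have same number of dimensions: got 4 and 2
```

This matches the error in the report: the mismatch is in the number of tensor dimensions (4 vs. 2), not in the sizes of the spaces themselves.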
@rodrigodesalvobraz (Contributor)

I'm looking into this and will get back to you.

@BillMatrix

@paapu88 is your observation space an image or a video in your environment?

@paapu88 (Author)

paapu88 commented Jan 12, 2024

@jb3618columbia (Contributor)

I think the error occurs because you are using a VanillaQValueNetwork, which concatenates the state and action tensors and therefore requires them to have the same number of dimensions. For image inputs, you want to use the CNNQValueNetwork as the network type (we need to enable that for deep Q-learning).
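Until that is enabled, flattening the image batch brings both tensors down to two dimensions, so the concatenation in get_q_values succeeds. A minimal sketch of the shape arithmetic (the 96×96×3 observation shape is an assumption based on CarRacing-v2; this is not Pearl's own preprocessing):

```python
import torch

state_batch = torch.zeros(32, 96, 96, 3)  # 4-D image observation batch
action_batch = torch.zeros(32, 5)         # 2-D one-hot action batch

# Flatten everything except the batch dimension:
# (32, 96, 96, 3) -> (32, 96*96*3) = (32, 27648)
flat_states = state_batch.flatten(start_dim=1)

# Both tensors are now 2-D and can be concatenated along the feature axis
x = torch.cat([flat_states, action_batch], dim=-1)
print(x.shape)  # torch.Size([32, 27653])
```

Note that a fully flattened 96×96 RGB frame has 96 * 96 * 3 = 27648 features, not the 9216 (= 96 * 96) passed as state_dim in the reproducer, which would only match a single grayscale channel.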

rodrigodesalvobraz self-assigned this Jan 12, 2024
@rodrigodesalvobraz (Contributor)

We are going to implement a fix.

rodrigodesalvobraz removed their assignment Jan 29, 2024
rodrigodesalvobraz added the enhancement (New feature or request) label Mar 8, 2024