first version of training loop for dyna #5

Open

Misterhoonster wants to merge 32 commits into KempnerInstitute:main from Misterhoonster:hoon-dyna

Contributor

Misterhoonster commented Apr 8, 2025

Could you take a look at my starter training loop logic?

Misterhoonster added 2 commits

April 5, 2025 23:51


          created dyna skeleton + setup for training loop

81c6ad0


          training loop v1

4e8ffab

wcarvalho reviewed

examples/crafter/dynax.py (outdated, resolved)

wcarvalho reviewed

examples/crafter/dynax.py (outdated, resolved)

wcarvalho reviewed

examples/crafter/dynax.py (resolved)

wcarvalho reviewed

examples/crafter/dynax.py (outdated, resolved)

wcarvalho reviewed

examples/crafter/dynax.py Outdated


		# --- Network Definition ---

		class RecurrentQNetwork(nn.Module):

Collaborator

wcarvalho Apr 11, 2025

Have a single Network with

setup
initialize_carry
__call__, for when it's used with real data
apply_model

Collaborator

wcarvalho Apr 11, 2025

__call__: takes in an observation, applies the RNN, and returns state + predictions (q_vals)
apply_model: takes in a state and an action, applies the model (for us, the environment; for somebody else, a neural network), and returns next_state + predictions (q_vals)

Collaborator

wcarvalho Apr 11, 2025

__call__(self, observation) --> [AgentState, Predictions]

apply_model(self, state, action) --> [AgentState, Predictions]

Collaborator

wcarvalho Apr 11, 2025

Think of this as the Agent
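
To make the suggested interface concrete, here is a minimal, framework-free sketch of an agent with `initialize_carry`, `__call__` (real data), and `apply_model` (simulated data). It is only an illustration under stated assumptions, not the code in this PR: `toy_env_step`, the scalar "RNN" update, and the fields of `AgentState` and `Predictions` are hypothetical, and `__init__` stands in for a Flax-style `setup`. The sketch also passes the previous `AgentState` into `__call__` explicitly, which the signature in the comment above leaves out.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class AgentState:
    rnn_carry: float  # stand-in for the recurrent hidden state
    env_state: int    # state the model can be stepped from


@dataclass(frozen=True)
class Predictions:
    q_vals: List[float]  # one Q-value per action


def toy_env_step(env_state: int, action: int) -> int:
    """Hypothetical deterministic environment: the action is added to the state."""
    return env_state + action


class Agent:
    def __init__(self, num_actions: int, model: Callable[[int, int], int]):
        # Plays the role of `setup`: store everything both entry points share.
        self.num_actions = num_actions
        self.model = model  # here the environment; could be a learned network

    def initialize_carry(self) -> AgentState:
        return AgentState(rnn_carry=0.0, env_state=0)

    def _update(self, carry: float, env_state: int) -> Tuple[AgentState, Predictions]:
        # Toy recurrent update and Q-head, shared by real and simulated steps.
        new_carry = 0.5 * carry + float(env_state)
        q_vals = [new_carry * (a + 1) for a in range(self.num_actions)]
        return AgentState(new_carry, env_state), Predictions(q_vals)

    def __call__(self, state: AgentState, observation: int) -> Tuple[AgentState, Predictions]:
        # Real data: in this toy setting the observation is the environment state.
        return self._update(state.rnn_carry, observation)

    def apply_model(self, state: AgentState, action: int) -> Tuple[AgentState, Predictions]:
        # Simulated data: step the model from the stored state, then update.
        next_env_state = self.model(state.env_state, action)
        return self._update(state.rnn_carry, next_env_state)


agent = Agent(num_actions=2, model=toy_env_step)
state = agent.initialize_carry()
state, preds = agent(state, observation=2)                  # step on real data
sim_state, sim_preds = agent.apply_model(state, action=1)   # step inside the model
```

Both entry points return the same `(AgentState, Predictions)` pair, so downstream code such as a loss or a simulation policy would not need to know whether a step came from real experience or from the model.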

wcarvalho reviewed

examples/crafter/dynax.py (outdated, resolved)

examples/crafter/dynax.py (outdated, resolved)

examples/crafter/dynax.py (outdated, resolved)

examples/crafter/dynax.py (outdated, resolved)

examples/crafter/dynax.py (outdated, resolved)

examples/crafter/dynax.py (outdated, resolved)

examples/crafter/dynax.py (outdated, resolved)

Misterhoonster added 6 commits

April 12, 2025 16:52


          removed runnerstate

2a49273


          change var names to runner_state

227f070


          created dyna agent

1b94a63


          renamed TimeStep.env_state to Timestep.state

ff2ff9c


          updated actor_step fn to work with new DynaAgent, added TimestepWrapper

abe5b4d


          autoreset false on crafter

bdab0e6

Contributor Author

Misterhoonster commented Apr 14, 2025

Hey Wilka, just updated my dyna file with your suggested changes!

Misterhoonster added 2 commits

April 23, 2025 01:18


          added simpolicy, finished total_loss fn

820b218


          finished loss class v1

ac9382e

Contributor Author

Misterhoonster commented Apr 25, 2025

Pushed the new version with the Loss class! Can you take a look, please?

Misterhoonster added 11 commits

May 4, 2025 17:44


          removed rolling windows

1f04915


          added encoder class

e0549b8


          integrated encoder

8344fde


          fixed simpolicy for actor and sim

d2a547c


          added rolling windows back with window_size=1

076b58c


          logging v1

4d9f455


          fixed optimizer + added incremental updates for target network

aa87b2a


          fixed naming and typing issues

ce77e16


          fix env.step syntax

c45b9cc


          added q_heads and MLP class to DynaAgent

144e778


          cleaned up logger; added gradient logging

1eb3749

Misterhoonster added 11 commits

May 6, 2025 23:09


          set up wandb

a83917f


          removed jaxneurorl import; added fns to file

73e0ff4


          first runnable version

6be7f65


          update dynaagent initialize carry

ffa896a


          bugs fixed up to and including simulate_n_trajectories

25eaca0


          bug fixes up to _learn_step

6f3f80a


          full code runs without errors

aa4fab2


          added reqs

370de3e


          tree_map to tree.map

bcd90f9


          fix timesteps counter

7efd25d


          changed reqs to cuda

e921bec
