a95eb8bd878b43346c4bb2e8e29911dc1ab90638,A2C.py,A2CContinuous,learn,#A2CContinuous#,201
Before Change
returns = np.concatenate([discount_rewards(trajectory["reward"], config["gamma"]) for trajectory in trajectories])
qw_new = self.get_critic_value(all_state)
self.sess.run([self.critic_train], feed_dict={self.critic_state_in: all_state, self.critic_target: returns})
target = np.mean((returns - qw_new) ** 2)
self.sess.run([self.actor_train], feed_dict={self.input_state: all_state, self.actions_taken: all_action, self.target: target})
episode_rewards = np.array([trajectory["reward"].sum() for trajectory in trajectories])  # episode total rewards
After Change
episode_rewards = np.array([trajectory["reward"].sum() for trajectory in trajectories])  # episode total rewards
episode_lengths = np.array([len(trajectory["reward"]) for trajectory in trajectories])  # episode lengths
results = self.sess.run([self.summary_op, self.critic_train, self.actor_train], feed_dict={
    self.critic_state_in: all_state,
    self.critic_target: returns,
    self.input_state: all_state,
    self.actions_taken: all_action,
    self.critic_feedback: qw_new,
    self.critic_rewards: returns,
    self.rewards: np.mean(episode_rewards),
    self.episode_lengths: np.mean(episode_lengths)
})
self.writer.add_summary(results[0], iteration)
self.writer.flush()
reporter.print_iteration_stats(iteration, episode_rewards, episode_lengths, total_n_trajectories)
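Illustration (not part of the commit): the snippets above rely on a `discount_rewards` helper and on per-episode statistics whose definitions are not shown in this record. The sketch below is one plausible way to compute them with NumPy. The body of `discount_rewards`, the toy `trajectories`, and the `gamma` value are assumptions made for illustration and may differ from the project's actual code.

```python
import numpy as np


def discount_rewards(rewards, gamma):
    """Assumed helper: discounted reward-to-go for each step of one trajectory."""
    rewards = np.asarray(rewards, dtype=np.float64)
    discounted = np.zeros_like(rewards)
    running_sum = 0.0
    # Walk backwards so each step accumulates the discounted future rewards.
    for t in reversed(range(len(rewards))):
        running_sum = rewards[t] + gamma * running_sum
        discounted[t] = running_sum
    return discounted


# Hypothetical toy data: two short trajectories.
trajectories = [
    {"reward": np.array([1.0, 1.0, 1.0])},
    {"reward": np.array([0.0, 2.0])},
]
gamma = 0.5  # assumed discount factor, standing in for config["gamma"]

# Same expressions as in the record, applied to the toy data.
returns = np.concatenate([discount_rewards(t["reward"], gamma) for t in trajectories])
episode_rewards = np.array([t["reward"].sum() for t in trajectories])  # episode total rewards
episode_lengths = np.array([len(t["reward"]) for t in trajectories])  # episode lengths

# Scalars of the kind fed to the summary placeholders in the After Change code.
mean_reward = np.mean(episode_rewards)
mean_length = np.mean(episode_lengths)
```

In the After Change code, scalars like `mean_reward` and `mean_length` are what get fed to `self.rewards` and `self.episode_lengths`, so they are logged in the same `sess.run` call that trains the critic and the actor.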
In pattern: SUPERPATTERN
Frequency: 3
Non-data size: 3
Instances
Project Name: arnomoonens/yarll
Commit Name: a95eb8bd878b43346c4bb2e8e29911dc1ab90638
Time: 2017-02-07
Author: x-006@hotmail.com
File Name: A2C.py
Class Name: A2CContinuous
Method Name: learn
Project Name: arnomoonens/yarll
Commit Name: a95eb8bd878b43346c4bb2e8e29911dc1ab90638
Time: 2017-02-07
Author: x-006@hotmail.com
File Name: A2C.py
Class Name: A2C
Method Name: learn
Project Name: rlworkgroup/garage
Commit Name: d8c279e1573ab16fa4c9ba7a89a0cc5012f0caac
Time: 2019-03-28
Author: ryanjulian@users.noreply.github.com
File Name: garage/logger/tensor_board_output.py
Class Name: TensorBoardOutput
Method Name: _record_kv