780dcd9fd372afa8524a6515eec6a4c90b1494c9,Reinforcement_learning_TUT/8_Actor_Critic_Advantage/AC_CartPole.py,Actor,__init__,#Actor#Any#Any#Any#,22

Before Change


            )

        with tf.name_scope("loss"):
            neg_log_prob = -tf.log(self.acts_prob[0, self.act_index])  # loss without advantage
            self.loss = tf.reduce_mean(neg_log_prob * self.advantage)  # advantage (TD_error) guided loss

        with tf.name_scope("train"):

After Change


        )

        with tf.variable_scope("squared_TD_error"):
            self.td_error = tf.reduce_mean(self.r + GAMMA * self.v_next - self.v)
            self.loss = tf.square(self.td_error)    # TD_error = (r + gamma * V_next) - V_eval
        with tf.variable_scope("train"):
            self.train_op = tf.train.AdamOptimizer(lr).minimize(self.loss)
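The After Change replaces the actor's advantage-weighted log-probability loss with the critic's squared TD error. A minimal NumPy-free sketch of that computation, using hypothetical scalar values for one transition (the variable names mirror the snippet; the numbers are illustrative, not from the source):

```python
# Discount factor, matching GAMMA in the snippet above
GAMMA = 0.9

# Hypothetical transition values (not from the source commit)
r = 1.0       # reward received from the environment
v = 2.5       # critic's current estimate V(s)
v_next = 2.8  # critic's estimate for the next state V(s')

# TD_error = (r + gamma * V_next) - V_eval, as computed in squared_TD_error
td_error = r + GAMMA * v_next - v

# The critic minimizes the squared TD error
loss = td_error ** 2

print(td_error)  # 1.02
print(loss)      # 1.0404
```

The same `td_error` value doubles as the advantage signal fed to the actor's loss in the Before Change snippet, which is why the two losses are refactored together.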
In pattern: SUPERPATTERN

Frequency: 3

Non-data size: 5

Instances


Project Name: MorvanZhou/tutorials
Commit Name: 780dcd9fd372afa8524a6515eec6a4c90b1494c9
Time: 2017-03-09
Author: morvanzhou@gmail.com
File Name: Reinforcement_learning_TUT/8_Actor_Critic_Advantage/AC_CartPole.py
Class Name: Actor
Method Name: __init__


Project Name: reinforceio/tensorforce
Commit Name: 67f74e592427d15578eae688f677952d8bd98d3a
Time: 2020-04-25
Author: alexkuhnle@t-online.de
File Name: tensorforce/core/distributions/categorical.py
Class Name: Categorical
Method Name: tf_parametrize


Project Name: reinforceio/tensorforce
Commit Name: 98fe0142e39af4a9a2450ca3f3e48a53152f5091
Time: 2016-12-29
Author: k@ifricke.com
File Name: tensorforce/updater/deep_q_network.py
Class Name: DeepQNetwork
Method Name: create_training_operations