9caa24c58689c1d6f3d982f623ceab8f78e7362d,softlearning/algorithms/diayn.py,DIAYN,_init_critic_update,#DIAYN#,164
Before Change
Q-function update rule.
self._qf_t = self._qf.get_output_for(
    self._obs_pl, self._action_pl, reuse=True)  # N
(obs, z_one_hot) = self._split_obs()
if self._include_actions:
After Change
if self._include_actions:
    logits = self._discriminator([obs, self._action_pl])
else:
    logits = self._discriminator([obs])
reward_pl = -1 * tf.nn.softmax_cross_entropy_with_logits(labels=z_one_hot,
                                                         logits=logits)
reward_pl = tf.check_numerics(reward_pl, "Check numerics (1): reward_pl")
p_z = tf.reduce_sum(self._p_z_pl * z_one_hot, axis=1)
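The reward computed in the "After Change" lines can be illustrated without TensorFlow. The following is a minimal sketch, not the repository's code: the helper names `log_softmax` and `diayn_reward` and the `eps` constant are hypothetical, and the final subtraction of log p(z) is an assumption based on the DIAYN pseudo-reward, since the excerpt stops right after computing `p_z`.

```python
import math


def log_softmax(logits):
    # Numerically stable log-softmax over one list of logits.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum for x in logits]


def diayn_reward(logits, z_one_hot, p_z, eps=1e-6):
    # Negative softmax cross-entropy against a one-hot label is the
    # discriminator's log-probability of the active skill, log q(z | s).
    log_q = log_softmax(logits)
    log_q_z = sum(l * z for l, z in zip(log_q, z_one_hot))
    # Prior probability of the active skill, mirroring
    # reduce_sum(p_z * z_one_hot, axis=1) for a single sample.
    p_active = sum(p * z for p, z in zip(p_z, z_one_hot))
    # Assumption: the pseudo-reward subtracts log p(z); eps is a
    # hypothetical guard against log(0).
    return log_q_z - math.log(p_active + eps)


# Four skills with a uniform prior; skill 1 is active.
prior = [0.25, 0.25, 0.25, 0.25]
skill = [0, 1, 0, 0]
uninformed = diayn_reward([0.0, 0.0, 0.0, 0.0], skill, prior)
confident = diayn_reward([0.0, 10.0, 0.0, 0.0], skill, prior)
```

Under these assumptions, a discriminator that cannot tell the skills apart yields a reward near zero, while one that confidently identifies the active skill yields a reward close to -log p(z).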
In pattern: SUPERPATTERN
Frequency: 3
Non-data size: 4
Instances
Project Name: rail-berkeley/softlearning
Commit Name: 9caa24c58689c1d6f3d982f623ceab8f78e7362d
Time: 2018-10-20
Author: hartikainen@berkeley.edu
File Name: softlearning/algorithms/diayn.py
Class Name: DIAYN
Method Name: _init_critic_update
Project Name: rail-berkeley/softlearning
Commit Name: 9caa24c58689c1d6f3d982f623ceab8f78e7362d
Time: 2018-10-20
Author: hartikainen@berkeley.edu
File Name: softlearning/algorithms/diayn.py
Class Name: DIAYN
Method Name: _init_discriminator_update
Project Name: rail-berkeley/softlearning
Commit Name: 9caa24c58689c1d6f3d982f623ceab8f78e7362d
Time: 2018-10-20
Author: hartikainen@berkeley.edu
File Name: softlearning/algorithms/diayn.py
Class Name: DIAYN
Method Name: _init_actor_update