034ee51fb01fea37fb75e196a04cd96de102da28,torch_ac/algos/ppo.py,PPOAlgo,update_parameters,#PPOAlgo#,22
Before Change
preprocessed_obs = self.preprocess_obss(b.obs, use_gpu=torch.cuda.is_available())
dist = self.acmodel.get_dist(preprocessed_obs)
value = self.acmodel.get_value(preprocessed_obs)
entropy = dist.entropy()
ratio = torch.exp(dist.log_prob(b.action) - b.old_log_prob)
After Change
preprocessed_obs = self.preprocess_obss(b.obs, device=self.device)
if self.is_recurrent:
    dist, value, _ = self.acmodel(preprocessed_obs, b.state * b.mask)
else:
    dist, value = self.acmodel(preprocessed_obs)
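
The change above replaces separate get_dist / get_value calls with a single model call that returns both outputs, and adds a recurrent branch that multiplies the stored state by the mask before passing it in. The following is a minimal pure-Python sketch of that calling convention, not the torch_ac implementation: ToyACModel, evaluate, softmax, and ppo_ratio are hypothetical names, plain lists stand in for tensors, and the way memory is mixed into the observation is an assumption made only so the example runs.

```python
import math


def softmax(logits):
    """Turn a list of logits into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


class ToyACModel:
    """Hypothetical actor-critic stand-in; one call returns dist and value."""

    def __init__(self, recurrent=False):
        self.recurrent = recurrent

    def __call__(self, obs, memory=None):
        if self.recurrent:
            # Assumption: memory is simply added to the observation.
            features = [o + m for o, m in zip(obs, memory)]
        else:
            features = list(obs)
        dist = softmax(features)                # action probabilities
        value = sum(features) / len(features)   # toy state-value estimate
        if self.recurrent:
            return dist, value, features        # third item: new memory
        return dist, value


def evaluate(model, obs, state=None, mask=1.0):
    """Mirror the 'After Change' branch: mask the state when recurrent."""
    if model.recurrent:
        masked_state = [s * mask for s in state]  # mask 0 resets the memory
        dist, value, _ = model(obs, masked_state)
    else:
        dist, value = model(obs)
    return dist, value


def ppo_ratio(dist, action, old_log_prob):
    """Probability ratio exp(log_prob(action) - old_log_prob)."""
    return math.exp(math.log(dist[action]) - old_log_prob)


dist, value = evaluate(ToyACModel(recurrent=False), [0.0, 0.0])
ratio = ppo_ratio(dist, 0, math.log(0.5))
```

With a mask of 0 the recurrent branch sees a zeroed state, so in this sketch it produces the same distribution as the non-recurrent call on the same observation.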
In pattern: SUPERPATTERN
Frequency: 4
Non-data size: 4
Instances
Project Name: lcswillems/torch-rl
Commit Name: 034ee51fb01fea37fb75e196a04cd96de102da28
Time: 2018-04-30
Author: lcswillems@gmail.com
File Name: torch_ac/algos/ppo.py
Class Name: PPOAlgo
Method Name: update_parameters
Project Name: lcswillems/torch-rl
Commit Name: 034ee51fb01fea37fb75e196a04cd96de102da28
Time: 2018-04-30
Author: lcswillems@gmail.com
File Name: torch_ac/algos/a2c.py
Class Name: A2CAlgo
Method Name: update_parameters
Project Name: AIRLab-POLIMI/mushroom
Commit Name: 9290a1e0b09212ef35871aa59420923ac4d6860f
Time: 2017-10-23
Author: boris.ilpossente@hotmail.it
File Name: mushroom/policy/gaussian_policy.py
Class Name: GaussianPolicy
Method Name: _compute_prob
Project Name: AIRLab-POLIMI/mushroom
Commit Name: 9290a1e0b09212ef35871aa59420923ac4d6860f
Time: 2017-10-23
Author: boris.ilpossente@hotmail.it
File Name: mushroom/policy/gaussian_policy.py
Class Name: GaussianPolicy
Method Name: diff_log