ec849adaf4ceb42ed52ca142c839f627c34b9434,slm_lab/agent/algorithm/reinforce.py,Reinforce,calc_advantage,#Reinforce#Any#,158

Before Change


                rewards.insert(0, big_r)
            rewards = torch.Tensor(rewards)
            logger.debug3(f"Rewards: {rewards}")
            rewards = (rewards - rewards.mean()) / (rewards.std() + np.finfo(np.float32).eps)
            logger.debug3(f"Normalized rewards: {rewards}")
            advantage.append(rewards)
        advantage = torch.cat(advantage)

After Change


            T = len(epi_rewards)
            returns = np.empty(T, "float32")
            for t in reversed(range(T)):
                big_r = epi_rewards[t] + self.gamma * big_r
                returns[t] = big_r
            logger.debug3(f"Rewards: {returns}")
            returns = (returns - returns.mean()) / (returns.std() + 1e-08)
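
The "After Change" fragment above can be read as a backward pass that accumulates discounted returns into a preallocated NumPy array and then standardizes them. Below is a minimal, self-contained sketch of that idea. The function names `discounted_returns` and `normalize`, the default `gamma`, and the initial `big_r = 0.0` are assumptions for illustration; they are not taken from the SLM-Lab source.

```python
import numpy as np


def discounted_returns(epi_rewards, gamma=0.99):
    """Return G_t = r_t + gamma * G_{t+1} for one episode (hypothetical helper)."""
    T = len(epi_rewards)
    returns = np.empty(T, dtype=np.float32)
    big_r = 0.0  # assumed starting value: no return beyond the last step
    # Walk backward so each step reuses the return already computed for t + 1.
    for t in reversed(range(T)):
        big_r = epi_rewards[t] + gamma * big_r
        returns[t] = big_r
    return returns


def normalize(returns, eps=1e-08):
    """Standardize to zero mean and unit std; eps guards against division by zero."""
    return (returns - returns.mean()) / (returns.std() + eps)


if __name__ == "__main__":
    raw = discounted_returns([1.0, 1.0, 1.0], gamma=0.5)
    print(raw)
    print(normalize(raw))
```

Two numerical details differ between the two versions and may be worth checking when comparing results: `torch.Tensor.std()` applies Bessel's correction by default while `numpy.ndarray.std()` does not (`ddof=0`), and the epsilon changes from `np.finfo(np.float32).eps` (about 1.19e-07) to `1e-08`.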
In pattern: SUPERPATTERN

Frequency: 3

Non-data size: 4

Instances


Project Name: kengz/SLM-Lab
Commit Name: ec849adaf4ceb42ed52ca142c839f627c34b9434
Time: 2018-05-21
Author: kengzwl@gmail.com
File Name: slm_lab/agent/algorithm/reinforce.py
Class Name: Reinforce
Method Name: calc_advantage


Project Name: chainer/chainercv
Commit Name: 21e48c87172b4511688c66d3703f89e42a9c3444
Time: 2017-07-05
Author: Hakuyume@users.noreply.github.com
File Name: chainercv/evaluations/eval_detection_voc.py
Class Name:
Method Name: calc_detection_voc_prec_rec


Project Name: drckf/paysage
Commit Name: e38f38b399b9d06e97d9de164092ef7c200d2d14
Time: 2017-03-12
Author: charleskennethfisher@gmail.com
File Name: paysage/backends/pytorch_backend/matrix.py
Class Name:
Method Name:

